integers

Signed and unsigned integers are pretty core to programming in any language above the assembler level. For STZ I'd like to represent both, internally, as binary data.

bits: (Binary of: 8) = 0b01010101.

How many bits you have determines what kind of storage a value can fit into. The fixed bit sizes of 8, 16, 32, 64, 128, 256 and 512 can all fit in registers, on a machine that has registers that wide. Any size that isn't a power of two takes up the space of the next power of two. Anything over the maximum register size for the machine has to be stored as an aggregate.
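
A rough Python sketch of that sizing rule, with Python only as a stand-in for whatever a backend would really do. The function name, the 8-bit floor and the 64-bit default for the largest register are assumptions made for illustration, not part of STZ.

```python
def storage_bits(bit_size: int, max_register_bits: int = 64):
    """Return (storage bits, "register" or "aggregate") for a Binary of bit_size bits."""
    if bit_size < 1:
        raise ValueError("bit_size must be at least 1")
    # Round up to the next power of two, with an assumed floor of 8 bits.
    rounded = max(8, 1 << (bit_size - 1).bit_length())
    # Assumption: anything wider than the largest register becomes an aggregate.
    kind = "register" if rounded <= max_register_bits else "aggregate"
    return rounded, kind


print(storage_bits(1))     # (8, 'register')
print(storage_bits(12))    # (16, 'register')
print(storage_bits(64))    # (64, 'register')
print(storage_bits(1024))  # (1024, 'aggregate')
```

Raising max_register_bits to 512 would model a machine with wide vector registers, where (Binary of: 512) still counts as a register value.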

This gives us a very fundamental type - an array of booleans. Booleans can be represented as a single-bit Binary, e.g.:

Boolean :: {bit: (Binary of: 1)}.
false   :: {Boolean | bit: 0b0}.
true    :: {Boolean | bit: 0b1}.

This gives us our first primitive operations in the language. We want as few of those as possible, as they must be implemented in every backend (the interpreter, the transpiler to C, and maybe some other kind of compiler too).

[a & b]    [a: Binary, b: Binary -> Binary | #primitive].
[a | b]    [a: Binary, b: Binary -> Binary | #primitive].
[a invert] [a: Binary -> Binary | #primitive].

This gives us enough to implement more of Boolean:

[a = b] [a: Boolean, b: Boolean -> Boolean |
  {bit: ((a bit & b bit) | (a bit invert & b bit invert))}].
[a not] [a: Boolean -> Boolean | {bit: a bit invert}].
[a & b] [a: Boolean, b: Boolean -> Boolean | {bit: (a bit & b bit)}].
[a | b] [a: Boolean, b: Boolean -> Boolean | {bit: (a bit | b bit)}].
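
Equality is the one worth checking by hand. With only &, | and invert to work with, two bits are equal when both are set or both are clear (an XNOR). The Python below is a small sketch that models a (Binary of: 1) as an int masked to one bit; the function names are illustrative, not STZ.

```python
MASK = 0b1  # a Binary of: 1 holds exactly one bit


def invert(a: int) -> int:
    # Python's ~ works on unbounded ints, so mask back down to one bit.
    return ~a & MASK


def bit_and(a: int, b: int) -> int:
    return a & b


def bit_or(a: int, b: int) -> int:
    return a | b


def bit_equals(a: int, b: int) -> int:
    # XNOR built from the three primitives: (a & b) | (~a & ~b)
    return bit_or(bit_and(a, b), bit_and(invert(a), invert(b)))


# Print the full truth table for equality.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, bit_equals(a, b))
```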

We can even define some flow control now:

[a then: true-block] [a: Boolean, true-block: (-> ø) -> ø |
  {[], true-block}[a bit] evaluate].

[a else: false-block] [a: Boolean, false-block: (-> ø) -> ø |
  {false-block, []}[a bit] evaluate].

[a then: true-block else: false-block]
  [a: Boolean, true-block: (-> ø), false-block: (-> ø) -> ø |
    {false-block, true-block}[a bit] evaluate].
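
The trick here is that the Boolean's bit is used as an index into a pair of blocks: index 0 picks the false branch and index 1 the true branch. A minimal Python sketch of the same idea, with blocks modelled as zero-argument functions and the empty block [] as a function that does nothing (all names are hypothetical):

```python
def then_else(bit, true_block, false_block):
    # Index 0 picks the false branch, index 1 the true branch.
    return (false_block, true_block)[bit]()


def then(bit, true_block):
    # The empty block [] is modelled as a function that does nothing.
    return (lambda: None, true_block)[bit]()


def otherwise(bit, false_block):
    return (false_block, lambda: None)[bit]()


log = []
then_else(1, lambda: log.append("yes"), lambda: log.append("no"))
then(0, lambda: log.append("skipped"))
otherwise(0, lambda: log.append("fallback"))
print(log)  # ['yes', 'fallback']
```

No branching construct is needed at all, only indexing and evaluating a block.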

That squares away the majority of Boolean behaviour we immediately care about. It does require the ability to access things in memory. That will also require a primitive. But let's move back to Integers.

PlatformIntegerSize :: 64.
UnsignedInteger :: {
  #bit-size: UnsignedInteger = PlatformIntegerSize,
  ...Number,
  bits: (Binary of: #bit-size)}.

It is fun to see the definition of a thing being used by its own definition. Because we have a default value for #bit-size, we can use UnsignedInteger to get a 64-bit unsigned integer; or we can specify a size ourselves, e.g. (UnsignedInteger of: 8), to get a single byte.

To get the true Smalltalk experience we'd want LargeUnsignedInteger so that overflows don't break the program in exciting C-like ways. I'm not going to focus on that here, but because Binary can be any bit size there's nothing wrong with the result of an addition being (UnsignedInteger of: 1024) etc.
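
To illustrate the difference, here is a small Python sketch contrasting a fixed-width add that wraps with one that widens its result by a bit. Both function names are made up for this example, and how STZ would actually pick result widths is not decided here.

```python
def add_wrapping(a: int, b: int, bits: int) -> int:
    """Fixed-width add: the carry out of the top bit is dropped."""
    return (a + b) & ((1 << bits) - 1)


def add_widening(a: int, b: int, bits: int):
    """Widening add: the result gets one extra bit, so it cannot overflow."""
    return a + b, bits + 1


print(add_wrapping(255, 1, 8))  # 0
print(add_widening(255, 1, 8))  # (256, 9)
```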

Similarly, we can define signed integers the same way:

Integer :: {
  #bit-size: UnsignedInteger = PlatformIntegerSize,
  ...Number,
  bits: (Binary of: #bit-size)}.

Actually adding and subtracting integers, as well as the other maths functions, does require primitives. It's probably the most primitive-rich part of the system. That also includes sign-extending and zero-extending from (Binary of: N) to (Binary of: M).
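
As a sketch of what those two extension primitives have to do, assuming two's complement for the signed case (the text above does not commit to a representation), here is a Python version. The helper names are illustrative only.

```python
def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen an unsigned bit pattern: the new high bits are all 0."""
    if to_bits < from_bits:
        raise ValueError("can only extend to a wider size")
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's complement bit pattern: the new high bits copy the sign bit."""
    if to_bits < from_bits:
        raise ValueError("can only extend to a wider size")
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):
        # Sign bit is set, so fill bits from_bits..to_bits-1 with ones.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


print(hex(zero_extend(0x80, 8, 16)))  # 0x80
print(hex(sign_extend(0x80, 8, 16)))  # 0xff80
print(hex(sign_extend(0x7F, 8, 16)))  # 0x7f
```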

One other fun thing we can do is define enumerations of bit positions and use them as switches in a Binary. Here settings starts out with OptionB and OptionC switched on.

[a[index]] [a: Binary, index: UnsignedInteger -> Boolean |
  {bit: ((a shift-right: index) & 0b1)}].

[a[index] = b] [a: Binary, index: UnsignedInteger, b: Boolean -> Binary |
  (a & ((0b1 shift-left: index) invert)) | (b bit shift-left: index)].

Options :: {OptionA = 0, OptionB = 1, OptionC = 2}.

settings: (Binary of: 8) = 0b00000110.
settings[OptionA] = true.
settings[OptionC] = false.

Or something like that. That was a bit off the cuff and not at all tested - but it gives us a good idea of how we can utilise Binary, Boolean, UnsignedInteger, and Integer together. It also gives us a more fundamental attachment to the magic-rock all this is running on.
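
For comparison, reading and writing a single switch by its bit position looks roughly like this in Python. The helper names are hypothetical, and the index is the bit position counted from the lowest bit (bit 0 has value 1, bit 1 has value 2, bit 2 has value 4).

```python
def get_bit(a: int, index: int) -> int:
    # Shift the wanted bit down to position 0, then mask everything else off.
    return (a >> index) & 1


def set_bit(a: int, index: int, b: int) -> int:
    # Clear the target bit, then OR in the new value at that position.
    cleared = a & ~(1 << index)
    return cleared | (b << index)


settings = 0b110                    # bits 1 and 2 switched on
settings = set_bit(settings, 0, 1)  # switch bit 0 on
settings = set_bit(settings, 2, 0)  # switch bit 2 off
print(bin(settings))         # 0b11
print(get_bit(settings, 1))  # 1
```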