layer zero of st zero
Instead of writing an interpreter, what if we built the language so it compiles every layer from the very bottom to the very top? What limitations would we need to place on ourselves to make that possible?
The first thing we need to consider is what the output of the program is. A custom binary format that describes all the data? An XML format for all the data? Or perhaps a dynamically linked library that you can call an API on to get the data, where the data is just structs stored in the library itself?
The next thing we need to consider is what subset of the language we can use to make the layer 0 libraries.
Right now only parts of the program run while the rest gets prepped for compilation. Let's look at the difference between the two:
module :: 'core.layer0'.
[int increment] [Integer -> Integer | Integer + 1].
5 increment
Here's the breakdown. Everything first gets parsed and then types are resolved. Types are code that runs, which means all the type expressions (such as Integer -> Integer) must be compiled and run in what we'd call layer -1. The results of that execution feed back into the parse tree.
The first thing we note is that if this is layer 0 we cannot run anything, because nothing but primitives exist, not even any types. Integer does not exist. We have to define what it is in layer 0 itself for it to exist.
The only kind of type we can support in layer 0 is one that doesn't need to execute and can be resolved to a type in our current compilation unit. That also means any type that requires generics cannot be used either. The 'bootstrap' language must be very simple.
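To make that rule concrete, here's a rough sketch of what a layer 0 type resolver might look like. It's written in Python rather than STZ, and everything in it (resolve_layer0_type, Layer0TypeError, the shape of the declared table) is made up for illustration. The idea is only that a type expression is looked up by name and never executed.

```python
# Hypothetical sketch: the names and rules here are assumptions for
# illustration, not the actual STZ design.

class Layer0TypeError(Exception):
    """Raised when a type expression cannot be resolved without running code."""


def resolve_layer0_type(expr, declared):
    """Resolve a type expression by direct lookup only; never execute it.

    `declared` maps type names declared so far in the current compilation
    unit to their definitions. In layer 0 nothing is predefined.
    """
    name = expr.strip()
    # Anything beyond a bare name (e.g. "Binary of: 8") would need code to
    # run in order to produce a type, so it is rejected outright.
    if not name.isidentifier():
        raise Layer0TypeError(f"{expr!r} needs execution; not allowed in layer 0")
    if name not in declared:
        raise Layer0TypeError(f"{name!r} is not declared in this compilation unit")
    return declared[name]


# A made-up declaration table with one fixed-width type in it.
declared = {"Binary8": ("bits", 8)}
binary8 = resolve_layer0_type("Binary8", declared)
```

With this table, "Binary8" resolves, "Binary of: 8" is rejected because it is more than a bare name, and "Integer" is rejected because nobody has declared it yet.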
The next thing we note is that any types or methods we create in this layer are meant to be used by the layer that imports this module, not by this module itself. We therefore cannot call increment here in layer 0. Layer 0 cannot support any 'execution' at all.
We want compilation to be fast and we want it to be incremental. Re-parsing a bunch of XML or JSON seems counterproductive to that ideal, so we either do a data dump or a dll/so. The point of the compiler is to produce executable code.
The output is either going to be a static library, a shared library, a main executable, or a compilation unit. If it's a compilation unit it will output a single callable method that provides access to all the compilation results.
Because we've declared that the only types allowed in layer 0 are those that name a known type directly and don't need to be executed, we can state that there is no layer -1. We can also state that it'll remain that way for any subsequent layer. It's not just an optimisation, but a necessity for layer 0 to exist.
Now that we know what the compilation unit is, we also know what an import does: it loads the library and calls the function to get all the pointers to the compilation data.
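Here's a small sketch of that import mechanism, again in Python as a stand-in. A real compilation unit would be a dll/so loaded by the compiler; here a module object plays that role. The entry point name stz_compilation_unit, the helpers build_library and import_module, and the contents of the results table are all assumptions made up for the example.

```python
# Hypothetical sketch: a Python module object stands in for a dll/so.
import types

# Assumed name of the single callable every compilation unit exposes.
ENTRY_POINT = "stz_compilation_unit"


def build_library(name, exported_types, exported_methods):
    """Stand-in for the compiler output: a library exposing one function."""
    lib = types.ModuleType(name)
    results = {
        "module": name,
        "types": dict(exported_types),
        "methods": dict(exported_methods),
    }
    # The one and only export: calling it hands back all compilation results.
    setattr(lib, ENTRY_POINT, lambda: results)
    return lib


def import_module(libraries, name):
    """Stand-in for import: 'load' the library, then call its entry point."""
    lib = libraries[name]  # a real import would load the dll/so here
    return getattr(lib, ENTRY_POINT)()


# Placeholder contents; the real layout of the results is undecided.
libraries = {
    "core.layer0": build_library(
        "core.layer0",
        {"Integer": "placeholder type data"},
        {"increment": "Integer -> Integer"},
    )
}
unit = import_module(libraries, "core.layer0")
```

The importing layer only ever needs to know one symbol per library, which keeps the contract between layers tiny.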
As I'm writing this I'm realising there are two compilations going on: a temporary one to resolve all the types and scripting for the module, and a second one to produce the re-usable module.
But there we have it - if we can make a layer 0 library (which can do imports) then we can make STZ without ever writing an interpreter. Programs that write programs that write programs until eventually you have your final output.
Dependencies are necessary; when a library is built off other libraries it should store in itself the UUIDs from those libraries so that recompilation can skip whole sections and be as snappy and quick as possible.
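A sketch of how that UUID check might work, in Python. The Library record, build, and needs_rebuild are hypothetical names, and the rule that every rebuild gets a fresh UUID is my assumption about how the ids would be assigned.

```python
# Hypothetical sketch: assumes each build of a library gets a fresh UUID.
import uuid
from dataclasses import dataclass, field


@dataclass
class Library:
    name: str
    build_id: str  # regenerated every time the library is rebuilt
    # Dependency name -> the build_id this library was compiled against.
    dep_ids: dict = field(default_factory=dict)


def build(name, deps):
    """'Compile' a library, recording the UUIDs of what it was built on."""
    return Library(name, str(uuid.uuid4()), {d.name: d.build_id for d in deps})


def needs_rebuild(lib, current):
    """True if any dependency's current UUID differs from the recorded one."""
    return any(
        current[dep_name].build_id != recorded_id
        for dep_name, recorded_id in lib.dep_ids.items()
    )


layer0 = build("core.layer0", [])
layer1 = build("core.layer1", [layer0])
unchanged = needs_rebuild(layer1, {"core.layer0": layer0})

# Rebuilding layer 0 gives it a new UUID, so layer 1 is now stale.
layer0_rebuilt = build("core.layer0", [])
stale = needs_rebuild(layer1, {"core.layer0": layer0_rebuilt})
```

If nothing a library depends on has a new id, the whole library can be skipped without looking inside it.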
What goes into layer 0? Well, weirdly, things like Boolean are out. A Boolean is based on an UnsignedInteger of: 8 or a Binary of: 1. UnsignedInteger might be out too, since it is a representation of bits, aka Binary.
It's possible we need to build off baser versions of these types that have no generics, e.g. Binary8 and UnsignedInteger8 instead of Binary of: 8 and UnsignedInteger of: 8. These types would not be exported beyond layer 0 as they're not as useful as the more general versions.
Until I try to do it for real it's just guesswork. I feel hopeful this will work very well and allow the language to kick-start itself rapidly. Given the output of each module is a C file and build instructions, things can be distributed easily even without the STZ compiler. Is that a useful goal? Probably not.