On 2021-06-21 2:35 a.m., Peter Corlett via cctalk wrote:
Code-generation is a whole different can of worms and unlike the
well-trodden path of parsing, is still not a solved problem in computer
science. All of the compiler textbooks I've checked, including the infamous
Dragon book, handwave code generation, introducing the tree-matching
algorithm which works well for straight-line code on their made-up
educational CPU architectures, but leaves a lot on the table with real-world
CPUs. This limitation probably doesn't matter for your CPU.
It seems that you might want to produce code directly from parsing. This
shortcut can work if you're producing code for a stack machine such as
FORTH, but not a register machine (unless you use your registers as a
stack). A common wheeze on the 6502 is to implement a virtual stack machine
interpreter, and then the compiler targets that.
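To make that "virtual stack machine" idea concrete, here is a minimal sketch of such an interpreter; the compiler would emit bytecodes for this loop instead of native code. The opcodes and encoding are invented for illustration, not any real 6502 implementation.

```python
# A toy bytecode interpreter for a virtual stack machine.
# Opcode names and the (opcode, operand) encoding are made up here.

PUSH, ADD, MUL, SUB, HALT = range(5)

def run(program):
    """Interpret a list of (opcode, operand) pairs on a data stack."""
    stack = []
    for op, arg in program:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == SUB:
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == HALT:
            break
    return stack[-1]

# a + (b + c) * c  with a=1, b=2, c=3
code = [(PUSH, 1), (PUSH, 2), (PUSH, 3), (ADD, None),
        (PUSH, 3), (MUL, None), (ADD, None), (HALT, None)]
```

On a real 6502 the dispatch loop would be a handful of instructions indexing a jump table, but the shape is the same.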
To do code-generation "properly", you need a multi-pass algorithm: parse the
source into a tree, optionally do tree-to-tree transformations to do type
checks and/or apply optimisations, and then apply the tree-matching
algorithm to that to produce machine code. This is all well-described in any
good compiler textbook.
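The tree-walking part of that pipeline can be sketched in a few lines: a hand-built expression tree is walked post-order to emit stack-machine code. Parsing and the real tree-matching pass are elided, and the tuple node shape is my own invention.

```python
# Toy code generator: post-order walk of an expression tree emits
# stack-machine code (children first, then the operator).

def gen(node, out):
    """Append stack-machine instructions for `node` to `out`."""
    if isinstance(node, tuple):          # interior node: ('op', left, right)
        op, left, right = node
        gen(left, out)
        gen(right, out)
        out.append(op)
    else:                                # leaf: a variable or constant
        out.append(f"PUSH {node}")
    return out

# tree for a + (b + c) * c
tree = ('+', 'a', ('*', ('+', 'b', 'c'), 'c'))
```

A register machine needs the fuller tree-matching pass in place of this walk, which is where the textbook treatment comes in.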
I want to do that transformation before fully parsing expressions, as a text process: the string "a+(b+c)*c" becomes "((b+c)*c)a+".
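The textbook way to do this purely as a text pass is Dijkstra's shunting-yard algorithm. Note it produces the usual token-by-token postfix ("abc+c*+") rather than the grouped form above, so treat this as the standard technique, not the exact scheme described here.

```python
# Shunting-yard: convert single-character infix to postfix as pure text.

PREC = {'+': 1, '-': 1, '*': 2, '/': 2}

def to_postfix(expr):
    out, ops = [], []
    for ch in expr:
        if ch.isalnum():                 # operand: straight to output
            out.append(ch)
        elif ch == '(':
            ops.append(ch)
        elif ch == ')':
            while ops[-1] != '(':        # flush operators back to the '('
                out.append(ops.pop())
            ops.pop()                    # discard the '(' itself
        else:                            # operator: pop higher/equal precedence
            while ops and ops[-1] != '(' and PREC[ops[-1]] >= PREC[ch]:
                out.append(ops.pop())
            ops.append(ch)
    while ops:
        out.append(ops.pop())
    return ''.join(out)
```

The operator stack here is tiny and bounded by nesting depth, which suits a small machine.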
Note that except for trivial expressions, you'll need a stack and/or other
registers to squirrel away intermediate results. Consider "(A*B)-(C*D)": you
need to save the result of one of the multiplications before doing the other
and then the subtraction.
They need to be saved regardless of what machine model you use.
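A quick sketch makes the point concrete: evaluating the postfix form of "(A*B)-(C*D)" on a data stack, the first product sits on the stack while the second is computed. The variable values are arbitrary illustrations.

```python
# Evaluate single-character postfix with a data stack, showing where
# the intermediate product is parked.

def eval_postfix(tokens, env):
    stack = []
    for t in tokens:
        if t in ('+', '-', '*'):
            b, a = stack.pop(), stack.pop()
            stack.append({'+': a + b, '-': a - b, '*': a * b}[t])
        else:
            stack.append(env[t])
    return stack[0]

env = {'A': 6, 'B': 7, 'C': 2, 'D': 5}
# postfix for (A*B)-(C*D) is "AB*CD*-": after "AB*" the stack holds
# the product 42, and it stays there while C*D is computed.
```

On a register machine without a stack, that parked value has to go in a spare register or a temporary memory cell instead.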
In the case of the tree-matching algorithm, if your CPU (or rather, the
model given to the tree-matcher) is too limited to evaluate a given
expression, it will definitively fail and give you a subtree that it cannot
match. You then have the "joy" of fixing the model, possibly by adding
virtual instructions which are really library calls. For a simple CPU, such
calls might account for most of the generated code, at which point a virtual
machine becomes a good idea.
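The "virtual instructions which are really library calls" trick looks like this in miniature: when the generator hits an operation the CPU lacks (the 6502 has no multiply or divide), it emits a JSR to a runtime routine instead of inline code. The routine names such as MUL16 are invented for illustration.

```python
# Map operations the CPU can't do natively onto runtime library calls.
# Routine names are placeholders, not a real runtime.

LIBCALLS = {'*': 'MUL16', '/': 'DIV16'}

def emit(op):
    """Return the instruction(s) for one operator."""
    if op in LIBCALLS:
        return f"JSR {LIBCALLS[op]}"     # virtual instruction -> library call
    # native cases, heavily simplified: carry setup and operands elided
    return {'+': 'ADC', '-': 'SBC'}[op]
```

Once most operators come out as JSRs, the step to a threaded or bytecoded virtual machine is small, which is Peter's point above.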
I just need calls for multiply, divide and special operations. The limitation
is just memory and variable space compared to a larger machine. Two
addressing modes, immediate and indexed, keep it simple. The compiled
language I plan to design will be simple, but records of some sort will
be needed. Char, word and Real are all that you get.
The book "Writing interactive compilers and interpreters" may be more
suitable if you're looking for quick hacks to get up and running on a small
machine, assuming you can lay your hands on a copy.
Where are the books on writing assemblers and loaders?
Ben.