Hi,
cxzuk wrote:
Spending some time on the Intermediate Representation on my compiler and was wondering if anyone had any experience or advice to help guide me. A summary or link to pros and cons on each type would be great.
I'm currently considering a stack IR as it seems the simplist.
For which purpose?
For simplicity, the best would be leaving it as an
abstract syntax tree because this is what your parser is likely to generate anyway. This is fine for initial optimisations, and is relatively space efficient to encode as sequential data (you only need 2 bits per entry, one to say if an entry has children and another to say if an entry is its parent's last child).
For making it easier to do more complex optimisations, the best would be
static single assignment form. This is probably the worst option for space efficiency.
For space efficiency, the best would be something like machine code where operations are encoded as (e.g.) "dest opcode src1, src2, ..." and where there's explicit control flow opcodes (jumps, branches and calls); and where you pretend there's an "infinite" number of virtual registers (to avoid the bloat of all the pushes that stack machines cause). This is probably also the easiest form to convert to actual machine code (e.g. beginning with assigning real registers to virtual registers, etc).
Note that you'll probably use multiple representations at different places (e.g. convert source code into AST, do stuff with AST, then convert to SSA, do stuff with SSA, then convert to "machine-code-like", then convert to actual machine code).
Cheers,
Brendan