We will have 'mixed expressions' of sorts but they will only involve logical operations (And Or Xor AndThen OrElse) on boolean variables and boolean expressions - and relational operations (= /= < <= > >=) on integer variables and integer expressions.
The variable we declared in Phase 3, Type_int, will be very useful in this phase, along with its corresponding Type_bol. We will be adding a field to exprtrees so that they have a type. This typeid should be set up to point to Type_int when the exprtree involves integer arith- metic and Type_bol when the exprtree involves boolean arithmetic.
These are your standard binary logical operators. Both children should be
exprtrees with typeid pointing to Type_bol.
and then and or else denote short circuited boolean evalutions.
(see Section 12.7 of our text)
These are your standard binary relational operators. Both children should be exprtrees with typeid pointing to Type_int. "=" and "/=" can check equality (nonequality) of boolean expressions also
is a predefined enumerated type in Ada and therefore this type should be entered in the symbol table before processing begins. You also need to enter the constituent members of this enumerated type in the symbol table so that an integral ordering (FALSE=0, TRUE=1) is available.
This integral order can then be used to compare False < True = True and such for constant folding.
The precedence of operations is: logical
relational
binary add
unary add
multiply
not
This implementation is not all that long, but appears to be so because there are so many operators. Below is a guideline that will make it simpler to implement (and test) in pieces.
Step 1 - implement the lexer and the grammar - check with an
old test input code (i.e. phase3 with declarations) that
the compiler still works. You can write dummy semantic
actions for your grammar that only having entering/leaving
calls in their body.
Step 2 - check that a new test code will have its the new tokens
recognized - i.e. a test case with %%t to turn on tokens
you'll want to turn off code generation, simple -nc testname
Step 3 - check the grammar (simple -d or add %%ss to test code) to see that
the new semantic routines are called at the correct time of the parse
don't worry that errorentries start to occur - try to understand
what causes them, i.e. what your semantic routines are expected
to do to maintain the semantic start. this gives you a feeling
for the timing of the LR grammar. keep code generation off still.
make sure you keep the lr_debug statements going in your simple.lr
grammar. these were not included in the sample grammar mailed
today, but you'll want to keep this "tradition" going for help
with getting the semantics running properly.
Step 4 - replace the dummy semantic routines with your implementation.
(first handle the case where constant folding isn't a
concern - i.e. the leaves are addresses or constants, but
never both leaves are constants)
substep 1: check boolean assignments first -
make sure boolean variables are correctly in symtab
%%dmp with a simple test code only having a boolean
assignment using the boolean constant for rhs
substep 2: check relational expressions are ok -
using the if statement - test it first with an
equality test since this one used to work
then check other relational operations
then check an assignment statement to a relational op
substep 3: check logical expressions -
use the if statement with and's, or's ...
then use the boolean assignment with expressions
substep 4: check the if-statement with a boolean expression
consisting only of a boolean variable
Step 5 - Take care of the extensive need for constant folding for the
boolean expressions
Step 6 - Take care of all the type checking for the new (and old)
operators. The routine performop needs to have its exprtree
have the typeid set to Type_int. And needs to check that
the children are of the proper type before proceeding.
I let the boolean constants (TRUE and FALSE) be represented as integer immediates (1 and 0, resp.) You might want to split this off more.
8 new OPERATIONs - neqop, ltop, leop, gtop, geop, andop, orop, xorop
optionally: orelseop, andthenop
I gave expressions (EXPRTREE) a TYPEKIND typeid; field to help with checking for valid assignments and for valid folding. this is just going to point to Type_int or Type_bol.
semstack.h : I changed iconstentry to be just constentry (then the
entry will have a typeid to tell if its a bconstentry
or iconstentry)
semform.h : Similarly with EXPRDESC - no longer have atree
and btree. One expression tree (with the pointer to
its type can handle both now.
Need to model you work with Integers to have
TYPEKIND Type_und, Type_int, Type_bol;
semform.h : need to add all the operators
and the EXPRDESC has an additional field TYPEKIND typeid;
after the OPERATION op; and before the union.
InitSemantics: You need a new Type_bol type pushed into the symbol table. But there were no other changes to the routines that facilitated variable typing. This was the purpose of having the Type_int type already in the symbol table when we process integer declarations initially.
BoolconstReduce: Very similar to IconstReduce. Make sure you set the type of the expression in both cases appropriately as you build the exprtree.
AddrToPrimary, PerformUnary, DoArithOp: Since expressions have a type field, wherever you created expression trees before, you want to also set the type field properly.
Operator (not changed, but used a lot - might want to change comments): The relational and logical operators are PUSHed onto the semstack. (Used to be only had the one relational eqop which wasn't pushed). This makes the semantic routines DoLogicOp/DoRelatOp more symmetric and very similar to DoArithOp (old performop, which handled all the binary integer operators. When we generate code for the expression trees that DoArithOp/DoLogicOp/DoRelatOp build, we'll handle all the different cases of the different ops.
I let the old semroutine - Operator - push all the new operators on. You could split this up more in the grammar and have separate semantic routines for the different types of operators (e.g. integer ops, relational ops, logical ops) which would modularize the constant folding.
There's a lot more semantic checking to do. Assignments - must check that a boolean variable is assigned a boolean expression; integer variable gets only integer expression.
DoLogicOp, DoRelatOp:
These are very similar to DoArithOp. They build the boolean expression
trees.
DoLogicOp: stack should have | expr (bool) |
---------------
| op (logical)|
---------------
| expr (bool) | -- > | expr(bool)|
--------------- -------------
DoRelatOp: stack should have | expr (arith)|
---------------
| op (relatnl)|
---------------
| expr (arith)| -- > | expr(bool)|
--------------- -------------
PerformNot: should have a boolean exprtree on top of stack
and you replace with a taller boolean exprtree
similar to PerformUnary
Print_Semstack: I would add a lot of debugging information here. Since
you now need to follow a typeid to find out if an
exprtree is boolean or arithmetic; same for constants,
there are a lot of switch statements in the different
cases of the semstack entries.
Also can have the operator entry print out which operator
it is.