We will have 'mixed expressions' of sorts but they will only involve logical operations (And Or Xor AndThen OrElse) on boolean variables and boolean expressions - and relational operations (= /= < <= > >=) on integer variables and integer expressions.
The variable we declared in Phase 3, Type_int, will be very useful in this phase, along with its corresponding Type_bol. We will be adding a field to exprtrees so that they have a type. This typeid should be set up to point to Type_int when the exprtree involves integer arith- metic and Type_bol when the exprtree involves boolean arithmetic.
These are your standard binary logical operators. Both children should be
exprtrees with typeid pointing to Type_bol.
and then and or else denote short circuited boolean evalutions.
(see Section 12.7 of our text)
These are your standard binary relational operators. Both children should be exprtrees with typeid pointing to Type_int. "=" and "/=" can check equality (nonequality) of boolean expressions also
is a predefined enumerated type in Ada and therefore this type should be entered in the symbol table before processing begins. You also need to enter the constituent members of this enumerated type in the symbol table so that an integral ordering (FALSE=0, TRUE=1) is available.
This integral order can then be used to compare False < True = True and such for constant folding.
The precedence of operations is: logical relational binary add unary add multiply not
This implementation is not all that long, but appears to be so because there are so many operators. Below is a guideline that will make it simpler to implement (and test) in pieces.
Step 1 - implement the lexer and the grammar - check with an old test input code (i.e. phase3 with declarations) that the compiler still works. You can write dummy semantic actions for your grammar that only having entering/leaving calls in their body.
Step 2 - check that a new test code will have its the new tokens recognized - i.e. a test case with %%t to turn on tokens you'll want to turn off code generation, simple -nc testname
Step 3 - check the grammar (simple -d or add %%ss to test code) to see that the new semantic routines are called at the correct time of the parse don't worry that errorentries start to occur - try to understand what causes them, i.e. what your semantic routines are expected to do to maintain the semantic start. this gives you a feeling for the timing of the LR grammar. keep code generation off still. make sure you keep the lr_debug statements going in your simple.lr grammar. these were not included in the sample grammar mailed today, but you'll want to keep this "tradition" going for help with getting the semantics running properly.
Step 4 - replace the dummy semantic routines with your implementation. (first handle the case where constant folding isn't a concern - i.e. the leaves are addresses or constants, but never both leaves are constants) substep 1: check boolean assignments first - make sure boolean variables are correctly in symtab %%dmp with a simple test code only having a boolean assignment using the boolean constant for rhs substep 2: check relational expressions are ok - using the if statement - test it first with an equality test since this one used to work then check other relational operations then check an assignment statement to a relational op substep 3: check logical expressions - use the if statement with and's, or's ... then use the boolean assignment with expressions substep 4: check the if-statement with a boolean expression consisting only of a boolean variable
Step 5 - Take care of the extensive need for constant folding for the boolean expressions
Step 6 - Take care of all the type checking for the new (and old) operators. The routine performop needs to have its exprtree have the typeid set to Type_int. And needs to check that the children are of the proper type before proceeding.
I let the boolean constants (TRUE and FALSE) be represented as integer immediates (1 and 0, resp.) You might want to split this off more.
8 new OPERATIONs - neqop, ltop, leop, gtop, geop, andop, orop, xorop
optionally: orelseop, andthenop
I gave expressions (EXPRTREE) a TYPEKIND typeid; field to help with checking for valid assignments and for valid folding. this is just going to point to Type_int or Type_bol.
semstack.h : I changed iconstentry to be just constentry (then the entry will have a typeid to tell if its a bconstentry or iconstentry)
semform.h : Similarly with EXPRDESC - no longer have atree and btree. One expression tree (with the pointer to its type can handle both now.Need to model you work with Integers to have
TYPEKIND Type_und, Type_int, Type_bol;
semform.h : need to add all the operators and the EXPRDESC has an additional field TYPEKIND typeid; after the OPERATION op; and before the union.
InitSemantics: You need a new Type_bol type pushed into the symbol table. But there were no other changes to the routines that facilitated variable typing. This was the purpose of having the Type_int type already in the symbol table when we process integer declarations initially.
BoolconstReduce: Very similar to IconstReduce. Make sure you set the type of the expression in both cases appropriately as you build the exprtree.
AddrToPrimary, PerformUnary, DoArithOp: Since expressions have a type field, wherever you created expression trees before, you want to also set the type field properly.
Operator (not changed, but used a lot - might want to change comments): The relational and logical operators are PUSHed onto the semstack. (Used to be only had the one relational eqop which wasn't pushed). This makes the semantic routines DoLogicOp/DoRelatOp more symmetric and very similar to DoArithOp (old performop, which handled all the binary integer operators. When we generate code for the expression trees that DoArithOp/DoLogicOp/DoRelatOp build, we'll handle all the different cases of the different ops.
I let the old semroutine - Operator - push all the new operators on. You could split this up more in the grammar and have separate semantic routines for the different types of operators (e.g. integer ops, relational ops, logical ops) which would modularize the constant folding.
There's a lot more semantic checking to do. Assignments - must check that a boolean variable is assigned a boolean expression; integer variable gets only integer expression.
DoLogicOp, DoRelatOp: These are very similar to DoArithOp. They build the boolean expression trees. DoLogicOp: stack should have | expr (bool) | --------------- | op (logical)| --------------- | expr (bool) | -- > | expr(bool)| --------------- ------------- DoRelatOp: stack should have | expr (arith)| --------------- | op (relatnl)| --------------- | expr (arith)| -- > | expr(bool)| --------------- -------------
PerformNot: should have a boolean exprtree on top of stack and you replace with a taller boolean exprtree similar to PerformUnary
Print_Semstack: I would add a lot of debugging information here. Since you now need to follow a typeid to find out if an exprtree is boolean or arithmetic; same for constants, there are a lot of switch statements in the different cases of the semstack entries. Also can have the operator entry print out which operator it is.