q0 with Tokens, SemStack, AST, SPIM, may be useful as a guide for your own testing.
Test2 with Tokens and SemStack
Two different ways to set up simple.lex, simple.lr, and codegen.c are described here:
I want to point out a change to the "FrontEnd" that might help with Scanning/Parsing. There are several ways to handle strings, and I want to outline two of them now. Our text (p. 67) gives two lex rules to recognize strings; I found \"([^\n\"]|\"\")*\" to be the most useful. It uses the character class [^\n\"] to specify both "not EOL" and "not a double quote".

----------------------------------------------------------------
a) In simple.lex:

    \"([^\n\"]|\"\")*\"    return token(StrConst);

along with the stripquotes routine, slightly modified to translate the "" in the input buffer, yytext, to \", and then in simple.lr:

    actparam : expr     {lr_debug ("actparam -> expr");
                         CreateActParam(); }
             | StrConst {lr_debug ("actparam -> StrConst");
                         strip_quotes(yytext);
                         PushString(yytext);
                         CreateActParam(); }

in order to keep the LISTING file valid.
----------------------------------------------------------------
** OR **
----------------------------------------------------------------
b) In simple.lex:

    \"([^\n\"]|\"\")*\"    { stripquotes(); return token( StrConst );}

along with a stripquotes routine that only removes the opening and closing " of the input buffer, and then in simple.lr:

    actparam : expr     {lr_debug ("actparam -> expr");
                         CreateActParam(); }
             | StrConst {lr_debug ("actparam -> Sconst");
                         PushString( yytext );
                         CreateActParam(); }

and then in codegen.c, the routine ConstStorage can perform the translation of "" to \" just in time for SPIM.
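To make the difference between (a) and (b) concrete, here is a minimal sketch of the two string operations involved. The function names strip_outer_quotes and translate_doubled_quotes are hypothetical, chosen for illustration; they are not the project's stripquotes routine, whose actual signature may differ. Both work in place on a NUL-terminated buffer.

```c
#include <assert.h>
#include <string.h>

/* Remove only the opening and closing double quote, in place.
   Assumes s holds a complete StrConst lexeme, e.g. "abc" with its quotes.
   This is all that option (b) needs in the scanner. */
void strip_outer_quotes(char *s)
{
    size_t len;
    assert(s != NULL);
    len = strlen(s);
    if (len >= 2 && s[0] == '"' && s[len - 1] == '"') {
        memmove(s, s + 1, len - 2);   /* shift the body left by one */
        s[len - 2] = '\0';
    }
}

/* Translate each doubled quote "" into \" in place.
   Two characters become two characters, so the length is unchanged.
   Option (a) would do this right after stripping the outer quotes. */
void translate_doubled_quotes(char *s)
{
    assert(s != NULL);
    for (; s[0] != '\0'; s++) {
        if (s[0] == '"' && s[1] == '"') {
            s[0] = '\\';   /* first quote becomes a backslash */
            s++;           /* second quote is kept as is */
        }
    }
}
```

Under option (a) both steps would run on the lexeme before PushString; under option (b) only the first runs in the scanner and the translation is left to codegen.c.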
You should examine the WriteInstr routine in codegen.c to see how the Wrln instruction is able to generate SPIM code to write the new line string.
11. codegen.c
char *new_line = "\\n";
and the use of
AddStringConstant (new_line);
to add this string to the front of the list is a good model to look at. NOTE: the routine AddStringConstant has already been written for you in sem.c, and codegen.c uses it to add the newline string in the initial project. You will be using AddStringConstant much more now when translating source code that contains strings, which is why it was placed in sem.c from the beginning. The newline string will always be the first string in the string list, because it is added in codegen.c, after sem.c has handled the new strings from the parse.
You don't need to change anything in GenCode.
S0: .asciiz "\n"
to correspond with the new_line string above. Continue this model. This is used to translate writeln differently from the write. Construct a test problem, say
    x:=2; write(x); writeln(x);

to see the use of the initial string constant S0 above. The write(x); is translated to a WriteProc while the writeln(x); becomes a WritelnProc. Both are Standard Procedure Calls (StdProcCall) in the GenStmt routine in codegen.c.
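As a rough illustration of how a write and a writeln can differ in the generated code, the sketch below formats SPIM instructions into a buffer. It is not the project's GenStmt or WriteInstr; gen_write_int is a hypothetical helper, and loading the variable with "lw $a0, _x" is an assumption about addressing that your compiler may handle differently. The SPIM system call codes (1 for print_int, 4 for print_string) are standard.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format SPIM code that prints the integer variable `var`.
   If newline is nonzero, also print the string labelled S0.
   Returns the total number of characters written, or a negative
   value if formatting failed. */
int gen_write_int(char *buf, size_t size, const char *var, int newline)
{
    int n;
    assert(buf != NULL && var != NULL);
    n = snprintf(buf, size,
                 "\tlw $a0, _%s\n"   /* assumed: globals are labelled _name */
                 "\tli $v0, 1\n"     /* syscall 1: print_int */
                 "\tsyscall\n", var);
    if (newline && n >= 0 && (size_t)n < size) {
        n += snprintf(buf + n, size - (size_t)n,
                      "\tla $a0, S0\n"   /* address of the newline string */
                      "\tli $v0, 4\n"    /* syscall 4: print_string */
                      "\tsyscall\n");
    }
    return n;
}
```

Calling gen_write_int(buf, sizeof buf, "x", 0) models write(x); and passing 1 as the last argument models writeln(x);, where the only difference is the extra reference to S0.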
CalcConstOffsets traverses the global list of strings that was constructed by sem.c when building the Abstract Syntax Tree for a write or writeln statement. This routine computes a new label number for each string, setting the appropriate field in the string list.
S1, S2 and such will be used in the SPIM code your compiler generates. Note, since the new_line string is added to the list last [in codegen.c after the "FrontEnd" semantic processing by sem.c], it will be the first labelled string and therefore will have label S0. Examine GenStmt for the details on how S0 is used for the WriteLn but not for the Write discussed in the paragraphs above.
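The interaction between adding strings at the front of the list and numbering them in list order can be sketched as follows. The node layout and the names StrNode, add_front, and number_labels are hypothetical stand-ins; the real string list in sem.c and the real CalcConstOffsets may be organized differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string-list node; the actual structure in sem.c may differ. */
struct StrNode {
    const char *text;        /* string contents, not copied in this sketch */
    int label;               /* the n in "Sn", filled in by number_labels */
    struct StrNode *next;
};

/* Add a string at the front of the list, as the text describes
   for AddStringConstant. Returns the new head. */
struct StrNode *add_front(struct StrNode *head, const char *text)
{
    struct StrNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->text = text;
    n->label = -1;           /* not yet numbered */
    n->next = head;
    return n;
}

/* Walk the list and assign labels 0, 1, 2, ... in list order.
   Returns how many strings were numbered. */
int number_labels(struct StrNode *head)
{
    int next_label = 0;
    for (; head != NULL; head = head->next)
        head->label = next_label++;
    return next_label;
}
```

If the parse adds its strings first and the code generator adds the newline string last, the newline ends up at the head of the list and receives label 0, which is why it appears as S0.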
label: .asciiz "whatever"
at the very end of your generated SPIM code. This needs no change because the label number has already been generated by CalcConstOffsets. Given a pointer into the string list, this routine will write the label address of that string (which is what you need in GenWrites).
A final consideration is the special case of the double quote ("). q3_listdoit provides the sample source code:
    -- turn on tokens and semantic stack
    -- examine the existing "writeln(x);" and the new
    -- phase 2 writlne with strings and expressions
    %%t
    %%ss
    x:=2;
    writeln(x);
    writeln(" ""hi"" there"," x = ",x);

and sample target code, with Constant storage area:
    #
    # Finish up by writing out constants
            .word 0
    CONST:              #Constant storage area
            .data
    S0:     .asciiz "\n"
            .data
    S1:     .asciiz " x = "
            .data
    S2:     .asciiz " \"hi\" there"
    #
    # Reserve space for global variables
            .word 0
    VARS:               # space for Global Variables
            .data
    _x:     .word 0     # Offset at 0
            .data

since SPIM handles the double quote using \"
There are several ways to handle processing the embedded double quote. Care must be taken, since your compiler "Front End" must generate a valid listing file reflecting the user's input source code. One possible way is to have the routine ConstStorage in codegen.c check the contents of each string for the "" which the source language, MACRO, requires and replace it with the \" which the target language (SPIM) requires. Alternatively, you could have the "Front End" handle this during the scanning by lex and parsing by yacc. An updated email will be sent to the class outlining these choices.
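The ConstStorage option can be sketched as a copy that translates while it writes, so the string stored by the "Front End" keeps the doubled quotes exactly as the user typed them. The names escape_for_spim and emit_string_constant are hypothetical, and the 256-byte buffer is an assumed limit; this is an illustration, not the project's ConstStorage.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src to dst, turning each doubled quote "" into \" .
   dst must have room for strlen(src) + 1 bytes; the output is
   never longer than the input. */
void escape_for_spim(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while (*src != '\0') {
        if (src[0] == '"' && src[1] == '"') {
            *dst++ = '\\';
            *dst++ = '"';
            src += 2;            /* consume both source quotes */
        } else {
            *dst++ = *src++;     /* ordinary character, copy unchanged */
        }
    }
    *dst = '\0';
}

/* Write one string constant in the layout of the sample target code. */
void emit_string_constant(FILE *out, int label, const char *text)
{
    char buf[256];               /* assumed upper bound on string length */
    assert(out != NULL && text != NULL);
    assert(strlen(text) < sizeof buf);
    escape_for_spim(buf, text);
    fprintf(out, "\t.data\nS%d:\t.asciiz \"%s\"\n", label, buf);
}
```

Because the translation happens only on the way out to the SPIM file, the listing file is unaffected, which is the main attraction of doing the work in codegen.c.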