Phase3: Integer Declaration Hints - Spr2008 - 17 March 08
P3 due March 29, 2008 [Friday before Spring Break]
Working in groups of 3 [preferably, but 2 is allowed]

You should begin working in groups of 3 for this phase and subsequent ones.

The system we will use: the group's POC (point of contact) notifies the instructor [with cc: to group members] when a Phase is ready to be checked. Each group should therefore select a POC. As a first step, the POC should email the instructor at stewart@rohan.sdsu.edu [with cc: to group members] with the group name.


Integer Declarations
First of all, there are no changes to codegen.c or to printree.c - this only involves the front-end of the compiler and working with the symbol table.

You would be wise to examine how the symbol table is handled in our current running project.

1. Here are the changes you might make to simple.lr (with implied changes to simple.lex [additional tokens] and sem.c [new semantic actions]). Don't forget to keep the lr_debug statements current in simple.lr

%token Package Is Body Begin End Colonsym
%%
pgm 		: 	Package packname Is body End Termsym
			{lr_debug ("pgm      -> Package packname Is body End Termsym");
			Done(); }

packname	:	Id
			{lr_debug ("packname -> Id");
			PutPackName(yytext); }

body		:	Body decllist Begin stmtlist
			{lr_debug ("body     -> Body decllist Begin stmtlist");}
			|
			Body Begin stmtlist    
			{lr_debug ("body     -> Body Begin stmtlist");}

decllist	:	decllist decl
			{lr_debug ("decllist -> decllist decl");}
			|
			decl
			{lr_debug ("decllist -> decl");}

decl		:	idlist Colonsym type Termsym
			{lr_debug ("decl     -> idlist Colonsym type Termsym");
			EnterIdList(); }

idlist		:	idlist Commasym Id
			{lr_debug ("idlist   -> idlist Commasym Id");
			PushId(yytext); MergeIdList(); }
			|
			Id
			{lr_debug ("idlist   -> Id");
			PushId(yytext);	CreateIdList(); }

type		:	Id
			{lr_debug ("type     -> Id");
			PushId(yytext);
			MakeTypeEntry(); }

2.  Changes to the structures (along with associated #define alloc's)

     a. Two new global variables of type TYPEKIND (a pointer to a
          type record), one for integers and one for undeclared
          variables, e.g.

Type_int -----\
               \          (fields of the structure pointed to by TYPEKIND)
               ----------------------
               | id: "integer"      |
               | type: type_integer |    (#define type_integer 1)
               | size: 4            |
               ----------------------

          I would define Type_int and Type_und in symboltb.h since they
          are structures for the symbol table.  You'll notice that
          semform.h includes the symbol table definition information,
          so no further changes are needed.

     b. Add new fields to ATTRPTR (in symboltb.h) for

          i) typeid (which equals either Type_int or Type_und)
          ii) typename (true if a type name, else false)
          iii) initialized (set true when initialized, else false)
          iv) used (set true when used, else false)

     c. New entries for the semantic stack (Semstack in semstack.h):
          i) need a "type" entry of TYPEKIND.  TYPEKIND is a pointer
             type (defined in symboltb.h) that points to a
             structure with fields for "id", "type" and "size" (as above)
          ii) need an "ilist" entry, which is a struct with a first
             and a last pointer to an IDLIST.  I declared IDLIST in
             semstack.h as a pointer to a struct with fields IDNAME and
             struct idnode *next.
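
The declarations in 2a-2c might look roughly like the sketch below. The field names come from the notes above; everything else is an assumption made only so the example compiles on its own: the struct tags, the lowercase field idname, the constant type_undeclared, the "undeclared" id string with size 0, and the helpers NewTypeKind and InitTypes. Your existing ATTRPTR record has other fields that are omitted here.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define type_integer    1
#define type_undeclared 0   /* assumption: any value distinct from type_integer */

/* 2a: TYPEKIND points to a record describing one type. */
typedef struct typekind {
    const char *id;    /* type name, e.g. "integer" */
    int type;          /* type_integer, type_undeclared, ... */
    int size;          /* storage size in bytes */
} *TYPEKIND;

TYPEKIND Type_int;     /* shared descriptor for integers */
TYPEKIND Type_und;     /* shared descriptor for undeclared names */

/* 2b: attribute record showing only the four new fields. */
typedef struct attr {
    TYPEKIND typeid;   /* Type_int or Type_und */
    int typename;      /* true if this entry names a type */
    int initialized;   /* set true when the name is assigned */
    int used;          /* set true when the name is read */
} *ATTRPTR;

/* 2c: one node per identifier, plus the first/last pair kept
   in the "ilist" entry of the semantic stack. */
typedef struct idnode {
    char *idname;
    struct idnode *next;
} *IDLIST;

struct ilist {
    IDLIST first;
    IDLIST last;
};

/* Hypothetical helper: allocate and fill one type record. */
TYPEKIND NewTypeKind(const char *id, int type, int size)
{
    TYPEKIND t = malloc(sizeof *t);
    assert(t != NULL);
    t->id = id;
    t->type = type;
    t->size = size;
    return t;
}

/* Hypothetical helper: what InitSemantics might call once at startup. */
void InitTypes(void)
{
    Type_int = NewTypeKind("integer", type_integer, 4);
    Type_und = NewTypeKind("undeclared", type_undeclared, 0);
}
```

Because every integer variable shares the one Type_int record, comparing attr->typeid against Type_int or Type_und is a plain pointer comparison.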


3.  Changes to symtab.c - add fields in the dumpstab routine to print the node's type and whether init'd and/or used

4.  Sem.c - changed actions

     a.  InitSemantics - Create an entry in the symbol table for the
         name "integer" and its associated TYPEKIND (Type_int).  I also
         created the TYPEKIND Type_und for use by IdToAddr.

         This way, when you later look up a name like "integer",
         you will know that it is a reserved word used for a
         type declaration.  Also, when an id is looked up by
         IdToAddr and not found, you want to insert that id in
         the symbol table with type Type_und, so that you issue
         only one error message when a user has an undeclared variable
         in their code.

     b.  Change IdToAddr - if a name is looked up and not found,
         insert it with the type Type_und, write out an error
         message, and put an errorentry on the semantic stack.

     c.  AddrToPrimary - mark symtab entry used (think about why 
         this is the appropriate place to mark a name as used)

     d.  Assign, CreateMemParam, MergeMemParam - mark symtab entry 
  	initialized (think about why this is the appropriate place to mark 
	a name as initialized)

     e.  PrintSemstack - print two new entries (type and idlist)
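
To see why inserting an undeclared name with Type_und limits the error messages to one, here is a minimal stand-alone sketch. It is not the project's code: the array-based table, struct sym, the KIND_ constants standing in for Type_int and Type_und, error_count, and the signatures of Lookup, Insert, IdToAddr, AddrToPrimary and Assign are all assumptions for illustration, and the semantic stack (including the errorentry) is left out.

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

#define MAXSYMS 64

enum { KIND_INT, KIND_UND };   /* stand-ins for Type_int and Type_und */

struct sym {
    const char *name;   /* a real table would copy the text of yytext */
    int kind;
    int initialized;
    int used;
};

struct sym table[MAXSYMS];
int nsyms = 0;
int error_count = 0;

/* Linear search; returns NULL when the name is not in the table. */
struct sym *Lookup(const char *name)
{
    for (int i = 0; i < nsyms; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

struct sym *Insert(const char *name, int kind)
{
    assert(nsyms < MAXSYMS);
    table[nsyms] = (struct sym){ name, kind, 0, 0 };
    return &table[nsyms++];
}

/* 4b: an unknown name is reported once, then entered as undeclared,
   so every later lookup of the same name succeeds silently. */
struct sym *IdToAddr(const char *name)
{
    struct sym *s = Lookup(name);
    if (s == NULL) {
        fprintf(stderr, "error: %s is undeclared\n", name);
        error_count++;
        s = Insert(name, KIND_UND);
    }
    return s;
}

/* 4c: a name that becomes a primary is being read, so mark it used. */
void AddrToPrimary(struct sym *s)
{
    s->used = 1;
}

/* 4d: the target of an assignment now holds a value. */
void Assign(struct sym *target)
{
    target->initialized = 1;
}
```

Looking up the same undeclared name twice through IdToAddr raises error_count only once, which is the behavior described in 4a.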

5. Sem.c - new actions

     a. CreateIdList: change Identry on TOS to IdListEntry and
          link up a list with one entry (the Identry)

     b. MergeIdList: Identry on TOS, IdList Top-1; link the
          new entry to the end of the list and Pop 1

     c. MakeTypeEntry: id on TOS - look up in symtab - should
          be there else error.  Retrieve attr->typeid and
          replace TOS with typeentry.

     d. EnterIdList: typeentry on TOS; idlist below.  Basically
          do what IdToAddr used to.  Traverse the list -
          Lookup the id name (shouldn't be there - else error).
          Set up attribute record, with appropriate new fields.
          Lookup (attr) then Insert(attr) and POP 2 from the SemStack
          since we've finished with that information.
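
The list handling in 5a and 5b can be sketched as follows. This is only one way to do it, under stated assumptions: the tagged struct sementry, the fixed-size Semstack array with index Top, and the helper NewNode are made up for the example, and PushId here takes the name as a parameter. MakeTypeEntry and EnterIdList are not shown because they depend on your symbol table routines.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

typedef struct idnode {
    const char *idname;
    struct idnode *next;
} *IDLIST;

enum tag { IDENTRY, IDLISTENTRY };

/* Simplified semantic stack entry (assumption: a tag plus both payloads). */
struct sementry {
    enum tag tag;
    const char *id;        /* valid when tag == IDENTRY */
    IDLIST first, last;    /* valid when tag == IDLISTENTRY */
};

#define STACKMAX 32
struct sementry Semstack[STACKMAX];
int Top = -1;

void PushId(const char *name)
{
    assert(Top + 1 < STACKMAX);
    Top++;
    Semstack[Top].tag = IDENTRY;
    Semstack[Top].id = name;
    Semstack[Top].first = Semstack[Top].last = NULL;
}

/* Hypothetical helper: allocate one list node for a name. */
static IDLIST NewNode(const char *name)
{
    IDLIST n = malloc(sizeof *n);
    assert(n != NULL);
    n->idname = name;
    n->next = NULL;
    return n;
}

/* 5a: turn the Identry on TOS into a one-element IdListEntry. */
void CreateIdList(void)
{
    assert(Top >= 0 && Semstack[Top].tag == IDENTRY);
    struct sementry *e = &Semstack[Top];
    e->first = e->last = NewNode(e->id);
    e->tag = IDLISTENTRY;
}

/* 5b: Identry on TOS, IdList at Top-1; append to the list and pop 1. */
void MergeIdList(void)
{
    assert(Top >= 1 && Semstack[Top].tag == IDENTRY
                    && Semstack[Top - 1].tag == IDLISTENTRY);
    struct sementry *list = &Semstack[Top - 1];
    IDLIST n = NewNode(Semstack[Top].id);
    list->last->next = n;   /* link after the current tail */
    list->last = n;         /* the new node becomes the tail */
    Top--;
}
```

Keeping a last pointer makes each MergeIdList a constant-time append, and the ids stay in declaration order for EnterIdList to traverse. For a declaration such as a, b, c the grammar actions amount to PushId("a"); CreateIdList(); PushId("b"); MergeIdList(); PushId("c"); MergeIdList(); which leaves a single IdListEntry on the stack.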

There is a lot of error checking that you want to work into this phase.
I would focus only on valid input at first and get things running
without the error checks.  Once valid input is working, you can go
back and handle:    marking vars used and initialized.
                    checking that a var is init'd before it is used.
                    making sure a var is declared before it is used -
                         if not, issue an error message and
                         insert it with typeid of undeclared to
                         minimize the number of messages issued.
                    when building expressions, making sure the variable
                         was declared and initialized.
                    not allocating storage for typenames - notice that
                         sem.c converts the symbol table to an alphabetic
                         list of names.  Modify GetVars to ignore
                         typenames when making the list.
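
As an illustration of the last item, here is a small sketch of building the alphabetic list while skipping typenames. The real GetVars in the project almost certainly has a different signature and walks the actual symbol table; the struct sym, the array parameters and the comparison helper ByName below are assumptions for the example only.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

struct sym {
    const char *name;
    int typename;     /* true for type names such as "integer" */
};

/* qsort comparison: each element is a pointer to a C string. */
static int ByName(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}

/* Copy the names of all non-type entries into out[] in alphabetic
   order and return how many were copied.  out[] must have room for
   n pointers. */
int GetVars(const struct sym *table, int n, const char **out)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (!table[i].typename)      /* no storage for typenames */
            out[count++] = table[i].name;
    qsort(out, (size_t)count, sizeof out[0], ByName);
    return count;
}
```

With a table holding "integer" (a typename) plus the variables zeta and alpha, only alpha and zeta end up in the list, so no storage is allocated for "integer".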