Basic Implementation Techniques will review Ordered Lists, Unordered Lists, Binary Search Trees and Hash Tables in case you need a reminder. Also this helps set the terminology.
There are 2 main competing structures for implementing symbol tables,
But now we want to look at implementing possibly nested procedures. So let's focus first on the translation process.
We define the visibility rules for names in the presence of multiple scopes:
The implication of these rules is that when a scope is closed, all declarations made within the scope become inaccessible.
Our text has Figure 8.5 Nested Scope Example fully worked out in
Lets use an example which I will built in lecture today and start the discussion of how this will interface with our project - Phase 3: Integer Declaration (with implications for much later Phase 7: Subroutines with local and global variables).
package main is body -- open level 0 ma: integer; procedure a is aa,ab: integer; -- open level 1 procedure b is ba: integer; -- open level 2 procedure c is ca,cb: integer; -- open level 3 begin -- c ... end; -- c -- close level 3 begin -- b ... end; -- b -- close level 2 begin -- a ... end; -- a -- close level 1 begin -- main; ... end; -- main; -- closel level 0
When translating the body of c, we have the following:
---------------- / | table with | ----------- / | ca,cb | | level 3 | ----/ ---------------- ----------- | level 2 | ----------------------------> --------------- ----------- | table with | | level 1 | -----------> --------------- | ba,c | ----------- | table with | --------------- | level 0 | -------\ | aa,ab,b | ----------- \ --------------- | | \ ----------- ---------------- Scope_Stack | table with | | ma, a | ----------------When translating a statement within the body of c, we must search in the symbol tables of all open scope for that name. Supposed we had a statement (in the body of c):
ma := ba + aa + cb;
We first look up "ma" in the level 3 table and do not find it. Then the level 2 table, the level 1 table and finally the level 0 table, where "ma" is found. Therefore we know all references to ma will involve
(level 0, offset)
but at this point of the translation, we only need to build the stmt structure for an assignment which contains a pointer into the symbol table for the l.h.s. of the assignment - therefore a pointer into the "level 0 table".
"ba" will be referenced and begin the PUSHing onto the semantic stack as we build the information for an EXPRTREE. We first search the level 3 table, and find "ba" in the level 2 table, so the pointer in the EXPRTREE will be to the level-2 table.
It's easy to easy that the "visibility rules" are satisfied since we will find the INNERMOST declaration of "name" first.
When we reach the end of translating the body of procedure c, we can just POP the Scope_Stack, making the variables declared in c inaccessible.
NOTE - the "table" could be either a binary tree or a hash table.
Alternatively, we could have a single symbol table and keep track explicitly of the scope number of each name. In the example above, each nested scope had variables with different names, but try to imagine what would happen if each nested scope reused the same name in the following example.
Assuming we using chaining to resolve the conflict in hash table, i.e. when 2 names hash to the same index in the hash table, we add the "newer name" to the FRONT of the linked list of names in the chain. For this example, let's assume:
name hash location ma 3 a 5 aa 4 ab 2 b 1 ba 3 c 5 ca 3 cb 4Not a very good hash function but good for examples. Note, we must store both the "name" and the "scope-number".
----------- 1 | | ------> b (1) -> NULL ----------- 2 | | ------> ab (1) -> NULL ----------- 3 | | ------> ca (3) -> ba (2) -> ma (0) -> NULL ----------- 4 | | ------> cb (3) -> aa (1) -> NULL ----------- 5 | | ------> c (2) -> a (0) -> NULL ----------- hash tableWhen a new scope is entered, we increment the global counter for current_scope_number and proceed with inserting new "name" in the hash table. Note - the higher scope numbered variables will always be the "first find" when we walk the CHAINed LIST of all "name"s that hashed to the same hash table location.
If we are translating a procedure at scope level 3 and are looking for a "name" that hashed to location 2 in the hash table for example, then we need to walk the entire chain to NULL before we can flag the variable as undeclared OR we find the "name" and its scope number.
When we close a scope, then we have to check every entry in the hash table (i=1,2,3,4,5 above) and walk the chains to remove names. Suppose we just closed the procedure c scope (i.e. level 3). We look at each hash table chain in turn. Hash-entry 2 begins with a variable at level 1, so we know we do not have to do anything. Hash-entry 3 has a level 3 name, so we delete it from the front of the list. But the next entry is at level 2, so we are done. Hash table entry 4 has a level 3 var, which we delete, but the next name is level 1, so we are finished with that chain. Since the higher scope-level names are always at the front of the hash-table-chain-list, it is very easy to close a scope. Similarly, we always find the INNERMOST declaration of any "name".
If we use a single binary search tree, then opening and closing scopes is not so simple. The level 3 names will all be at the leaves of the global binary search tree - since they were inserted last!
In summary, if we use "hash tables" then a single hash table will easily handle open/close scope rules.
If we use "binary trees", then a Scope_Stack will easily handle open/close scope rules.
procedure b is ba: integer; procedure c is ca,cb: integer; begin -- c ... end; -- c newone: integer; -- c CANNOT "see" newone! begin -- b ... end; -- b