Lecture next week (23Sept02) will cover our Chapter 3 Memory.
Chapter 2 (Ch 2) of our text introduced us to computer hardware by describing the CISC and RISC design philosophies. Then our Wednesday lab (Lab1) introduced you to the campus Rohan machine and the UNIX timer dtime, which is simpler to use than the timer etime mentioned in our text (Ch 6), to obtain timing for a portion of a program.
We will examine the "counting of operations" to assess the amount of work.
We are using the Sun SunFire 4800 (Rohan), which is a "scalar machine". This means that when the compiler transforms your Fortran or C statements into native machine code, each basic operation works on a single word of memory. Consider our sample code from lab last week.
What is accomplished in the loop you examined in lab last week?
Your first computational experiment, from last week's lab, is to run a variety of tests with your sample code with different input for the length of the loop. You want to investigate how easily you can generate timing data that can be related to the expected amount of work indicated in the program.
The variety of high performance computers available now motivates us to briefly examine some architecture points related to high performance computers as we cover Chapters 2, 6 and 3 of our text. The architecture viewpoint is both "personal" information and opinions from Stewart as well as quotations from
Luckily, time has passed. Computers have gotten faster and MUCH cheaper. Memory is cheaper and therefore more plentiful. Software tools such as compilers have advanced. There is effectively a universal operating system used on all high performance machines and it is called UNIX. The programmer has a reasonable expectation that when a solution has been developed and coded in Fortran or C, that same code will be transportable to other environments. "Reasonable" performance is expected if the compilers on the alternate platforms are good. There may be some improved performance still available by "assisting" the compiler with good programming structures.
                       ----------------       ------------
                 /-->  | C or Fortran |--->   | Compiler | --*
                /      | source file  |       ------------   |
               /       ----------------                      |
--------------         ----------             ------------  \|/
| Programmer | ------> | M-file | ------->    | MATLAB   | --*
--------------         ----------             ------------   |
               \                                             |
                \      ---------------        ------------- \|/
                 \-->  | Assembler   | ---->  | Assembler |--*
                       | Source file |        |           |  |
                       ---------------        -------------  |
                                                             |
                                                            \|/
                                                      ---------------
                                                      |             |
                                                      |  HARDWARE   |
                                                      |             |
                                                      ---------------

This diagram indicates how the programmer is removed from directly working with the particular hardware currently available. This makes the programmer's code writing skills valuable and makes moving from one compute environment to another reasonably straightforward. In the earlier days of computing, a programmer would become "tied" to a particular architecture since most programming had to be performed at the more detailed level of assembler language. The expertise a programmer would develop would not directly transfer to other environments.
Though the applications packages such as compilers and MATLAB relieve the programmer from concern about many of the lower level issues of dealing with the hardware, it is important for all users to have some awareness of what the hardware is actually capable of. With the new compute platforms of High Performance Computing, new concerns are addressed and the compiler cannot take care of everything.
We are taking a brief (hopefully not too specific) look at hardware.
An Idealized Computer

----------------------------------------------------
|      CPU        Main Memory     File Space       |
| -------------   -----------   --------------     |     ---------
| |  -------  |   |   OS    |   |            |     |     |       |
| |  |regis|  |   |         |   |            |     |<--> |  I/O  |
| |  | ters|  |   |Compilers|   |   User's   |     |     |devices|
| |  -------  |   |         |   |    file    |     |     |       |
| |     ----- |   |Applica- |   |            |     |     ---------
| |     |* +| |   |  tion   |   |            |     |
| |FPFUs| - | |   | Packages|   |            |     |     ---------
| |     | / | |   |         |   |            |     |<--> |Network|
| |     ----- |   |  User   |   |            |     |     |connect|
| |     |* +| |   |programs |   |            |     |     ---------
| | IFUs| - | |   |         |   |            |     |       ^   ^
| |     |mod| |   |  User   |   |            |     |       |   |
| |     ----- |   |  data   |   |            |     |      modems
| |     |and| |   | (arrays)|   |            |     |       |   |
| | LFUs| or| |   |         |   |            |     |       |   |
| |     |...| |   |         |   |            |     |       other
| |     ----- |   |         |   |            |     |     computers
| |  lots of  |   -----------   --------------     |
| |  special  |                                    |
| |  hardware |                                    |
| -------------                                    |
----------------------------------------------------

Each register holds one word of main memory storage. The functional units operate on information in the registers to produce a new result, which is stored in a register and then, perhaps, in memory. Address computations are integer computations and are performed in the IFUs (integer functional units). The "lots of special hardware" above can include elaborate, special purpose hardware components, which we won't go into here since they vary greatly from platform to platform.
RISC = Reduced Instruction Set Computers (CDC 6600, 1964)
Our text, High Performance Computing, 2nd Edition, by Kevin Dowd and Charles Severance (O'Reilly & Associates, Inc., 1998), states (p. 13):
"Characterizing RISC
RISC is more of a design philosophy than a set of goals. Of course every RISC processor has its own personality. However, there are a number of features commonly found in machines people consider to be RISC.
There are no hardware operations to work with a value in a register and a value in memory on the Cray T90. Therefore, every operand must first be loaded from memory into a register before any arithmetic can be performed.
CISC machines like the VAX and IBM/PC would allow: