Historical Perspective
The variety of high performance computers available now
motivates us to briefly examine some architecture points related
to high performance computers as we cover Chapters 2, 6 and 3 of
our text.
The architecture viewpoint presented here combines "personal"
information and opinions from Stewart with quotations from our text.
In the current world of computing, the user is accustomed to
using an interface tool to optimize performance on a particular
hardware platform. Users' time is better spent working in a high
level language, such as C or Fortran or MATLAB, than in working in
assembler language. In the early days of computing (1950's or
so), to achieve good performance all programming had to be done
in "raw machine code", i.e. hex or octal digits depending on the
machine. The computer was a new device and computer time was
scarcer than programmer time, so the programmer was expected to
"speak" the language of the particular machine that was
available.
Luckily, time has passed. Computers have gotten faster and
MUCH cheaper. Memory is cheaper and therefore more plentiful.
Software tools such as compilers have advanced. There is
effectively a universal operating system used on all high
performance machines and it is called UNIX. The programmer has a
reasonable expectation that when a solution has been developed
and coded in Fortran or C, that same code will be transportable
to other environments. "Reasonable" performance is expected if
the compilers on the alternate platforms are good. There may be
some improved performance still available by "assisting" the
compiler with good programming structures.
                      ----------------      ------------
                 /--> | C or Fortran |--->  | Compiler | --*
                /     | source file  |      ------------   |
               /      ----------------                     |
--------------        ----------            ------------  \|/
| Programmer | -----> | M-file | ------->   |  MATLAB  | --*
--------------        ----------            ------------   |
               \                                           |
                \     ---------------      -------------  \|/
                 \--> | Assembler   | ---> | Assembler |---*
                      | Source file |      |           |   |
                      ---------------      -------------   |
                                                           |
                                                          \|/
                                                    ---------------
                                                    |             |
                                                    |  HARDWARE   |
                                                    |             |
                                                    ---------------
This diagram indicates how the programmer is removed from
directly working with the particular hardware currently
available. This makes the programmer's code writing skills
valuable and makes moving from one compute environment to another
reasonably straightforward. In the earlier days of computing, a
programmer would become "tied" to a particular architecture since
most programming had to be performed at the more detailed level of
assembler language. The expertise a programmer would develop
would not directly transfer to other environments.
Though application packages such as compilers and
MATLAB relieve the programmer from concern about many of the
lower level issues of dealing with the hardware, it is important
for all users to have some awareness of what the hardware is
actually capable of. The new compute platforms of High
Performance Computing raise new concerns, and the
compiler cannot take care of everything.
We are taking a brief (hopefully not too specific) look at hardware,
using the following abbreviations:
- CPU = Central Processing Unit
- OS = Operating System
- FPFU = Floating Point Functional Units
- IFU = Integer Functional Units
- LFU = Logical Functional Units
An Idealized Computer
----------------------------------------------------
|      CPU        Main Memory      File Space      |
| -------------   -----------    --------------    |      ---------
| |     ----- |   |   OS    |    |            |    |      |       |
| |regis|   | |   |         |    |            |    |<-->  |  I/O  |
| | ters|   | |   |Compilers|    |   User's   |    |      |devices|
| |     ----- |   |         |    |    file    |    |      |       |
| |     ----- |   |Applica- |    |            |    |      ---------
| |     |* +| |   | tion    |    |            |    |
| |FPFUs| - | |   | Packages|    |            |    |      ---------
| |     | / | |   |         |    |            |    |<-->  |Network|
| |     ----- |   |  User   |    |            |    |      |connect|
| |     |* +| |   |programs |    |            |    |      ---------
| | IFUs| - | |   |         |    |            |    |        ^   ^
| |     |mod| |   |  User   |    |            |    |        |   |
| |     ----- |   |  data   |    |            |    |     modems |
| |     |and| |   | (arrays)|    |            |    |            |
| | LFUs| or| |   |         |    |            |    |            |
| |     |...| |   |         |    |            |    |          other
| |     ----- |   |         |    |            |    |        computers
| |  lots of  |   -----------    --------------    |
| |  special  |                                    |
| |  hardware |                                    |
| -------------                                    |
----------------------------------------------------
Each register holds one word of main memory storage. The functional
units operate on information in the registers to produce a
new result, which is stored in a register and then, perhaps, in
memory. Address computations are integer computations and are
performed in the IFUs. The "lots of special hardware" above can include
elaborate, special purpose hardware components, which we won't go
into here since they vary greatly from platform to platform.
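To make the remark about address computations concrete, here is a minimal C sketch. It assumes a simple flat array of doubles; the helper names element_offset and offset_matches are hypothetical and chosen only for illustration. The point is that locating element i of an array is an integer multiply (index times element size), which is the kind of work the IFUs perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of element i from the start of an
   array whose elements are elem_size bytes each. This is pure integer
   arithmetic. */
size_t element_offset(size_t i, size_t elem_size)
{
    return i * elem_size;
}

/* Returns 1 if the hand-computed offset of a[i] agrees with the offset
   the compiler uses for &a[i], and 0 otherwise. */
int offset_matches(const double *a, size_t n, size_t i)
{
    assert(a != NULL && i < n);              /* stay inside the array */
    const char *base = (const char *)a;      /* byte address of a[0]  */
    const char *elem = (const char *)&a[i];  /* byte address of a[i]  */
    return (size_t)(elem - base) == element_offset(i, sizeof a[0]);
}
```

For an array declared as double a[8], a call such as offset_matches(a, 8, 5) compares the compiler's address for a[5] with the integer product 5 * sizeof(double).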
RISC = Reduced Instruction Set Computers (CDC 6600, 1964)
Our text, High Performance Computing, 2nd Edition, by Kevin Dowd and
Charles Severance (O'Reilly & Associates, Inc., 1998), states (p. 13):
"Characterizing RISC
RISC is more of a design philosophy than a set of goals.
Of course every RISC processor has its own
personality. However, there are a number of features
commonly found in machines people consider to be RISC.
- instruction pipelining
- pipelining floating-point execution
- uniform instruction length
- delayed branches
- load store architecture
- simple addressing modes
This list highlights the differences between RISC and
CISC processors. Naturally, the two types of instruction
set architectures have much in common - each uses
registers, memory, etc. And many of these techniques
are used in CISC machines too, such as caches
and instruction pipelines. But it's the fundamental
differences that give RISC its speed advantage:
focusing on a smaller set of less powerful instructions makes it
possible to build a faster computer."
On the Cray T90, there are no hardware operations that combine a
value in a register with a value in memory. Therefore, every
operand must first be loaded from memory into a register before
any arithmetic can be performed.
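The load, operate, store sequence can be sketched in C. This is an illustration only, not Cray code: the function name vector_add_load_store is hypothetical, and the local variables r1, r2 and r3 merely stand in for registers (the compiler decides which real registers are used). Each statement in the loop corresponds to one kind of step a load/store machine must take.

```c
#include <assert.h>
#include <stddef.h>

/* Computes c[i] = a[i] + b[i] for i = 0 .. n-1, written so that the
   separate load, add and store steps are visible. The locals r1, r2
   and r3 play the role of registers in this sketch. */
void vector_add_load_store(const double *a, const double *b,
                           double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        double r1 = a[i];     /* load first operand from memory   */
        double r2 = b[i];     /* load second operand from memory  */
        double r3 = r1 + r2;  /* register-register add            */
        c[i] = r3;            /* store the result back to memory  */
    }
}
```

On a machine with register-memory instructions, the two loads and the add could be folded into fewer instructions; on a load/store machine they cannot.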
CISC machines like the VAX and IBM/PC would allow:
- register-register operations
- register-memory operations
- memory-memory operations
but this violates the load store architecture listed above for
RISC machines. It also violates the "uniform instruction length"
feature, since an operation involving only registers needs to store
just the register number (a small integer), while one accessing memory
would need to have the memory address as part of the
instruction. Depending on the size of memory, this could dictate
large instruction sizes or place a limit on the amount of
possible memory.
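The trade-off between instruction size and memory size is simple arithmetic, sketched below in C. The field widths used in the usage note (a 32-bit instruction, a 6-bit operation code, a 5-bit register number) are assumptions picked for illustration and do not describe any particular machine; the function names are hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

/* Bits left over for a memory address in a fixed-width instruction
   after the operation code and one register number are stored. */
unsigned address_bits(unsigned instr_bits, unsigned opcode_bits,
                      unsigned reg_bits)
{
    assert(opcode_bits + reg_bits <= instr_bits);
    return instr_bits - opcode_bits - reg_bits;
}

/* Number of distinct memory locations that an address field of
   addr_bits bits can name, i.e. 2 raised to the power addr_bits. */
uint64_t addressable_words(unsigned addr_bits)
{
    assert(addr_bits < 64);  /* keep the shift well defined */
    return (uint64_t)1 << addr_bits;
}
```

Under the assumed widths, address_bits(32, 6, 5) leaves 21 bits, and addressable_words(21) gives 2097152 locations. To reach more memory with a full address in the instruction, either the instruction must grow or the address must come from somewhere else, such as a register.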