CS 575 Supercomputing - Lecture Outline
Chapter 3: Memory
Sept. 22, 2003
This URL is stewart.sdsu.edu/cs575/lecs/ch3.html
Focus: 
Processor speed is increasing steadily, with performance doubling 
roughly every 18 months, in agreement with Moore's Law.
(Time-dependent: at that rate, computer speed increases almost 
10 times in less than five years.) 
What Moore actually observed was that the density of transistors on 
a silicon Integrated Circuit (IC) doubles every 18 months, 
since 1962 (Gordon Moore, Intel co-founder and former CEO, retired). 
Intel Research / Silicon / Moore's Law
Is Moore's Law Irrelevant? Electronic News 8/14/2003
So what does that really mean?  (18 months = 1.5 Years)
| Year 0 | Year 1.5 | Year 3 | Year 4.5 | Year 6 | 
| Power=1 | Power=2 | Power=4 | Power=8 | Power=16 | 
Power usually means CPU speed, but could someday be 
- amount of memory on chip, 
-  throughput of data between CPU and memory or 
-  throughput of data between memory and file system.
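The doubling table above can be reproduced with a few lines of Python. This is only a sketch: the function name `relative_power` is made up for illustration, and the 1.5-year doubling period is the assumption stated in the lecture, not a measured value.

```python
# Relative "power" under a fixed doubling period (assumed: 18 months = 1.5 years).
DOUBLING_PERIOD_YEARS = 1.5


def relative_power(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Return the growth factor after `years`, starting from Power = 1."""
    return 2 ** (years / doubling_period)


# Same columns as the table: Year 0, 1.5, 3, 4.5, 6.
for years in (0, 1.5, 3, 4.5, 6):
    print(f"Year {years}: Power = {relative_power(years):g}")

# Growth over five years, to compare with the "almost 10 times" remark.
print(f"Year 5: Power = {relative_power(5):.1f}")
```

Changing `doubling_period` shows how sensitive the long-term growth is to the assumed doubling time.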
In terms of names for numbers:
-  petabytes (PB) of storage (a petabyte is $10^{15}$ bytes) for files 
and such, 
-  gigabytes (GB) of memory (a gigabyte is $10^{9}$ bytes) for running 
programs.  
 Note: Either the interpreter + user program + data must all fit in memory, 
*or* a compiler processes the user program to produce object code, and then 
the object code + data runs in memory. More details in a couple of weeks when 
we cover Ch. 5 What a Compiler Does
-  teraflops (TFLOPS) - overall measure of processor speed (a teraflop is $10^{12}$ flops, or Floating Point Operations per Second). More details 
next week when we cover Ch. 4 Floating Point Numbers 
-  input-output (I/O) capacity of hundreds of gigabits (Gb) per second. 
 (NOTE: Gb and GB differ by a factor of 8, nearly an order of magnitude, since 1 byte = 8 bits.)
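A small sketch of the unit names above, including the bit/byte distinction (1 byte = 8 bits). The 200 Gb/s rate is a made-up example value, not a figure from the lecture, and the helper name is hypothetical.

```python
# Decimal prefixes used in the list above.
GIGA = 10 ** 9
TERA = 10 ** 12
PETA = 10 ** 15

BITS_PER_BYTE = 8


def gigabits_to_gigabytes(gigabits):
    """Convert a quantity in gigabits (Gb) to gigabytes (GB)."""
    return gigabits / BITS_PER_BYTE


# Example rate chosen for illustration only.
rate_gb_per_s = 200
print(f"{rate_gb_per_s} Gb/s = {gigabits_to_gigabytes(rate_gb_per_s):g} GB/s")
print(f"1 PB = {PETA // GIGA} GB")
```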
Memory performance is doubling only every seven years, so alternate 
mechanisms are needed for successful High Performance Computing: 
memory speed simply isn't keeping up with processor speed.
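To get a feel for how quickly the two rates diverge, here is a rough sketch that applies the two doubling periods quoted in these notes (18 months for processors, seven years for memory). It assumes both trends are perfectly exponential, which is a simplification, and the function name `growth` is made up for illustration.

```python
def growth(years, doubling_period_years):
    """Growth factor after `years` for a quantity that doubles every period."""
    return 2 ** (years / doubling_period_years)


CPU_DOUBLING_YEARS = 1.5  # from the notes: 18 months
MEM_DOUBLING_YEARS = 7.0  # from the notes: seven years

for years in (7, 14, 21):
    cpu = growth(years, CPU_DOUBLING_YEARS)
    mem = growth(years, MEM_DOUBLING_YEARS)
    print(f"after {years} years: CPU x{cpu:.0f}, memory x{mem:.0f}, "
          f"gap x{cpu / mem:.0f}")
```

The printed gap is the factor by which processor speed has outgrown memory speed under these assumptions.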
(p. 33 Text) Example Memory Technology Access Time: 
1981 IBM XT 8088: the access time of commodity 
DRAM (200 ns) was shorter than the clock cycle time at 4.77 MHz.
 A 4.77 MHz clock has a cycle time of about 210 ns, and 200 ns < 210 ns. 
What does that really mean?  Let's convert 200 ns (nanoseconds) to MegaHertz 
since they are reciprocals:
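$1/(200\ \mathrm{ns}) = 1/(200 \times 10^{-9}\ \mathrm{s}) = (1/2) \times 1/10^{-7}\ \mathrm{s}^{-1} = 0.5 \times 10^{7}\ \mathrm{Hz} = 5.0 \times 10^{6}\ \mathrm{Hz}$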
This corresponds to a clock rate of roughly 5 million 
cycles per second, i.e. 5 MHz.  
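The reciprocal conversion above can be checked numerically. A minimal sketch, using the 200 ns and 4.77 MHz figures from the text; the helper names are made up for illustration.

```python
def ns_to_mhz(period_ns):
    """Frequency in MHz of a signal whose period is `period_ns` nanoseconds."""
    # 1 / (period_ns * 1e-9 s) Hz  ==  1e3 / period_ns MHz
    return 1e3 / period_ns


def mhz_to_ns(freq_mhz):
    """Period in nanoseconds of a clock running at `freq_mhz` MHz."""
    return 1e3 / freq_mhz


dram_access_ns = 200.0  # commodity DRAM access time from the text
cpu_clock_mhz = 4.77    # 8088 clock rate from the text

cpu_cycle_ns = mhz_to_ns(cpu_clock_mhz)
print(f"{dram_access_ns:g} ns access time is equivalent to "
      f"{ns_to_mhz(dram_access_ns):g} MHz")
print(f"{cpu_clock_mhz} MHz clock has a cycle time of {cpu_cycle_ns:.1f} ns")
print("memory keeps up with CPU:", dram_access_ns < cpu_cycle_ns)
```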
What memory response is needed to keep up with a 1.8 GHz Pentium 4?  
Let's work on the board. 1.8 GHz ~= 2 GHz ~= 1/?
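What memory response is needed to keep up with a 330 MHz Pentium?
$1/(330 \times 10^{6}\ \mathrm{Hz}) = 1/(0.33 \times 10^{9}\ \mathrm{Hz}) \approx 1/(\tfrac{1}{3} \times 10^{9}\ \mathrm{Hz}) = 3 \times 10^{-9}\ \mathrm{s}$, i.e. a 3 ns memory chip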
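The same reciprocal relation gives the memory response needed for any clock rate. The sketch below uses only clock rates and the 200 ns DRAM figure that appear in these notes. It assumes one memory access per clock cycle, which is a simplification, and the function name is made up for illustration.

```python
def cycle_time_ns(freq_hz):
    """Clock cycle time in nanoseconds for a clock rate given in Hz."""
    return 1e9 / freq_hz


# Clock rates mentioned in the notes.
clocks_hz = {
    "8088 (4.77 MHz)": 4.77e6,
    "Pentium (330 MHz)": 330e6,
    "Pentium 4 (1.8 GHz)": 1.8e9,
}

DRAM_1981_NS = 200.0  # 1981 commodity DRAM access time from the notes

for name, hz in clocks_hz.items():
    t = cycle_time_ns(hz)
    # How many CPU cycles would pass during one 200 ns memory access?
    print(f"{name}: cycle time {t:.2f} ns, "
          f"a 200 ns access spans {DRAM_1981_NS / t:.1f} cycles")
```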
Economics is a driving force in computing.  Possible approaches to 
memory performance:
-  Every memory system component made individually fast enough to 
keep up with each memory access request
-  Slow memory used in round-robin fashion to give the 
effect of faster memory systems
-  Memory system design made wide so each transfer contains 
many bytes of memory
-  System divided into faster and slower portions and arranged so 
fast portion used more often than slow.
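The payoff of the last approach (fast and slow portions) can be estimated with a weighted average of the two access times. This is a common textbook model, not necessarily the one used in the course text. The hit rates below are hypothetical, and the 3 ns and 200 ns values are borrowed from the earlier examples purely for illustration.

```python
def average_access_ns(hit_rate, fast_ns, slow_ns):
    """Average access time when a fraction `hit_rate` of accesses
    is served by the fast portion and the rest by the slow portion."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_ns + (1.0 - hit_rate) * slow_ns


FAST_NS = 3.0    # illustrative fast-memory access time
SLOW_NS = 200.0  # illustrative slow-memory access time

# Hypothetical hit rates, chosen only to show the trend.
for hit_rate in (0.0, 0.5, 0.9, 0.99, 1.0):
    avg = average_access_ns(hit_rate, FAST_NS, SLOW_NS)
    print(f"hit rate {hit_rate:.2f}: average access {avg:.1f} ns")
```

The average only approaches the fast time when the fast portion is used for nearly every access, which is why the arrangement matters.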
In this chapter we will focus on
-  Memory Technology (different access times)
-   DRAM (dynamic random access memory) 
-  SRAM (static random access memory).  
Compare VHS tape (rewinding and fast forward) 
to DVD.
-  Access time
-  Registers
-  Caches (but not the Cache Organization topics)

Fig. 3-1: Cache lines can come from different parts of memory
© O'Reilly Publishers (Used with permission)
Interleaved and Pipelined Memory Systems
Fig. 3-9 Multibanked memory system
 
 
© O'Reilly Publishers (Used with permission)
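A multibanked (interleaved) memory spreads consecutive addresses across banks, so slow banks take turns in round-robin fashion. The sketch below shows one common mapping, low-order interleaving. The figure in the text may use a different layout, and the bank count and function name here are hypothetical.

```python
NUM_BANKS = 4  # hypothetical number of memory banks


def bank_and_offset(word_address, num_banks=NUM_BANKS):
    """Map a word address to (bank index, offset within that bank)
    using low-order interleaving."""
    return word_address % num_banks, word_address // num_banks


# Consecutive addresses land in different banks, so each bank has time
# to recover while the others are being accessed.
for addr in range(8):
    bank, offset = bank_and_offset(addr)
    print(f"address {addr} -> bank {bank}, offset {offset}")
```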