CS 575 Cray Architecture

The intent is to gain an appreciation for High Performance Computing (HPC), using the Cray C90 as a sample platform. Students come to understand the subtleties of the C90's design from a programmer's point of view, with the goal of understanding why the Cray C90 is fast and how to quantify that speed. Most of these concepts transfer easily to other HPC platforms, even massively parallel ones, since one of the first steps in parallelizing a code is to find the vectorizable loops.
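To make "vectorizable loop" concrete, here is a minimal sketch in C. It is an illustration only, not taken from the course materials, and the function names saxpy and prefix_sum are hypothetical. The first loop has independent iterations, so a vectorizing compiler can generally process many elements at once. The second carries a dependence from one iteration to the next, which typically prevents straightforward vectorization. The restrict qualifier (C99) is an assumption added here so the compiler may treat x and y as non-overlapping.

```c
#include <stddef.h>

/* Independent iterations: y[i] depends only on x[i] and the old y[i].
 * With restrict promising that x and y do not overlap, a vectorizing
 * compiler is free to compute many elements per instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Loop-carried dependence: s[i] needs the s[i - 1] produced by the
 * previous iteration, so the iterations cannot simply run side by side. */
void prefix_sum(size_t n, float *s)
{
    for (size_t i = 1; i < n; i++) {
        s[i] = s[i] + s[i - 1];
    }
}
```

Both functions compute correct results either way. The difference lies only in how much freedom the compiler has to reorder or batch the iterations, which is the property to look for when scanning a code for vectorizable loops.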

Another excellent source is available from Cray Research Inc.: TR-OPT 1.0 (E), CF77 and Cray Standard C Optimization for Parallel-Vector Systems. This Cray-specific training report covers the Fortran and C compilers. It provides code examples, diagrams, and explanations of crucial vectorization topics (including the conditions that inhibit the compiler), memory organization, performance tools, common optimization techniques, and much more. More information about Cray is available online at the Cray Research Inc. website, http://www.cray.com.