Gaussian Elimination
CS 575 - October 9, 2002

Kris Stewart

NOTE: This is the first of a series of handouts providing the background for your second computational experiment and report. It covers Gaussian Elimination. It mentions the condition number, but doesn't develop it.

We want to examine the algorithm most often used to solve the linear system of equations

A x = b

The algorithm is called Gaussian Elimination and can be posed in many different forms. When you were first introduced to this algorithm, you may have worked with a matrix A whose entries were all integers and the steps were performed using nice fractions (like 1/4, 1/2, even 1/3 - but nothing nasty like 11/37). You may have been taught to rearrange the rows so that you can work with "nicer" values - e.g. when you are doing arithmetic by hand, 1 is a very nice number and 0.48215 is not so easy to work with. This is not the typical situation in scientific computation on a computer.

There is actually a field of mathematics in which the solution of integer-coefficient linear systems with integer-valued solutions is studied - Diophantine equations. But these are rare.

You will be given code in both C and Fortran to construct and solve a system of linear equations. I'd like to provide a quick summary of the algorithm, just in case it's been a while.

The goal of Gaussian elimination is to transform the linear system of equations

A x = b

into an equivalent linear system

U x = y

where U is an upper triangular matrix. The reason is that it is a straightforward process to solve Ux = y.

          | u11  u12  u13  u14 | | x1 |     | y1 |
          |  0   u22  u23  u24 | | x2 |     | y2 |
          |  0    0   u33  u34 | | x3 | =   | y3 |
          |  0    0    0   u44 | | x4 |     | y4 |

      y4
x4 = ---
     u44

     y3 - u34*x4
x3 = -----------
         u33

     y2 - u24*x4 - u23 * x3
x2 = ----------------------
              u22

     y1 - u14*x4 - u13 * x3 - u12 * x2
x1 = ---------------------------------
                     u11

How do we find this "equivalent problem Ux=y"? Let's take a look at our A matrix:

The available operations are multiplying a row by a scalar (a number) and adding (or subtracting) one row from another.

Step 1: We will cause a21, a31, a41 to become zeros by subtracting the appropriate multiples of the first row of A.

       row2 <- row2 - l21 * row1    l21 = a21/a11
       row3 <- row3 - l31 * row1    l31 = a31/a11
       row4 <- row4 - l41 * row1    l41 = a41/a11
Our matrix will now have the form:
        a11 a12 a13 a14
         0  u22 u23 u24
         0  u32 u33 u34
         0  u42 u43 u44
where we are using "uij" to indicate that the values in the second through fourth rows have been changed by the row arithmetic indicated above. Also, to be efficient, we do not actually "compute" the zeros that appear in column one. We know we have found the multipliers (lij's) that will produce these zeros (to the precision of the floating point arithmetic), so the actual arithmetic indicated above is performed only on columns [2:4] of rows 2 through 4.

Step 2: Now we focus on the subproblem containing the 3 by 3 matrix

        a11 a12 a13 a14
           ------------
         0 |u22 u23 u24
         0 |u32 u33 u34
         0 |u42 u43 u44
We repeat similar row operations to reduce u32 and u42 to zeros, in this case
      row3 <- row3 - l32 * row2     l32 = u32/u22
      row4 <- row4 - l42 * row2     l42 = u42/u22
Again, we don't really work with all of row3 and row4: since we are in the "midst" of Gaussian elimination, we know the zeros have already appeared in column 1. Similar to step 1 above, arithmetic is performed only on columns [3:4] because we know we have computed the multipliers (lij's) so that u32 and u42 will become zero to the precision of the machine.

Step 3: Our matrix now has the form

        a11 a12 a13 a14
         0  u22 u23 u24
         --------------
         0   0 |v33 v34
         0   0 |v43 v44
We use the letter 'v' to indicate that the lower 2 by 2 block has changed from its value in Step 2. There is only one computation to be done now to reduce v43 to zero.
     row4 <- row4 - l43 * row3     l43 = v43/v33
Yielding our upper triangular matrix
         a11 a12 a13 a14
          0  u22 u23 u24
          0   0  v33 v34
          0   0   0  w44
We've used different letters (u, v, w) to help distinguish the changing values of the intermediate stages on the way to the upper triangular form.
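The three steps above generalize to an n-by-n matrix as a triple loop. Here is a minimal C sketch (ours, not the code you will be given): it stores each multiplier l(i,k) in the position it just zeroed, and it assumes no row interchanges are needed, i.e. every pivot a(k,k) is nonzero.

```c
#include <stddef.h>

/* Reduce A to upper triangular form in place, storing the
   multipliers l(i,k) in the zeroed-out lower triangle.
   A sketch with no pivoting: assumes a(k,k) != 0 at each step. */
void factor(size_t n, double a[n][n])
{
    for (size_t k = 0; k + 1 < n; k++) {        /* Step k+1 */
        for (size_t i = k + 1; i < n; i++) {
            double l = a[i][k] / a[k][k];       /* multiplier l(i,k) */
            a[i][k] = l;                        /* save it for the b-update */
            for (size_t j = k + 1; j < n; j++)  /* only columns k+1 .. n-1 */
                a[i][j] -= l * a[k][j];
        }
    }
}
```

Note that, just as in the text, no arithmetic is spent computing the known zeros: the inner loop starts at column k+1.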

What's been left out? - Ux = y - where does y come from?

You may have been introduced to the idea of the "augmented matrix"

[A | b]

where the row operations indicated above in Step1 ... Step(n-1) would be applied not only to A to introduce the zeros in the lower triangle, but also applied to the vector b. The result would be

[U | y]

where y indicates the changes that occur to the vector b by applying the row operations above.

We need to examine the operations applied to the original b to produce y (note we've computed the multipliers in the process of reducing A to upper triangular U):

Step 1: perform n-1 scalar operations

             y2 <- b2 - l21 * b1
                  .
                  .
                  .
             yn <- bn - ln1 * b1

        n-1 multiplies and n-1 adds
Step 2: perform n-2 scalar operations
             y3 <- y3 - l32 * y2
                  .
                  .
                  .
             yn <- yn - ln2 * y2

        n-2 multiplies and n-2 adds
Step n-1: perform 1 scalar operation
             yn <- yn - l(n,n-1) * y(n-1)

                   n(n-1)
The Cost in FLOPS: ------ adds and multiplies
                      2
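If the multipliers were saved in the zeroed-out lower triangle, Steps 1 through n-1 above collapse into one double loop. A sketch under that storage assumption (the name forward_update is ours):

```c
#include <stddef.h>

/* Overwrite b with y by applying the saved multipliers l(i,k),
   stored below the diagonal of the factored matrix a.
   A sketch assuming the factor step stored them there. */
void forward_update(size_t n, double a[n][n], double b[n])
{
    for (size_t k = 0; k + 1 < n; k++)      /* Step k+1: n-1-k updates */
        for (size_t i = k + 1; i < n; i++)
            b[i] -= a[i][k] * b[k];         /* y(i) <- y(i) - l(i,k)*y(k) */
}
```

Counting the inner-loop trips gives exactly the (n-1) + (n-2) + ... + 1 = n(n-1)/2 multiply-add pairs tallied above.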

Last but not least, we have the process of actually solving

Ux = y


      | u11 u12 u13 ... u1n | | x1 |   | y1 |
      |  0  u22 u23 ... u2n | | x2 |   | y2 |
      |         .           | | x3 | = | y3 |
      |           . ...     | | .  |   | .  |
      |             .       | | .  |   | .  |
      |  0    0         unn | | xn |   | yn |

We start from the lower right corner:

        y(n)
x(n) = ------
       u(n,n)

         y(n-1) - u(n-1,n)*x(n)
x(n-1) = ----------------------
               u(n-1,n-1)

         y(n-2) - u(n-2,n)*x(n) - u(n-2,n-1)*x(n-1)
x(n-2) = -----------------------------------------
                      u(n-2,n-2)


                             ...


       y(1) - u(1,n)*x(n) - u(1,n-1)*x(n-1) - ... - u(1,2)*x(2)
x(1) = --------------------------------------------------------
                                u(1,1)
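The general back substitution can be sketched in C as follows (the name back_solve is ours). It assumes every diagonal entry u(i,i) is nonzero and overwrites y with x:

```c
#include <stddef.h>

/* Solve U x = y by back substitution, starting from the lower
   right corner; x overwrites y.  A sketch assuming u(i,i) != 0. */
void back_solve(size_t n, double u[n][n], double y[n])
{
    for (size_t i = n; i-- > 0; ) {         /* i = n-1 down to 0 */
        for (size_t j = i + 1; j < n; j++)
            y[i] -= u[i][j] * y[j];         /* subtract the known terms */
        y[i] /= u[i][i];                    /* divide by the diagonal */
    }
}
```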

Mathematically, there are several ways to solve Ax=b.

a) Form A-inverse and then x = A-inverse * b

A BAD TECHNIQUE NUMERICALLY! It costs 4n^3/3 operations to form A-inverse and you still have to perform the matrix-vector multiply with "b", costing n^2 ops. From a sensitivity point of view, this technique can have dire consequences.

b) Factor A = L U and then Solve using [L U b]

This is the technique you will use in your upcoming computational experiment.

Cost for the factor is n^3/3 and for the solve is n^2 - cheaper by a factor of 4 than forming the inverse on the most costly portion!
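Putting the pieces together, option (b) can be sketched as one routine: factor in place, update b, back-substitute. This is an illustrative sketch (the name lu_solve and the example data are ours, not the code you will be handed), and it does no pivoting, so it assumes every pivot stays nonzero:

```c
#include <stddef.h>

/* Option (b) end to end: factor A = LU in place (multipliers
   overwrite the lower triangle), update b to y, then solve
   U x = y; x overwrites b.  Sketch: no pivoting. */
void lu_solve(size_t n, double a[n][n], double b[n])
{
    /* factor: ~ n^3/3 multiply-adds */
    for (size_t k = 0; k + 1 < n; k++)
        for (size_t i = k + 1; i < n; i++) {
            a[i][k] /= a[k][k];              /* multiplier l(i,k) */
            for (size_t j = k + 1; j < n; j++)
                a[i][j] -= a[i][k] * a[k][j];
        }

    /* forward update of b: ~ n^2/2 multiply-adds */
    for (size_t k = 0; k + 1 < n; k++)
        for (size_t i = k + 1; i < n; i++)
            b[i] -= a[i][k] * b[k];

    /* back substitution: ~ n^2/2 multiply-adds */
    for (size_t i = n; i-- > 0; ) {
        for (size_t j = i + 1; j < n; j++)
            b[i] -= a[i][j] * b[j];
        b[i] /= a[i][i];
    }
}
```

A payoff of keeping the factor and solve separate: once A is factored, each additional right-hand side b costs only the n^2 solve, not another n^3/3 factorization.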
