We want to examine the algorithm most often used to solve the linear system of equations Ax = b.
The algorithm is called Gaussian Elimination and can be posed in many different forms. When you were first introduced to this algorithm, you may have worked with a matrix A whose entries were all integers and the steps were performed using nice fractions (like 1/4, 1/2, even 1/3 - but nothing nasty like 11/37). You may have been taught to rearrange the rows so that you can work with "nicer" values - e.g. when you are doing arithmetic by hand, 1 is a very nice number and 0.48215 is not so easy to work with. This is not the typical situation in scientific computation on a computer.
There is actually a field of mathematics where the solution of integer-coefficient linear systems with integer-valued solutions is studied - Diophantine equations. But such systems are rare in scientific computing.
You will be given code in both C and Fortran to construct and solve a system of linear equations. I'd like to provide a quick summary of the algorithm, just in case it's been a while.
The goal of Gaussian elimination is to transform the linear system Ax = b into an equivalent problem Ux = y, where U is an upper triangular matrix. The reason is that it is a straightforward process to solve Ux = y:
| u11 u12 u13 u14 | | x1 |   | y1 |
|  0  u22 u23 u24 | | x2 |   | y2 |
|  0   0  u33 u34 | | x3 | = | y3 |
|  0   0   0  u44 | | x4 |   | y4 |
x4 = y4 / u44
x3 = (y3 - u34*x4) / u33
x2 = (y2 - u24*x4 - u23*x3) / u22
x1 = (y1 - u14*x4 - u13*x3 - u12*x2) / u11
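To make this concrete, here is a minimal C sketch of the 4-by-4 back substitution above. The matrix and right-hand side values are made-up sample data, and C's 0-based indexing shifts every subscript down by one:

#include <stdio.h>

int main(void)
{
    /* Sample upper triangular system Ux = y (0-based indexing,
       so u[0][0] plays the role of u11). */
    double u[4][4] = { {2, 1, 1, 0},
                       {0, 3, 1, 2},
                       {0, 0, 4, 1},
                       {0, 0, 0, 5} };
    double y[4] = {4, 8, 9, 10};
    double x[4];

    x[3] = y[3] / u[3][3];
    x[2] = (y[2] - u[2][3]*x[3]) / u[2][2];
    x[1] = (y[1] - u[1][3]*x[3] - u[1][2]*x[2]) / u[1][1];
    x[0] = (y[0] - u[0][3]*x[3] - u[0][2]*x[2] - u[0][1]*x[1]) / u[0][0];

    for (int i = 0; i < 4; i++)
        printf("x%d = %g\n", i + 1, x[i]);
    return 0;
}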
How do we find this "equivalent problem Ux = y"? Let's take a look at our A matrix:

| a11 a12 a13 a14 |
| a21 a22 a23 a24 |
| a31 a32 a33 a34 |
| a41 a42 a43 a44 |
The available operations are multiplying a row by a scalar (a number) and adding (or subtracting) one row from another.
Step 1: We will cause a21, a31, a41 to become zeros by subtracting the appropriate multiple of the first row of A.
row2 <- row2 - l21 * row1     where l21 = a21/a11
row3 <- row3 - l31 * row1           l31 = a31/a11
row4 <- row4 - l41 * row1           l41 = a41/a11
Our matrix will now have the form:
a11 a12 a13 a14
 0  u22 u23 u24
 0  u32 u33 u34
 0  u42 u43 u44
where we are using "uij" to indicate that the values in the second through fourth rows have been changed by the row arithmetic indicated above. Also, to be efficient, we do not actually "compute" the zeros appearing in column one. We know we have found the multipliers (the lij's) that will produce these zeros (to the precision of the floating point arithmetic), so the actual arithmetic indicated above is performed only on columns [2:4] of rows 2 through 4.
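In C, Step 1 might look like the following sketch (a is assumed to be a 4x4 array with 0-based indexing, so a[1][0] plays the role of a21; the function name step1 is just for illustration). Storing each multiplier in the slot whose zero it creates is a common convention, though not required:

/* Step 1: eliminate column 1 below the diagonal. */
void step1(double a[4][4])
{
    for (int i = 1; i < 4; i++) {
        double l = a[i][0] / a[0][0];  /* multiplier l_{i1}             */
        a[i][0] = l;                   /* record l; don't compute the 0 */
        for (int j = 1; j < 4; j++)    /* arithmetic only on cols [2:4] */
            a[i][j] -= l * a[0][j];
    }
}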
Step 2: Now we focus on the subproblem containing the 3 by 3 matrix
a11 a12 a13 a14
    ------------
 0 |u22 u23 u24
 0 |u32 u33 u34
 0 |u42 u43 u44
We repeat similar row operations to reduce u32 and u42 to zeros; in this case

row3 <- row3 - l32 * row2     where l32 = u32/u22
row4 <- row4 - l42 * row2           l42 = u42/u22
Again, we don't really work with all of row3 and row4: since we are in the "midst" of Gaussian elimination, we know that the zeros have already appeared in column 1. As in Step 1 above, arithmetic is performed only on columns [3:4], because we have computed the multipliers (the lij's) so that u32 and u42 will become zero to the precision of the machine.
Step 3: Our matrix now has the form
a11 a12 a13 a14
 0  u22 u23 u24
        --------
 0   0 |v33 v34
 0   0 |v43 v44
We use the letter 'v' to indicate that the lower 2 by 2
block has changed from its value in Step 2. There is only
one computation to be done now to reduce v43 to zero.
row4 <- row4 - l43 * row3     where l43 = v43/v33
Yielding our upper triangular matrix
a11 a12 a13 a14
 0  u22 u23 u24
 0   0  v33 v34
 0   0   0  w44
We've used different letters (u, v, w) to help distinguish
the changing values of the intermediate stages on the way to
the upper triangular form.
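Putting Steps 1 through 3 together for a general n-by-n matrix, the whole reduction is a triple loop. Here is a sketch in C (0-based indexing; the name lu_factor and the in-place storage of the multipliers are choices of this sketch, not the only way to write it). Note that no pivoting (row interchanges) is done here, so the sketch assumes every pivot a[k][k] stays safely away from zero:

/* Gaussian elimination without pivoting.  On exit the upper
   triangle of a holds U and the strict lower triangle holds the
   multipliers l_{ij} (the unit diagonal of L is implicit). */
void lu_factor(int n, double a[n][n])
{
    for (int k = 0; k < n - 1; k++) {        /* Step k+1              */
        for (int i = k + 1; i < n; i++) {
            double l = a[i][k] / a[k][k];    /* multiplier            */
            a[i][k] = l;                     /* store l in the 0 slot */
            for (int j = k + 1; j < n; j++)  /* trailing columns only */
                a[i][j] -= l * a[k][j];
        }
    }
}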
What's been left out? - Ux = y - where does y come from?
You may have been introduced to the idea of the "augmented matrix" [A | b], where the row operations indicated above in Step 1 ... Step (n-1) are applied not only to A to introduce the zeros in the lower triangle, but also to the vector b. The result would be the augmented upper triangular system [U | y].
We need to examine the operations applied to the original b to produce y (note we've computed the multipliers in the process of reducing A to upper triangular U):
Step 1: perform n-1 scalar operations (y1 is just b1; row 1 is never modified)

y2 <- b2 - l21 * b1
   ...
yn <- bn - ln1 * b1

n-1 multiplies and n-1 adds

Step 2: perform n-2 scalar operations

y3 <- y3 - l32 * y2
   ...
yn <- yn - ln2 * y2

n-2 multiplies and n-2 adds

...

Step n-1: perform 1 scalar operation

yn <- yn - l(n,n-1) * y(n-1)

The cost in FLOPS: n(n-1)/2 adds and n(n-1)/2 multiplies.
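In code, this forward sweep is two nested loops. A C sketch, assuming the multipliers were stored in the strict lower triangle of a (as in the lu_factor sketch above) and that y overwrites b in place:

/* Apply the saved multipliers to the right-hand side,
   transforming b into y in place. */
void forward_apply(int n, double a[n][n], double b[n])
{
    for (int k = 0; k < n - 1; k++)      /* Step k+1 */
        for (int i = k + 1; i < n; i++)
            b[i] -= a[i][k] * b[k];      /* y_i <- y_i - l_{i,k+1} * y_{k+1} */
}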
Last but not least, we have the process of actually solving
| u11 u12 u13 ... u1n | | x1 |   | y1 |
|  0  u22 u23 ... u2n | | x2 |   | y2 |
|  0   0  u33 ... u3n | | x3 | = | y3 |
|  .           .   .  | |  . |   |  . |
|  0   0   0 ... unn  | | xn |   | yn |
We start from the lower right corner:
x(n)   = y(n) / u(n,n)
x(n-1) = (y(n-1) - u(n-1,n)*x(n)) / u(n-1,n-1)
x(n-2) = (y(n-2) - u(n-2,n)*x(n) - u(n-2,n-1)*x(n-1)) / u(n-2,n-2)
   ...
x(1)   = (y(1) - u(1,n)*x(n) - u(1,n-1)*x(n-1) - ... - u(1,2)*x(2)) / u(1,1)
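The same recurrence in C (0-based indexing; back_solve is an illustrative name, and the solution overwrites y):

/* Back substitution: solve Ux = y, where the upper triangle of a
   holds U.  On exit y holds the solution x. */
void back_solve(int n, double a[n][n], double y[n])
{
    for (int i = n - 1; i >= 0; i--) {
        for (int j = i + 1; j < n; j++)
            y[i] -= a[i][j] * y[j];   /* subtract the known terms */
        y[i] /= a[i][i];              /* divide by the diagonal   */
    }
}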
Mathematically, there are several ways to solve Ax=b.
a) Form A-inverse and then x = A-inverse * b.
A BAD TECHNIQUE NUMERICALLY! It costs 4n^3/3 operations to form A-inverse, and you still have to perform the matrix-vector multiply with b, costing n^2 ops. From a sensitivity point of view, this technique can have dire consequences.
b) Factor A = L U and then solve using [L, U, b].
This is the technique you will use in your upcoming computational experiment.
Cost for the factor is n^3/3 and for the solve is n^2 - cheaper by a factor of 4 than forming the inverse on the most costly portion!
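As a usage sketch, here is how the three routines sketched above (lu_factor, forward_apply, back_solve - illustrative names from this summary, not a standard API) would combine to solve a small sample system whose exact solution is x = (1, 1, 1):

#include <stdio.h>

/* lu_factor, forward_apply, back_solve as sketched earlier */

int main(void)
{
    enum { N = 3 };
    double a[N][N] = { {4, 2, 1},
                       {2, 5, 2},
                       {1, 2, 6} };  /* sample matrix     */
    double b[N]    = {7, 9, 9};      /* sample right side */

    lu_factor(N, a);         /* A -> U and the multipliers, in place */
    forward_apply(N, a, b);  /* b -> y                               */
    back_solve(N, a, b);     /* y -> x (overwrites b)                */

    for (int i = 0; i < N; i++)
        printf("x%d = %g\n", i + 1, b[i]);
    return 0;
}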