We want to examine the algorithm most often used to solve the linear system of equations Ax = b.
The algorithm is called Gaussian elimination and can be posed in many different forms. When you were first introduced to this algorithm, you may have worked with a matrix A whose entries were all integers, and the steps were performed using nice fractions (like 1/4, 1/2, even 1/3, but nothing nasty like 11/37). You may have been taught to rearrange the rows so that you could work with "nicer" values; when you are doing arithmetic by hand, 1 is a very nice number and 0.48215 is not so easy to work with. This is not the typical situation in scientific computation on a computer.
There is actually a field of mathematics where the solution of integer coefficient linear systems with integer valued solutions is studied - Diophantine equations. But such systems are rare in practice.
You will be given code in both C and Fortran to construct and solve a system of linear equations. I'd like to provide a quick summary of the algorithm, just in case it's been a while.
The goal of Gaussian elimination is to transform the linear system Ax = b into an equivalent system Ux = y, where U is an upper triangular matrix. The reason is that it is a straightforward process to solve Ux = y:
    | u11 u12 u13 u14 | | x1 |   | y1 |
    |  0  u22 u23 u24 | | x2 |   | y2 |
    |  0   0  u33 u34 | | x3 | = | y3 |
    |  0   0   0  u44 | | x4 |   | y4 |

    x4 = y4 / u44
    x3 = (y3 - u34*x4) / u33
    x2 = (y2 - u24*x4 - u23*x3) / u22
    x1 = (y1 - u14*x4 - u13*x3 - u12*x2) / u11
How do we find this "equivalent problem Ux = y"? Let's take a look at our A matrix:
The available operations are multiplying a row by a scalar (a number) and adding (or subtracting) one row from another.
Step 1: We will cause a21, a31, a41 to become zeros by subtracting appropriate multiples of the first row of A.
    row2 <- row2 - l21 * row1     l21 = a21/a11
    row3 <- row3 - l31 * row1     l31 = a31/a11
    row4 <- row4 - l41 * row1     l41 = a41/a11

Our matrix will now have the form:
    a11 a12 a13 a14
     0  u22 u23 u24
     0  u32 u33 u34
     0  u42 u43 u44

where we are using "uij" to indicate that the values in the second through fourth rows have been changed by the row arithmetic indicated above. Also, to be efficient, we do not actually "compute" the zeros that appear in column one. We know we have found the multipliers (lij's) that will produce these zeros (to the precision of the floating point arithmetic), so the actual arithmetic indicated above is performed only on columns [2:4] of rows 2 through 4.
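As a sketch, Step 1 might look like this in C. The helper name eliminate_step1 and the row-major double a[n][n] layout are assumptions for illustration, not the provided course code; each multiplier is stored in place of the entry it eliminates, and the zeros in column one are never computed explicitly.

```c
#include <stddef.h>

/* Hypothetical sketch of Step 1 on an n x n matrix a (row-major).
   Zeros column 0 below the diagonal, storing each multiplier l_i1
   where the eliminated entry sat; only columns [1:n-1] are updated. */
void eliminate_step1(size_t n, double a[n][n])
{
    for (size_t i = 1; i < n; i++) {
        double l = a[i][0] / a[0][0];   /* multiplier l_i1 = a_i1 / a_11 */
        a[i][0] = l;                    /* keep l_i1 in place of the "zero" */
        for (size_t j = 1; j < n; j++)
            a[i][j] -= l * a[0][j];     /* row_i <- row_i - l_i1 * row_1 */
    }
}
```

Note that row 1 itself is never modified; it is the pivot row for this step.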
Step 2: Now we focus on the subproblem containing the 3 by 3 matrix
    a11 a12 a13 a14
        ------------
     0 |u22 u23 u24
     0 |u32 u33 u34
     0 |u42 u43 u44

We repeat similar row operations to reduce u32 and u42 to zeros, in this case
    row3 <- row3 - l32 * row2     l32 = u32/u22
    row4 <- row4 - l42 * row2     l42 = u42/u22

Again, we don't really work with all of row3 and row4: since we are in the midst of Gaussian elimination, we know the zeros have already appeared in column 1. As in Step 1 above, arithmetic is performed only on columns [3:4], because we have computed the multipliers (lij's) so that u32 and u42 will become zero to the precision of the machine.
Step 3: Our matrix now has the form
    a11 a12 a13 a14
     0  u22 u23 u24
        -----------
     0   0 |v33 v34
     0   0 |v43 v44

We use the letter 'v' to indicate that the lower 2 by 2 block has changed from its value in Step 2. There is only one computation to be done now, to reduce v43 to zero.
    row4 <- row4 - l43 * row3     l43 = v43/v33

yielding our upper triangular matrix
    a11 a12 a13 a14
     0  u22 u23 u24
     0   0  v33 v34
     0   0   0  w44

We've used different letters (u, v, w) to help distinguish the changing values of the intermediate stages on the way to the upper triangular form.
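Putting all the steps together, the elimination can be sketched in C as the classic triple loop. The name gauss_eliminate is hypothetical, and the sketch assumes no row rearrangement (pivoting) and nonzero diagonal entries; after the call, the upper triangle of a holds U and the strict lower triangle holds the multipliers.

```c
#include <stddef.h>

/* Hypothetical sketch of the full elimination (no pivoting).
   On return: upper triangle of a = U, strict lower triangle = l_ij. */
void gauss_eliminate(size_t n, double a[n][n])
{
    for (size_t k = 0; k < n - 1; k++)          /* Step k+1 */
        for (size_t i = k + 1; i < n; i++) {
            double l = a[i][k] / a[k][k];       /* multiplier l_ik */
            a[i][k] = l;                        /* store l_ik in place */
            for (size_t j = k + 1; j < n; j++)  /* only columns [k+1:n-1] */
                a[i][j] -= l * a[k][j];
        }
}
```

Storing the multipliers in the zeroed positions costs nothing extra and keeps everything needed to process b later.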
What's been left out? We want to solve Ux = y, but where does y come from?
You may have been introduced to the idea of the "augmented matrix" [A | b],
where the row operations indicated above in Step 1 ... Step (n-1) would be applied not only to A to introduce the zeros in the lower triangle, but also applied to the vector b. The result would be [U | y].
We need to examine the operations applied to the original b to produce y (note we've computed the multipliers in the process of reducing A to upper triangular U):
Step 1: perform n-1 scalar operations

    y2 := b2 - l21 * b1
    .
    .
    .
    yn := bn - ln1 * b1

    n-1 multiplies and n-1 adds

Step 2: perform n-2 scalar operations

    y3 <- y3 - l32 * y2
    .
    .
    .
    yn <- yn - ln2 * y2

    n-2 multiplies and n-2 adds

Step n-1: perform 1 scalar operation

    yn <- yn - l(n,n-1) * y(n-1)

The cost in FLOPS: n(n-1)/2 adds and n(n-1)/2 multiplies.
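The same sequence of operations on b can be sketched in C. The helper name forward_update is an assumption; the sketch presumes the strict lower triangle of a holds the multipliers stored during elimination, and it transforms b into y in place.

```c
#include <stddef.h>

/* Hypothetical sketch: apply the stored multipliers to b, producing y
   in place. Step k+1 performs n-k-1 multiplies and n-k-1 adds. */
void forward_update(size_t n, double a[n][n], double y[n])
{
    for (size_t k = 0; k < n - 1; k++)
        for (size_t i = k + 1; i < n; i++)
            y[i] -= a[i][k] * y[k];      /* y_i <- y_i - l_ik * y_k */
}
```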
Last but not least, we have the process of actually solving
    | u11 u12 u13 ... u1n | | x1 |   | y1 |
    |  0  u22 u23 ... u2n | | x2 |   | y2 |
    |  .           ...    | | x3 | = | y3 |
    |  .                  | | .  |   | .  |
    |  0   0  ...     unn | | xn |   | yn |
We start from the lower right corner:
    x(n)   = y(n) / u(n,n)
    x(n-1) = (y(n-1) - u(n-1,n)*x(n)) / u(n-1,n-1)
    x(n-2) = (y(n-2) - u(n-2,n)*x(n) - u(n-2,n-1)*x(n-1)) / u(n-2,n-2)
    ...
    x(1)   = (y(1) - u(1,n)*x(n) - u(1,n-1)*x(n-1) - ... - u(1,2)*x(2)) / u(1,1)
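This back substitution can be sketched in C; back_substitute is a hypothetical name, and the sketch assumes u holds U in its upper triangle (entries below the diagonal are simply ignored).

```c
#include <stddef.h>

/* Hypothetical sketch of back substitution: solve Ux = y starting
   from the lower right corner. Entries of u below the diagonal are
   never touched, so u may also hold the stored multipliers. */
void back_substitute(size_t n, double u[n][n], double y[n], double x[n])
{
    for (size_t ii = 0; ii < n; ii++) {
        size_t i = n - 1 - ii;          /* i runs n-1, n-2, ..., 0 */
        double s = y[i];
        for (size_t j = i + 1; j < n; j++)
            s -= u[i][j] * x[j];        /* subtract the already-known terms */
        x[i] = s / u[i][i];
    }
}
```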
Mathematically, there are several ways to solve Ax=b.
a) Form A-inverse and then x = A-inverse * b
A BAD TECHNIQUE NUMERICALLY! It costs 4n^3/3 ops to form A-inverse, and you still have to perform the matrix-vector multiply with b, costing n^2 ops. From a sensitivity point of view, this technique can have dire consequences.
b) Factor A = LU and then solve using [L U b]
This is the technique you will use in your upcoming computational experiment.
Cost for the factor is n^3/3 and for the solve is n^2 - cheaper by a factor of 4 than forming the inverse on the most costly portion!
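Putting the pieces together, approach (b) can be sketched as one routine. The name lu_solve is hypothetical and the sketch omits pivoting (which practical codes need); it overwrites a with the factors and b with the solution x, combining the n^3/3 factorization with the two n^2-cost triangular phases.

```c
#include <stddef.h>

/* Hypothetical sketch of factor-then-solve (no pivoting).
   On return, a holds L (strict lower) and U (upper); b holds x. */
void lu_solve(size_t n, double a[n][n], double b[n])
{
    /* factor A = LU: ~n^3/3 multiply-add pairs */
    for (size_t k = 0; k < n - 1; k++)
        for (size_t i = k + 1; i < n; i++) {
            double l = a[i][k] / a[k][k];
            a[i][k] = l;
            for (size_t j = k + 1; j < n; j++)
                a[i][j] -= l * a[k][j];
        }
    /* forward update of b into y, then back substitution: ~n^2 total */
    for (size_t k = 0; k < n - 1; k++)
        for (size_t i = k + 1; i < n; i++)
            b[i] -= a[i][k] * b[k];
    for (size_t ii = 0; ii < n; ii++) {
        size_t i = n - 1 - ii;
        for (size_t j = i + 1; j < n; j++)
            b[i] -= a[i][j] * b[j];
        b[i] /= a[i][i];
    }
}
```

Keeping the factors around is also what lets you solve for many right-hand sides at n^2 cost each, without refactoring A.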