FNC Sections 2.1 to 2.5

Andrew Pua

2023-08-24

Learning objectives

  • Learn Julia through solving systems of linear equations numerically
  • Get a sense of numerical issues arising from solving systems of linear equations
  • Learn a measure of computational time

Why bother with systems of linear equations?

  • Motivating example: Polynomial interpolation (this is a bit different from polynomial fitting)

  • Numerical issues not encountered when solving by hand:

    • Near non-invertibility can create problems.
    • Subtractive cancellation shows up here too.

Key theory

  • Let \(\mathbf{A}\) be an \(n\times n\) matrix, \(\mathbf{x}\) and \(\mathbf{b}\) be \(n\times 1\) vectors.
  • Task of chapter: Solve for \(\mathbf{x}\) in \(\mathbf{A}\mathbf{x}=\mathbf{b}\).
  • Easy from the armchair: \(\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}\)
  • In practice: Takes a lot of time to figure out \(\mathbf{A}^{-1}\) even by hand

A way out

  • Wouldn’t it be easier to solve a linear system if there were zeros in strategic places?
  • Example: \[x_1=0,\ x_1+x_2=1\]
  • In matrices: \[\begin{bmatrix}1 & 0 \\ 1 & 1\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix}\]

A way out, more generally

  • Leads to the ideas of forward substitution and backward substitution!
  • Find an \(\mathbf{LU}\) factorization of \(\mathbf{A}\) so that \(\mathbf{A}=\mathbf{LU}\).
  • \(\mathbf{L}\): unit lower triangular matrix; \(\mathbf{U}\): upper triangular matrix.
  • \(\mathbf{Ax}=\mathbf{b} \implies \mathbf{L}\underbrace{\mathbf{Ux}}_{\mathbf{z}}=\mathbf{b}\).
  • Solve \(\mathbf{L}\mathbf{z}=\mathbf{b}\) and find \(\mathbf{z}\) first.
  • Then solve \(\mathbf{Ux}=\mathbf{z}\) and find \(\mathbf{x}\).
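The two-step solve is short enough to sketch in full. Below is a minimal plain-Python rendering of the idea (the helper names forward_sub and backward_sub are illustrative only; FNC supplies Julia equivalents such as FNC.backsub()):

```python
# Minimal sketch: solve A x = b given A = L U, assuming nonzero diagonals.
# These helpers are illustrative, not part of the FNC package.

def forward_sub(L, b):
    """Solve L z = b for lower triangular L, top row first."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * z[j] for j in range(i))
        z[i] = (b[i] - s) / L[i][i]
    return z

def backward_sub(U, z):
    """Solve U x = z for upper triangular U, bottom row first."""
    n = len(z)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (z[i] - s) / U[i][i]
    return x

# The 2x2 example from the previous slide: x1 = 0, x1 + x2 = 1.
L = [[1.0, 0.0], [1.0, 1.0]]
print(forward_sub(L, [0.0, 1.0]))  # [0.0, 1.0]
```

Given both factors, solving \(\mathbf{Ax}=\mathbf{b}\) is then just backward_sub(U, forward_sub(L, b)).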

Aside: Weaving Julia in Rmd/qmd

  • Install Julia and the book-specific package FundamentalsNumericalComputation using the instructions here. Take note of the path to the Julia binary (find a folder named bin).
  • Install the latest version of the R package JuliaCall, otherwise, there will be errors described here. Below you will find a code chunk for installing JuliaCall.
install.packages('devtools') # install if you do not have this package
devtools::install_github('Non-Contradiction/JuliaCall')
  • Point to the Julia binary to ensure that Julia can work within R.
if (!require('JuliaCall')) stop("You do not have JuliaCall. It can be installed using the following command: devtools::install_github('Non-Contradiction/JuliaCall')")
# change this for your situation
julia_setup(JULIA_HOME = "/home/apua/julia-1.9.2/bin")
  • Example of a Julia code chunk:
a = sqrt(2)
1.4142135623730951
  • Example of a Julia command inside an R code chunk:
julia_command("a = sqrt(2);")
julia_eval("a")
[1] 1.414214

Learn Julia through Exercise 2.1.2

Problem: Let \(c_1,\ldots,c_4\) be unknown constants. Find a cubic polynomial in \(t\), i.e., \[p\left(t\right)=c_1+c_2t+c_3t^2+c_4t^3\] so that \[p\left(-1\right)=-2,\ p^\prime\left(-1\right)=1,\ p\left(1\right)=0,\ p^\prime\left(1\right)=-1.\]

  • By hand: You have the following system of 4 equations in 4 unknowns \[\underbrace{\begin{bmatrix}1 & -1 & 1 & -1 \\ 0& 1 & -2 & 3 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{bmatrix}}_{\mathbf{A}}\underbrace{\begin{bmatrix}c_1\\ c_2\\ c_3\\ c_4\end{bmatrix}}_{\mathbf{x}}=\underbrace{\begin{bmatrix} -2\\1\\0\\-1 \end{bmatrix}}_{\mathbf{b}}\] After some algebra, we get \[c_1=-1/2,\ c_2=3/2,\ c_3=-1/2,\ c_4 = -1/2.\]
  • Using Julia:
b = [-2; 1; 0; -1] # OR b = [-2, 1, 0, -1]
4-element Vector{Int64}:
 -2
  1
  0
 -1
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3] 
4×4 Matrix{Int64}:
 1  -1   1  -1
 0   1  -2   3
 1   1   1   1
 0   1   2   3
A = [1 -1 1 -1
     0 1 -2 3
     1 1 1 1
     0 1 2 3] # alternative
4×4 Matrix{Int64}:
 1  -1   1  -1
 0   1  -2   3
 1   1   1   1
 0   1   2   3
x = A \ b
4-element Vector{Float64}:
 -0.5
  1.5
 -0.5
 -0.5
b - A*x # comparable to machine precision
4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0

Plots in Julia

Problem: We now have a cubic polynomial with the desired characteristics. Present a visualization.

  • Represent the cubic polynomial in Julia using FNC.Polynomial()
  • (Optional) ditto for the derivative
  • Draw curves over some range
  • Design the visualization
using FundamentalsNumericalComputation
# Desired cubic polynomial
p = FNC.Polynomial(x) 
Polynomial(-0.5 + 1.5*x - 0.5*x^2 - 0.5*x^3)
# Coefficients of derivative of cubic polynomial
d = [x[2]; 2x[3]; 3x[4]]
3-element Vector{Float64}:
  1.5
 -1.0
 -1.5
# Derivative of cubic polynomial
Dp =  FNC.Polynomial(d)
Polynomial(1.5 - 1.0*x - 1.5*x^2)
scatter([-1; 1], [-2; 0], label="Cubic info", legend=:topleft,
    xlabel=L"t", ylabel=L"p", 
    title="Exercise 2.1.2")
trange = range(-1, 1, length = 500);
yvals = p.(trange);
dyvals = Dp.(trange);
plot!(trange, yvals, label = L"p(t)")
scatter!([-1; 1], [1; -1], label="Derivative info", legend=:left, ylabel=L"p,\ p'")
plot!(trange, dyvals, label = L"p'(t)", l=:dash)

What is learned about Julia so far

  • representing vectors and matrices
  • some operations on vectors and matrices
  • . for element-wise (broadcast) computation and for accessing package-specific commands, e.g. FNC.Polynomial() (similar to R’s ::)
  • ! marking functions that mutate their arguments, e.g. plot! adds to an existing plot
  • : for Symbols used as option values (e.g. legend=:topleft) and for row/column access later
  • basic plotting and quite some options

Learn Julia through Exercise 2.3.7

Problem: Consider the following system of linear equations: \[\underbrace{\begin{bmatrix}1 & -1 & 0 & \alpha-\beta & \beta \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 &0 &1 & -1 \\ 0 & 0 & 0 & 0 & 1\end{bmatrix}}_{\mathbf{A}}\underbrace{\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}}_{\mathbf{x}}=\underbrace{\begin{bmatrix}\alpha \\0 \\ 0 \\ 0 \\ 1\end{bmatrix}}_{\mathbf{b}}\]

If we solve for \(\mathbf{x}\) by hand:

  • \(x_1=x_2=\cdots=x_5=1\) is the analytical solution using backward substitution.
  • But there is a big loss of accuracy numerically depending on the relative sizes of \(\alpha\) and \(\beta\).
  • This is another example of subtractive cancellation.

Use Function 2.3.5 (FNC.backsub()) to solve for \(\mathbf{x}\) with \(\alpha=0.1\) and \(\beta=10, 10^2, \ldots, 10^{12}\). Tabulate the values of \(\beta\) and \(|x_1-1|\).

  • Loop over values of \(\beta\)
  • Use element-wise exponentiation over a range 1:12
  • Input the \(\mathbf{A}\) (done in a slightly different way, using diagm() to build the matrix by diagonals)
  • Use FNC.backsub() to solve for \(\mathbf{x}\)
  • Tabulate results
using FundamentalsNumericalComputation
α = 0.1;
β = 10 .^ (1:12);
for β in β
  A = diagm(0 => ones(5), 1 => [-1, -1, -1, -1], 3 => [α - β, 0], 4 => [β])
  x = FNC.backsub(A, [α, 0, 0, 0, 1])
  @show [β abs(x[1] - 1.0)]
end
[β abs(x[1] - 1.0)] = [10.0 4.440892098500626e-16]
[β abs(x[1] - 1.0)] = [100.0 5.773159728050814e-15]
[β abs(x[1] - 1.0)] = [1000.0 2.275957200481571e-14]
[β abs(x[1] - 1.0)] = [10000.0 3.638200851696638e-13]
[β abs(x[1] - 1.0)] = [100000.0 5.820788295807233e-12]
[β abs(x[1] - 1.0)] = [1.0e6 2.3283153183228933e-11]
[β abs(x[1] - 1.0)] = [1.0e7 3.725291186640334e-10]
[β abs(x[1] - 1.0)] = [1.0e8 5.9604645663569045e-9]
[β abs(x[1] - 1.0)] = [1.0e9 2.384185793236071e-8]
[β abs(x[1] - 1.0)] = [1.0e10 3.8146972658470446e-7]
[β abs(x[1] - 1.0)] = [1.0e11 6.1035156250222045e-6]
[β abs(x[1] - 1.0)] = [1.0e12 2.4414062500088818e-5]
using FundamentalsNumericalComputation
AbsAcc = [];
α = 0.1;
β = 10 .^ (1:12);
for β in β
  A = diagm(0 => ones(5), 1 => [-1, -1, -1, -1], 3 => [α - β, 0], 4 => [β])
  x = FNC.backsub(A, [α, 0, 0, 0, 1])
  push!(AbsAcc, abs(x[1] - 1.0))
end
pretty_table((β=β,AbsAcc=AbsAcc),["β" "|x₁-1|"])
┌───────────────┬─────────────┐
│             β │      |x₁-1| │
├───────────────┼─────────────┤
│            10 │ 4.44089e-16 │
│           100 │ 5.77316e-15 │
│          1000 │ 2.27596e-14 │
│         10000 │  3.6382e-13 │
│        100000 │ 5.82079e-12 │
│       1000000 │ 2.32832e-11 │
│      10000000 │ 3.72529e-10 │
│     100000000 │  5.96046e-9 │
│    1000000000 │  2.38419e-8 │
│   10000000000 │   3.8147e-7 │
│  100000000000 │  6.10352e-6 │
│ 1000000000000 │  2.44141e-5 │
└───────────────┴─────────────┘

What’s inside FNC.backsub()?

  • Consider the following system of linear equations: \[\begin{bmatrix}U_{11} & U_{12} & U_{13} & U_{14} \\ 0 & U_{22} & U_{23} & U_{24} \\ 0 & 0& U_{33} & U_{34} \\ 0 & 0 & 0 & U_{44}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\\b_4\end{bmatrix}\]

  • Numerical issues

    • near zero or zero main diagonal entries
    • subtractive cancellation as seen earlier
  • The analytical solution is given by: \[\begin{eqnarray}x_4 &=& b_4/U_{44} \\ x_3 &=& \left(b_3-U_{34}x_4\right)/U_{33} \\ x_2 &=& \left(b_2-U_{23}x_3-U_{24}x_4\right)/U_{22} \\ x_1 &=& \left(b_1-U_{12}x_2-U_{13}x_3-U_{14}x_4\right)/U_{11} \end{eqnarray}\]

  • Look for patterns. Refer to code in Function 2.3.5.
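  • The pattern generalizes to the backward-substitution recurrence implemented by Function 2.3.5: \[x_i = \frac{1}{U_{ii}}\left(b_i - \sum_{j=i+1}^{n} U_{ij}\,x_j\right), \qquad i = n, n-1, \ldots, 1.\]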

What’s new so far?

  • Looping
  • @show
  • => which constructs a Pair object (used to specify diagonals in diagm())
  • push!() which inserts an element into a collection
  • indexing entries of matrices: e.g. U[1, 1] for \(U_{11}\)
  • implementing summations which “look natural”
  • increments: +=

Learn Julia through Exercise 2.4.5

Problem: Construct an \(\mathbf{LU}\) factorization so that \(\mathbf{U}\) is a unit upper triangular matrix instead. Write a function lufact2 that uses FNC.lufact() without modification.

  • \(\mathbf{A}=\mathbf{LU} \implies \mathbf{A}^T=\mathbf{U}^T\mathbf{L}^T\), where \(\mathbf{U}^T\) is unit lower triangular and \(\mathbf{L}^T\) is upper triangular.
  • So factor \(\mathbf{A}^T\) with FNC.lufact(): its unit lower triangular factor is \(\mathbf{U}^T\) and its upper triangular factor is \(\mathbf{L}^T\). Undo the transposes to recover \(\mathbf{L}\) and \(\mathbf{U}\).
function lufact2(A)
  UT, LT = FNC.lufact(A')
  return UpperTriangular(LT)', LowerTriangular(UT)'
end
lufact2 (generic function with 1 method)
# same as Exercise 2.1.2
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3]; 
L, U = lufact2(A);
L
4×4 LowerTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}:
 1.0   ⋅    ⋅    ⋅ 
 0.0  1.0   ⋅    ⋅ 
 1.0  2.0  4.0   ⋅ 
 0.0  1.0  4.0  4.0
U
4×4 UpperTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}:
 1.0  -1.0   1.0  -1.0
  ⋅    1.0  -2.0   3.0
  ⋅     ⋅    1.0  -1.0
  ⋅     ⋅     ⋅    1.0
A - L*U
4×4 Matrix{Float64}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Learn Julia through Exercise 2.4.6

Problem: Write a determinant() function using FNC.lufact().

  • \(\mathsf{det}\left(\mathbf{A}\right) = \mathsf{det}\left(\mathbf{L}\right)\mathsf{det}\left(\mathbf{U}\right)\)
  • The determinant of a triangular matrix is equal to the product of the diagonal entries.
  • \(\mathsf{det}\left(\mathbf{L}\right)=1\) because \(\mathbf{L}\) unit lower triangular matrix.
  • So, \(\mathsf{det}\left(\mathbf{A}\right)=\mathsf{det}\left(\mathbf{U}\right)=\prod_{i=1}^n U_{ii}\).
function determinant(A)
  L, U = FNC.lufact(A)
  detU = prod(diag(U))
  return detU
end
determinant (generic function with 1 method)
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3]; 
determinant(A), det(A)
(16.0, 16.0)

Looking inside FNC.lufact()

  • An alternative representation of matrix multiplication is as a sum of outer products.
  • Take the LU factorization: \[\mathbf{A}=\mathbf{LU}=\sum_{k=1}^n \boldsymbol{l}_k \mathbf{u}_k^T\]
  • Visually, partition \(\mathbf{L}\) into columns and \(\mathbf{U}\) into rows: \[\mathbf{A}=\underbrace{\begin{bmatrix}\boldsymbol{l}_1 & \boldsymbol{l}_2 & \cdots & \boldsymbol{l}_n\end{bmatrix}}_{\mathbf{L}}\underbrace{\begin{bmatrix}\mathbf{u}_1^T\\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_n^T\end{bmatrix}}_{\mathbf{U}}\]
  • For the LU decomposition, the triangularity imposes a very convenient and useful structure.
  • The \(\boldsymbol{l}\)’s and \(\mathbf{u}\)’s have zeros in them.
  • Our aim is to find what all these \(\boldsymbol{l}\)’s and \(\mathbf{u}\)’s look like, given that we know what \(\mathbf{A}\) looks like.
  • There are two key observations which exploit the triangularity.
  1. Use the outer product representation piece by piece: Suppose you already know \(\boldsymbol{l}_1\) and \(\mathbf{u}_1^T\). Then, \[\mathbf{A}- \boldsymbol{l}_1\mathbf{u}_1^T = \boldsymbol{l}_2\mathbf{u}_2^T+\cdots+\boldsymbol{l}_n\mathbf{u}_n^T\]
  2. Focus on the first row and first column of the matrix with known entries (idea is from right to left!): \[\mathbf{e}_1^T\mathbf{A}=\sum_{k=1}^n \left(\mathbf{e}_1^T\boldsymbol{l}_k\right) \mathbf{u}_k^T = L_{11}\mathbf{u}_1^T = \mathbf{u}_1^T, \qquad \mathbf{A}\mathbf{e}_1=\sum_{k=1}^n \boldsymbol{l}_k \left(\mathbf{u}_k^T\mathbf{e}_1\right) = U_{11}\boldsymbol{l}_1,\] since \(\boldsymbol{l}_k\) and \(\mathbf{u}_k\) have zeros in their first \(k-1\) entries and \(L_{11}=1\).
  • Pseudocode:

    1. Initialize \(\mathbf{L}\) and \(\mathbf{U}\).
    2. Make a copy of \(\mathbf{A}\). Call it A₁.
    3. \(\mathbf{u}_1^T\) is the first row of A₁.
    4. \(\boldsymbol{l}_1\) is the first column of A₁ divided by \(U_{11}\).
    5. Compute A₂ as A₁ after taking out \(\boldsymbol{l}_1\mathbf{u}_1^T\).
    6. Repeat steps 3 to 5 as if A₂ were A₁. (subscripts 1 become subscripts 2)
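The six pseudocode steps can be turned into code almost line for line. Here is a hypothetical plain-Python sketch (no pivoting, so it assumes nonzero pivots; it mirrors the structure of FNC.lufact() but is not the package’s actual code):

```python
def lufact_sketch(A):
    """LU factorization by repeated outer-product removal (no pivoting)."""
    n = len(A)
    Ak = [row[:] for row in A]            # step 2: working copy of A
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        U[k] = Ak[k][:]                   # step 3: u_k^T is row k of A_k
        for i in range(n):
            L[i][k] = Ak[i][k] / U[k][k]  # step 4: l_k is column k over U_kk
        for i in range(n):                # step 5: A_{k+1} = A_k - l_k u_k^T
            for j in range(n):
                Ak[i][j] -= L[i][k] * U[k][j]
    return L, U

L, U = lufact_sketch([[2.0, 1.0], [4.0, 5.0]])
print(L)  # [[1.0, 0.0], [2.0, 1.0]]
print(U)  # [[2.0, 1.0], [0.0, 3.0]]
```

The entries of L above the diagonal come out as exact zeros because by the time column \(k\) is processed, rows \(1,\ldots,k-1\) of the working matrix have already been eliminated.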

What is new in FNC.lufact()?

  • float(copy(A))
  • syntax for selecting \(k\)th row and column from a matrix: A[k,:], A[:,k]
  • decrements: -=
  • But FNC.lufact() does no row pivoting, so it can have low accuracy (see Exercise 2.4.3)

Computational time

  • Each scalar addition, subtraction, multiplication, division, and square root counts as one flop.
  • Pay attention to the dominant component of the total flops.
  • Example: \[\sum_{k=1}^n k = \frac{n\left(n+1\right)}{2}\] but for very large \(n\) the dominant component is the \(n^2/2\) term.
  • Asymptotic notation:

    • Let \(n\) be the problem size.
    • Let \(f\left(n\right)\) and \(g\left(n\right)\) be positive-valued functions.
    • \(f\left(n\right) = O\left(g\left(n\right)\right)\) whenever \(f\left(n\right)/g\left(n\right)\) is bounded above as \(n\to\infty\).
    • \(f\left(n\right)\sim g\left(n\right)\) whenever \(f\left(n\right)/g\left(n\right)\to 1\) as \(n\to\infty\).
  • So \(\sum_{k=1}^n k=O\left(n^2\right)\) and \(\sum_{k=1}^n k \sim n^2/2\).
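Both statements can be checked numerically using only the closed form above; a quick plain-Python check:

```python
def f(n):
    """f(n) = 1 + 2 + ... + n via the closed form n(n+1)/2."""
    return n * (n + 1) // 2

# f(n)/n^2 stays bounded (it tends to 1/2), so f(n) = O(n^2);
# f(n)/(n^2/2) tends to 1, so f(n) ~ n^2/2.
for n in (10, 10**3, 10**6):
    print(n, f(n) / n**2, f(n) / (n**2 / 2))
```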

  • Let \(n\) be the dimension of a square matrix.

    • Inner product: \(\sim 2n\) flops (Exercise 2.5.3)
    • Matrix multiplication: \(\sim 2n^3\) flops (Exercise 2.5.4)
    • Triangular systems of linear equations: \(\sim n^2\) flops
    • LU factorization: \(\sim 2n^3/3\) flops
    • General systems of linear equations: \(\sim 2n^3/3\) flops
  • Observe that the running time \(t\) grows like \(O\left(n^p\right)\) for some power \(p\); we can estimate \(p\) by regressing \(\log t\) on \(\log n\).

using Random;
Random.seed!(1234);
n = 400:400:4000;
t = []
Any[]
for n in n
    A = randn(n,n)  
    time = @elapsed for j in 1:12; lu(A); end
    push!(t,time)
end
pretty_table((n=n,t=t),["size" "time";"" "(sec)"])
┌──────┬───────────┐
│ size │      time │
│      │     (sec) │
├──────┼───────────┤
│  400 │ 0.0382276 │
│  800 │   0.15061 │
│ 1200 │  0.294386 │
│ 1600 │  0.616669 │
│ 2000 │   1.14733 │
│ 2400 │   1.97801 │
│ 2800 │   3.07725 │
│ 3200 │   4.55046 │
│ 3600 │   6.40056 │
│ 4000 │   6.67944 │
└──────┴───────────┘
# throwaway to force compilation
lu(randn(3,3));
Random.seed!(1234);
n = 400:400:4000;
t = [];
for n in n
    A = randn(n,n)  
    time = @elapsed for j in 1:12; lu(A); end
    push!(t,time)
end
pretty_table((n=n,t=t),["size" "time";"" "(sec)"])
┌──────┬───────────┐
│ size │      time │
│      │     (sec) │
├──────┼───────────┤
│  400 │ 0.0176111 │
│  800 │  0.119493 │
│ 1200 │  0.247341 │
│ 1600 │  0.481889 │
│ 2000 │  0.821774 │
│ 2400 │   1.35721 │
│ 2800 │   2.10107 │
│ 3200 │   3.02571 │
│ 3600 │   4.27144 │
│ 4000 │   6.06547 │
└──────┴───────────┘
n <- julia_eval("n")
t <- julia_eval("t")
logn <- log(n)
logt <- log(unlist(t))
lm(logt ~ logn)

Call:
lm(formula = logt ~ logn)

Coefficients:
(Intercept)         logn  
    -18.843        2.471  
sqrt(diag(vcov(lm(logt ~ logn))))
(Intercept)        logn 
 0.40559139  0.05383431 

Things of note in the code:

  • A throwaway call forces compilation first, so the timed runs exclude Julia’s just-in-time compilation overhead.
  • There is a for loop inside @elapsed, so each reported time covers 12 factorizations.
  • The built-in lu() from LinearAlgebra is used rather than FNC.lufact().
  • I set the random seed.
  • Quarto/R Markdown + Julia eccentricities I noticed:
# throwaway to force compilation
lu(randn(3,3));
lu(randn(3,3)); # throwaway to force compilation
LU{Float64, Matrix{Float64}, Vector{Int64}}
L factor:
3×3 Matrix{Float64}:
  1.0       0.0       0.0
 -0.88136   1.0       0.0
  0.29055  -0.371287  1.0
U factor:
3×3 Matrix{Float64}:
 -1.8568  -0.82243   1.08881
  0.0     -1.74805  -0.0414645
  0.0      0.0      -1.11521
  • It seems you cannot type x₁ with tab completion as you would at the Julia console when working in RStudio.