FNC Sections 2.1 to 2.5

Andrew Pua

2023-08-24

Learning objectives

  • Learn Julia through solving systems of linear equations numerically
  • Get a sense of numerical issues arising from solving systems of linear equations
  • Learn a measure of computational time

Why bother with systems of linear equations?

  • Motivating example: Polynomial interpolation (this is a bit different from polynomial fitting)

  • Numerical issues not encountered when solving by hand:

    • Near non-invertibility can create problems.
    • Subtractive cancellation shows up here too.

Key theory

  • Let \(\mathbf{A}\) be an \(n\times n\) matrix, \(\mathbf{x}\) and \(\mathbf{b}\) be \(n\times 1\) vectors.
  • Task of chapter: Solve for \(\mathbf{x}\) in \(\mathbf{A}\mathbf{x}=\mathbf{b}\).
  • Easy from the armchair: \(\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}\)
  • In practice: Takes a lot of time to figure out \(\mathbf{A}^{-1}\) even by hand

A way out

  • Wouldn’t it be easier to solve a linear system if there were zeros in strategic places?
  • Example: \[x_1=0,\ x_1+x_2=1\]
  • In matrices: \[\begin{bmatrix}1 & 0 \\ 1 & 1\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix}\]

A way out, more generally

  • Leads to the ideas of forward substitution and backward substitution!
  • Find an \(\mathbf{LU}\) factorization of \(\mathbf{A}\) so that \(\mathbf{A}=\mathbf{LU}\).
  • \(\mathbf{L}\): unit lower triangular matrix; \(\mathbf{U}\): upper triangular matrix.
  • \(\mathbf{Ax}=\mathbf{b} \implies \mathbf{L}\underbrace{\mathbf{Ux}}_{\mathbf{z}}=\mathbf{b}\).
  • Solve \(\mathbf{L}\mathbf{z}=\mathbf{b}\) and find \(\mathbf{z}\) first.
  • Then solve \(\mathbf{Ux}=\mathbf{z}\) and find \(\mathbf{x}\).
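The two-step solve is short enough to sketch in full. Below is a minimal plain-Python rendering of the idea (the helper names forward_sub and backward_sub are illustrative only; FNC supplies Julia equivalents such as FNC.backsub()):

```python
# Minimal sketch: solve A x = b given A = L U, assuming nonzero diagonals.
# These helpers are illustrative, not part of the FNC package.

def forward_sub(L, b):
    """Solve L z = b for lower triangular L, top row first."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * z[j] for j in range(i))
        z[i] = (b[i] - s) / L[i][i]
    return z

def backward_sub(U, z):
    """Solve U x = z for upper triangular U, bottom row first."""
    n = len(z)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (z[i] - s) / U[i][i]
    return x

# The 2x2 example from the previous slide: x1 = 0, x1 + x2 = 1.
L = [[1.0, 0.0], [1.0, 1.0]]
print(forward_sub(L, [0.0, 1.0]))  # [0.0, 1.0]
```

Given both factors, solving \(\mathbf{Ax}=\mathbf{b}\) is then just backward_sub(U, forward_sub(L, b)).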

Aside: Weaving Julia in Rmd/qmd

  • Install Julia and the book-specific package FundamentalsNumericalComputation using the instructions here. Take note of the path to the Julia binary (find a folder named bin).
  • Install the latest version of the R package JuliaCall, otherwise, there will be errors described here. Below you will find a code chunk for installing JuliaCall.
install.packages('devtools') # install if you do not have this package
devtools::install_github('Non-Contradiction/JuliaCall')
  • Point to the Julia binary to ensure that Julia can work within R.
if (!require('JuliaCall')) stop("You do not have JuliaCall. It can be installed using the following command: devtools::install_github('Non-Contradiction/JuliaCall')")
# change this for your situation
julia_setup(JULIA_HOME = "/home/apua/julia-1.9.2/bin")
  • Example of a Julia code chunk:
a = sqrt(2)
1.4142135623730951
  • Example of a Julia command inside an R code chunk:
julia_command("a = sqrt(2);")
julia_eval("a")
[1] 1.414214

Learn Julia through Exercise 2.1.2

Problem: Let \(c_1,\ldots,c_4\) be unknown constants. Find a cubic polynomial in \(t\), i.e., \[p\left(t\right)=c_1+c_2t+c_3t^2+c_4t^3\] so that \[p\left(-1\right)=-2,\ p^\prime\left(-1\right)=1,\ p\left(1\right)=0,\ p^\prime\left(1\right)=-1.\]

  • By hand: You have the following system of 4 equations in 4 unknowns \[\underbrace{\begin{bmatrix}1 & -1 & 1 & -1 \\ 0& 1 & -2 & 3 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{bmatrix}}_{\mathbf{A}}\underbrace{\begin{bmatrix}c_1\\ c_2\\ c_3\\ c_4\end{bmatrix}}_{\mathbf{x}}=\underbrace{\begin{bmatrix} -2\\1\\0\\-1 \end{bmatrix}}_{\mathbf{b}}\] After some algebra, we get \[c_1=-1/2,\ c_2=3/2,\ c_3=-1/2,\ c_4 = -1/2.\]
  • Using Julia:
b = [-2; 1; 0; -1] # OR b = [-2, 1, 0, -1]
4-element Vector{Int64}:
 -2
  1
  0
 -1
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3] 
4×4 Matrix{Int64}:
 1  -1   1  -1
 0   1  -2   3
 1   1   1   1
 0   1   2   3
A = [1 -1 1 -1
     0 1 -2 3
     1 1 1 1
     0 1 2 3] # alternative
4×4 Matrix{Int64}:
 1  -1   1  -1
 0   1  -2   3
 1   1   1   1
 0   1   2   3
x = A \ b
4-element Vector{Float64}:
 -0.5
  1.5
 -0.5
 -0.5
b - A*x # comparable to machine precision
4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0

Plots in Julia

Problem: We now have a cubic polynomial with the desired characteristics. Present a visualization.

  • Represent the cubic polynomial in Julia using FNC.Polynomial()
  • (Optional) ditto for the derivative
  • Draw curves over some range
  • Design the visualization
using FundamentalsNumericalComputation
# Desired cubic polynomial
p = FNC.Polynomial(x) 
Polynomial(-0.5 + 1.5*x - 0.5*x^2 - 0.5*x^3)
# Coefficients of derivative of cubic polynomial
d = [x[2]; 2x[3]; 3x[4]]
3-element Vector{Float64}:
  1.5
 -1.0
 -1.5
# Derivative of cubic polynomial
Dp =  FNC.Polynomial(d)
Polynomial(1.5 - 1.0*x - 1.5*x^2)
scatter([-1; 1], [-2; 0], label="Cubic info", legend=:topleft,
    xlabel=L"t", ylabel=L"p", 
    title="Exercise 2.1.2")
trange = range(-1, 1, length = 500);
yvals = p.(trange);
dyvals = Dp.(trange);
plot!(trange, yvals, label = L"p(t)")
scatter!([-1; 1], [1; -1], label="Derivative info", legend=:left, ylabel=L"p,\ p'")
plot!(trange, dyvals, label = L"p'(t)", l=:dash)

What is learned about Julia so far

  • representing vectors and matrices
  • some operations on vectors and matrices
  • . for element-wise (broadcast) computation and for accessing package-specific commands, e.g. FNC.Polynomial() (similar to R’s ::)
  • ! marking functions that mutate their arguments, e.g. plot! adds to an existing plot
  • : for Symbols used as option values (e.g. legend=:topleft) and for row/column access later
  • basic plotting and quite some options

Learn Julia through Exercise 2.3.7

Problem: Consider the following system of linear equations: \[\underbrace{\begin{bmatrix}1 & -1 & 0 & \alpha-\beta & \beta \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 &0 &1 & -1 \\ 0 & 0 & 0 & 0 & 1\end{bmatrix}}_{\mathbf{A}}\underbrace{\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}}_{\mathbf{x}}=\underbrace{\begin{bmatrix}\alpha \\0 \\ 0 \\ 0 \\ 1\end{bmatrix}}_{\mathbf{b}}\]

If we solve for \(\mathbf{x}\) by hand:

  • \(x_1=x_2=\cdots=x_5=1\) is the analytical solution using backward substitution.
  • But there is a big loss of accuracy numerically depending on the relative sizes of \(\alpha\) and \(\beta\).
  • This is another example of subtractive cancellation.

Use Function 2.3.5 (FNC.backsub()) to solve for \(\mathbf{x}\) with \(\alpha=0.1\) and \(\beta=10, 10^2, \ldots, 10^{12}\). Tabulate the values of \(\beta\) and \(|x_1-1|\).

  • Loop over values of \(\beta\)
  • Use element-wise exponentiation over a range 1:12
  • Input the \(\mathbf{A}\) (done in a slightly different way, using diagm() to build the matrix by diagonals)
  • Use FNC.backsub() to solve for \(\mathbf{x}\)
  • Tabulate results
using FundamentalsNumericalComputation
α = 0.1;
β = 10 .^ (1:12);
for β in β
  A = diagm(0 => ones(5), 1 => [-1, -1, -1, -1], 3 => [α - β, 0], 4 => [β])
  x = FNC.backsub(A, [α, 0, 0, 0, 1])
  @show [β abs(x[1] - 1.0)]
end
[β abs(x[1] - 1.0)] = [10.0 4.440892098500626e-16]
[β abs(x[1] - 1.0)] = [100.0 5.773159728050814e-15]
[β abs(x[1] - 1.0)] = [1000.0 2.275957200481571e-14]
[β abs(x[1] - 1.0)] = [10000.0 3.638200851696638e-13]
[β abs(x[1] - 1.0)] = [100000.0 5.820788295807233e-12]
[β abs(x[1] - 1.0)] = [1.0e6 2.3283153183228933e-11]
[β abs(x[1] - 1.0)] = [1.0e7 3.725291186640334e-10]
[β abs(x[1] - 1.0)] = [1.0e8 5.9604645663569045e-9]
[β abs(x[1] - 1.0)] = [1.0e9 2.384185793236071e-8]
[β abs(x[1] - 1.0)] = [1.0e10 3.8146972658470446e-7]
[β abs(x[1] - 1.0)] = [1.0e11 6.1035156250222045e-6]
[β abs(x[1] - 1.0)] = [1.0e12 2.4414062500088818e-5]
using FundamentalsNumericalComputation
AbsAcc = [];
α = 0.1;
β = 10 .^ (1:12);
for β in β
  A = diagm(0 => ones(5), 1 => [-1, -1, -1, -1], 3 => [α - β, 0], 4 => [β])
  x = FNC.backsub(A, [α, 0, 0, 0, 1])
  push!(AbsAcc, abs(x[1] - 1.0))
end
pretty_table((β=β,AbsAcc=AbsAcc),["β" "|x₁-1|"])
┌───────────────┬─────────────┐
│             β │      |x₁-1| │
├───────────────┼─────────────┤
│            10 │ 4.44089e-16 │
│           100 │ 5.77316e-15 │
│          1000 │ 2.27596e-14 │
│         10000 │  3.6382e-13 │
│        100000 │ 5.82079e-12 │
│       1000000 │ 2.32832e-11 │
│      10000000 │ 3.72529e-10 │
│     100000000 │  5.96046e-9 │
│    1000000000 │  2.38419e-8 │
│   10000000000 │   3.8147e-7 │
│  100000000000 │  6.10352e-6 │
│ 1000000000000 │  2.44141e-5 │
└───────────────┴─────────────┘

What’s inside FNC.backsub()?

  • Consider the following system of linear equations: \[\begin{bmatrix}U_{11} & U_{12} & U_{13} & U_{14} \\ 0 & U_{22} & U_{23} & U_{24} \\ 0 & 0& U_{33} & U_{34} \\ 0 & 0 & 0 & U_{44}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\\b_4\end{bmatrix}\]

  • Numerical issues

    • near zero or zero main diagonal entries
    • subtractive cancellation as seen earlier
  • The analytical solution is given by: \[\begin{eqnarray}x_4 &=& b_4/U_{44} \\ x_3 &=& \left(b_3-U_{34}x_4\right)/U_{33} \\ x_2 &=& \left(b_2-U_{23}x_3-U_{24}x_4\right)/U_{22} \\ x_1 &=& \left(b_1-U_{12}x_2-U_{13}x_3-U_{14}x_4\right)/U_{11} \end{eqnarray}\]

  • Look for patterns. Refer to code in Function 2.3.5.
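  • The pattern generalizes to the backward-substitution recurrence implemented by Function 2.3.5: \[x_i = \frac{1}{U_{ii}}\left(b_i - \sum_{j=i+1}^{n} U_{ij}\,x_j\right), \qquad i = n, n-1, \ldots, 1.\]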

What’s new so far?

  • Looping
  • @show
  • => which constructs a Pair object (used to specify diagonals in diagm())
  • push!() which inserts an element into a collection
  • indexing entries of matrices: e.g. U[1, 1] for \(U_{11}\)
  • implementing summations which “look natural”
  • increments: +=

Learn Julia through Exercise 2.4.5

Problem: Construct an \(\mathbf{LU}\) factorization so that \(\mathbf{U}\) is a unit upper triangular matrix instead. Write a function lufact2 that uses FNC.lufact() without modification.

  • \(\mathbf{A}=\mathbf{LU} \implies \mathbf{A}^T=\mathbf{U}^T\mathbf{L}^T\), where \(\mathbf{U}^T\) is unit lower triangular and \(\mathbf{L}^T\) is upper triangular.
  • So factor \(\mathbf{A}^T\) with FNC.lufact(): its unit lower triangular factor is \(\mathbf{U}^T\) and its upper triangular factor is \(\mathbf{L}^T\). Undo the transposes to recover \(\mathbf{L}\) and \(\mathbf{U}\).
function lufact2(A)
  UT, LT = FNC.lufact(A')
  return UpperTriangular(LT)', LowerTriangular(UT)'
end
lufact2 (generic function with 1 method)
# same as Exercise 2.1.2
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3]; 
L, U = lufact2(A);
L
4×4 LowerTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}:
 1.0   ⋅    ⋅    ⋅ 
 0.0  1.0   ⋅    ⋅ 
 1.0  2.0  4.0   ⋅ 
 0.0  1.0  4.0  4.0
U
4×4 UpperTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}:
 1.0  -1.0   1.0  -1.0
  ⋅    1.0  -2.0   3.0
  ⋅     ⋅    1.0  -1.0
  ⋅     ⋅     ⋅    1.0
A - L*U
4×4 Matrix{Float64}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Learn Julia through Exercise 2.4.6

Problem: Write a determinant() function using FNC.lufact().

  • \(\mathsf{det}\left(\mathbf{A}\right) = \mathsf{det}\left(\mathbf{L}\right)\mathsf{det}\left(\mathbf{U}\right)\)
  • The determinant of a triangular matrix is equal to the product of the diagonal entries.
  • \(\mathsf{det}\left(\mathbf{L}\right)=1\) because \(\mathbf{L}\) unit lower triangular matrix.
  • So, \(\mathsf{det}\left(\mathbf{A}\right)=\mathsf{det}\left(\mathbf{U}\right)=\prod_{i=1}^n U_{ii}\).
function determinant(A)
  L, U = FNC.lufact(A)
  detU = prod(diag(U))
  return detU
end
determinant (generic function with 1 method)
A = [1 -1 1 -1; 0 1 -2 3; 1 1 1 1; 0 1 2 3]; 
determinant(A), det(A)
(16.0, 16.0)

Looking inside FNC.lufact()

  • An alternative representation of matrix multiplication is as a sum of outer products.
  • Take the LU factorization: \[\mathbf{A}=\mathbf{LU}=\sum_{k=1}^n \boldsymbol{l}_k \mathbf{u}_k^T\]
  • Visually, partition \(\mathbf{L}\) into columns and \(\mathbf{U}\) into rows: \[\mathbf{A}=\underbrace{\begin{bmatrix}\boldsymbol{l}_1 & \boldsymbol{l}_2 & \cdots & \boldsymbol{l}_n\end{bmatrix}}_{\mathbf{L}}\underbrace{\begin{bmatrix}\mathbf{u}_1^T\\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_n^T\end{bmatrix}}_{\mathbf{U}}\]
  • For the LU decomposition, the triangularity imposes a very convenient and useful structure.
  • The \(\boldsymbol{l}\)’s and \(\mathbf{u}\)’s have zeros in them.
  • Our aim is to find what all these \(\boldsymbol{l}\)’s and \(\mathbf{u}\)’s look like, given that we know what \(\mathbf{A}\) looks like.
  • There are two key observations which exploit the triangularity.
  1. Use the outer product representation piece by piece: Suppose you already know \(\boldsymbol{l}_1\) and \(\mathbf{u}_1^T\). Then, \[\mathbf{A}- \boldsymbol{l}_1\mathbf{u}_1^T = \boldsymbol{l}_2\mathbf{u}_2^T+\cdots+\boldsymbol{l}_n\mathbf{u}_n^T\]
  2. Focus on the first row and first column of the matrix with known entries (idea is from right to left!): \[\mathbf{e}_1^T\mathbf{A}=\sum_{k=1}^n \left(\mathbf{e}_1^T\boldsymbol{l}_k\right) \mathbf{u}_k^T = L_{11}\mathbf{u}_1^T = \mathbf{u}_1^T, \qquad \mathbf{A}\mathbf{e}_1=\sum_{k=1}^n \boldsymbol{l}_k \left(\mathbf{u}_k^T\mathbf{e}_1\right) = U_{11}\boldsymbol{l}_1,\] since \(\boldsymbol{l}_k\) and \(\mathbf{u}_k\) have zeros in their first \(k-1\) entries and \(L_{11}=1\).
  • Pseudocode:

    1. Initialize \(\mathbf{L}\) and \(\mathbf{U}\).
    2. Make a copy of \(\mathbf{A}\). Call it A₁.
    3. \(\mathbf{u}_1^T\) is the first row of A₁.
    4. \(\boldsymbol{l}_1\) is the first column of A₁ divided by \(U_{11}\).
    5. Compute A₂ as A₁ after taking out \(\boldsymbol{l}_1\mathbf{u}_1^T\).
    6. Repeat steps 3 to 5 as if A₂ were A₁. (subscripts 1 become subscripts 2)
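The six pseudocode steps can be turned into code almost line for line. Here is a hypothetical plain-Python sketch (no pivoting, so it assumes nonzero pivots; it mirrors the structure of FNC.lufact() but is not the package’s actual code):

```python
def lufact_sketch(A):
    """LU factorization by repeated outer-product removal (no pivoting)."""
    n = len(A)
    Ak = [row[:] for row in A]            # step 2: working copy of A
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        U[k] = Ak[k][:]                   # step 3: u_k^T is row k of A_k
        for i in range(n):
            L[i][k] = Ak[i][k] / U[k][k]  # step 4: l_k is column k over U_kk
        for i in range(n):                # step 5: A_{k+1} = A_k - l_k u_k^T
            for j in range(n):
                Ak[i][j] -= L[i][k] * U[k][j]
    return L, U

L, U = lufact_sketch([[2.0, 1.0], [4.0, 5.0]])
print(L)  # [[1.0, 0.0], [2.0, 1.0]]
print(U)  # [[2.0, 1.0], [0.0, 3.0]]
```

The entries of L above the diagonal come out as exact zeros because by the time column \(k\) is processed, rows \(1,\ldots,k-1\) of the working matrix have already been eliminated.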

What is new in FNC.lufact()?

  • float(copy(A))
  • syntax for selecting \(k\)th row and column from a matrix: A[k,:], A[:,k]
  • decrements: -=
  • But FNC.lufact() does no row pivoting, so it can have low accuracy (see Exercise 2.4.3)

Computational time

  • Each scalar addition, subtraction, multiplication, division, and square root counts as one flop.
  • Pay attention to the dominant component of the total flops.
  • Example: \[\sum_{k=1}^n k = \frac{n\left(n+1\right)}{2}\] but for very large \(n\) the dominant component is the \(n^2/2\) term.
  • Asymptotic notation:

    • Let \(n\) be the problem size.
    • Let \(f\left(n\right)\) and \(g\left(n\right)\) be positive-valued functions.
    • \(f\left(n\right) = O\left(g\left(n\right)\right)\) whenever \(f\left(n\right)/g\left(n\right)\) is bounded above as \(n\to\infty\).
    • \(f\left(n\right)\sim g\left(n\right)\) whenever \(f\left(n\right)/g\left(n\right)\to 1\) as \(n\to\infty\).
  • So \(\sum_{k=1}^n k=O\left(n^2\right)\) and \(\sum_{k=1}^n k \sim n^2/2\).
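Both statements can be checked numerically using only the closed form above; a quick plain-Python check:

```python
def f(n):
    """f(n) = 1 + 2 + ... + n via the closed form n(n+1)/2."""
    return n * (n + 1) // 2

# f(n)/n^2 stays bounded (it tends to 1/2), so f(n) = O(n^2);
# f(n)/(n^2/2) tends to 1, so f(n) ~ n^2/2.
for n in (10, 10**3, 10**6):
    print(n, f(n) / n**2, f(n) / (n**2 / 2))
```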

  • Let \(n\) be the dimension of a square matrix.

    • Inner product: \(\sim 2n\) flops (Exercise 2.5.3)
    • Matrix multiplication: \(\sim 2n^3\) flops (Exercise 2.5.4)
    • Triangular systems of linear equations: \(\sim n^2\) flops
    • LU factorization: \(\sim 2n^3/3\) flops
    • General systems of linear equations: \(\sim 2n^3/3\) flops
  • Observe that the running time \(t\) grows like \(O\left(n^p\right)\) for some power \(p\); we can estimate \(p\) by regressing \(\log t\) on \(\log n\).

using Random;
Random.seed!(1234);
n = 400:400:4000;
t = []
Any[]
for n in n
    A = randn(n,n)  
    time = @elapsed for j in 1:12; lu(A); end
    push!(t,time)
end
pretty_table((n=n,t=t),["size" "time";"" "(sec)"])
┌──────┬───────────┐
│ size │      time │
│      │     (sec) │
├──────┼───────────┤
│  400 │ 0.0382276 │
│  800 │   0.15061 │
│ 1200 │  0.294386 │
│ 1600 │  0.616669 │
│ 2000 │   1.14733 │
│ 2400 │   1.97801 │
│ 2800 │   3.07725 │
│ 3200 │   4.55046 │
│ 3600 │   6.40056 │
│ 4000 │   6.67944 │
└──────┴───────────┘
# throwaway to force compilation
lu(randn(3,3));
Random.seed!(1234);
n = 400:400:4000;
t = [];
for n in n
    A = randn(n,n)  
    time = @elapsed for j in 1:12; lu(A); end
    push!(t,time)
end
pretty_table((n=n,t=t),["size" "time";"" "(sec)"])
┌──────┬───────────┐
│ size │      time │
│      │     (sec) │
├──────┼───────────┤
│  400 │ 0.0176111 │
│  800 │  0.119493 │
│ 1200 │  0.247341 │
│ 1600 │  0.481889 │
│ 2000 │  0.821774 │
│ 2400 │   1.35721 │
│ 2800 │   2.10107 │
│ 3200 │   3.02571 │
│ 3600 │   4.27144 │
│ 4000 │   6.06547 │
└──────┴───────────┘
n <- julia_eval("n")
t <- julia_eval("t")
logn <- log(n)
logt <- log(unlist(t))
lm(logt ~ logn)

Call:
lm(formula = logt ~ logn)

Coefficients:
(Intercept)         logn  
    -18.843        2.471  
sqrt(diag(vcov(lm(logt ~ logn))))
(Intercept)        logn 
 0.40559139  0.05383431 

Things of note in the code:

  • A throwaway call forces compilation first, so the timed runs exclude Julia’s just-in-time compilation overhead.
  • There is a for loop inside @elapsed, so each reported time covers 12 factorizations.
  • The built-in lu() from LinearAlgebra is used rather than FNC.lufact().
  • I set the random seed.
  • Quarto/R Markdown + Julia eccentricities I noticed:
# throwaway to force compilation
lu(randn(3,3));
lu(randn(3,3)); # throwaway to force compilation
LU{Float64, Matrix{Float64}, Vector{Int64}}
L factor:
3×3 Matrix{Float64}:
  1.0       0.0       0.0
 -0.88136   1.0       0.0
  0.29055  -0.371287  1.0
U factor:
3×3 Matrix{Float64}:
 -1.8568  -0.82243   1.08881
  0.0     -1.74805  -0.0414645
  0.0      0.0      -1.11521
  • It seems you cannot type x₁ with tab completion as you would at the Julia console when working in RStudio.