Not so much theory here
Talk about how sparse matrices are stored
Dramatic improvement from using sparse matrices, in both storage and computational time (see the sketch below).
Matrix operations such as matrix multiplication can destroy sparsity.
There are tradeoffs in error and computation time between using a sparse representation of a sparse matrix and a dense representation of the same matrix.
Sparse-aware implementations are needed.
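A minimal sketch of that improvement, using only the standard SparseArrays library (the size and density here are illustrative, not from the book):
using SparseArrays

n = 10_000
A = sprand(n, n, 0.001)    # sparse random matrix, density about 0.1%
F = Matrix(A)              # dense copy of the same matrix

Base.summarysize(A)        # a couple of MB for the sparse representation
Base.summarysize(F)        # 8 bytes * n^2 = 800 MB for the dense one

x = rand(n)
@elapsed A*x               # sparse matvec: cost scales with nnz(A)
@elapsed F*x               # dense matvec: cost scales with n^2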
New in Julia: nnz(), Matrix(), summarysize(), spy(), FNC.sprandsym(), ft_printf(), eigs(), latexstring().
For those with slow internet connections, and those trying to find the jld2 files used in the book, use download() and @load. download("source", "dest") also lets you specify where to save the downloaded file. The path to the jld2 data files has to be completely specified: include the prefix https://tobydriscoll.net/fnc-julia/_static/resources/ before the jld2 file name. The exercise below asks for the lu factorization; similarly do this for qr.
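A minimal sketch, assuming the JLD2 package supplies @load (smallworld.jld2 is the file used in the exercise below):
using JLD2
# fetch the data file from the book's site and save it locally
download("https://tobydriscoll.net/fnc-julia/_static/resources/smallworld.jld2",
    "smallworld.jld2")
@load "smallworld.jld2" A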
# Exercise (a)
@load "smallworld.jld2" A;
m,n = size(A);
plt = plot(layout=(1,3),legend=:none,size=(600,240));
for k in 2:4
    temp = A^(2^(k-1));
    density = nnz(temp) / (m*n);
    # pay attention to the escape characters and how to insert whitespace
    spy!(temp, subplot=k-1, color=:blues,
        title=latexstring("\\textrm{density}\\ \\mathbf{A}^$(2^(k-1)) = $density"))
end
plt
# Exercise (b)
L, U = FNC.lufact(A);
(nnz(sparse(L)), nnz(sparse(U))) ./ (n*m)
## (0.4869, 0.4959)
# Exercise (c)
qr(A)
## SparseArrays.SPQR.QRSparse{Float64, Int64}
## Q factor:
## 100×100 SparseArrays.SPQR.QRSparseQ{Float64, Int64}
## R factor:
## 100×100 SparseMatrixCSC{Float64, Int64} with 2298 stored entries:
## ⎡⠙⢿⣿⣿⣿⡄⠀⡇⡇⡆⠀⡇⡀⢨⠀⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
## ⎢⠀⠀⠙⢿⣿⡧⠀⡇⡇⡇⠀⣧⡇⢸⠀⡇⠀⠀⢠⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⡆⠀⠀⠀⠀⠀⠀⠀⢀⎥
## ⎢⠀⠀⠀⠀⠙⢿⣿⣧⣧⣇⠀⣿⡇⢸⠀⡇⠀⠀⢸⠀⠀⠀⠀⠀⠀⡆⠀⠀⠀⡇⠀⣧⠀⠀⠀⠀⢀⠀⠀⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣿⣧⣼⠀⡇⠀⠀⢸⠀⠀⠀⠀⠀⠀⡇⠀⠀⢠⡇⠀⣿⠀⠀⠀⠀⢸⠀⣴⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⡄⡇⠀⣦⢸⡀⠀⠀⠀⠀⠀⡇⠀⠀⢸⡇⠀⣿⡄⠀⠀⠀⢸⢠⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣇⡇⠀⣿⢸⣷⡄⠀⢠⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⢸⣿⡇⠀⢸⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣧⡀⢸⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣾⢠⢀⡇⠀⠀⢸⣿⠀⣿⡇⡇⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⢸⡇⢠⠀⢸⣿⢠⣿⡇⡇⠀⠀⢸⣸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣷⣾⡀⢸⣿⢸⣿⡇⡇⠀⠀⢸⣿⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣾⣿⢸⣿⡇⡇⠀⠀⢸⣿⣿⣸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣸⣿⡇⡇⠀⢰⣾⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⡇⡇⠀⣾⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⡇⣿⣆⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣿⣿⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⎥
## ⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⎦
## Row permutation:
## 100-element Vector{Int64}:
## 44
## 45
## 46
## 42
## 47
## 41
## 49
## 54
## 50
## 55
## ⋮
## 4
## 5
## 97
## 98
## 100
## 13
## 1
## 6
## 96
## Column permutation:
## 100-element Vector{Int64}:
## 48
## 49
## 47
## 46
## 45
## 50
## 51
## 44
## 52
## 53
## ⋮
## 8
## 9
## 10
## 92
## 93
## 94
## 99
## 7
## 61
This exercise uses eigs().
n = 50; k = 1;
A = FNC.poisson(n) - k^2*I;
A
## 2500×2500 SparseMatrixCSC{Float64, Int64} with 12300 stored entries:
## ⎡⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
## ⎢⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⎥
## ⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⎦
# Exercise (a)
m1, m2 = size(A); nnz(A)/(m1*m2)
## 0.001968
# Exercise (b)
ev, evec = eigs(A, nev = 4, which = :LM);
ev
## 4-element Vector{Float64}:
## 2105.291821573482
## 2102.2965622322154
## 2102.2965622322063
## 2099.301302890972
ev, evec = eigs(A, nev = 4, which = :SM);
ev
## 4-element Vector{Float64}:
## 0.9993676562778431
## 3.994626997515675
## 3.9946269975161233
## 6.989886338755049
# Exercise (c)
k = 2.5;
A = FNC.poisson(n) - k^2*I;
# An alternative might be to use the sigma option; it produces a similar answer but one must use which = :LM instead
ev, evec = eigs(A, nev = 100, which = :SM);
ev
## 100-element Vector{Float64}:
## -1.2553730024836676
## -1.255373002484174
## 1.7398863387550316
## 3.724099807219634
## 3.7240998072196883
## -4.250632343722532
## 6.7193591484582855
## 6.719359148458748
## 10.668897239421312
## 10.66889723942192
## ⋮
## 119.1804712065316
## 120.94193849257286
## 120.94193849257296
## 126.10983297559048
## 126.10983297559069
## 126.21965850193233
## 126.21965850193297
## 132.3109839009445
## 132.31098390094473
# Have not seen count() in the book yet.
count(<(0), ev)
## 3
We may not necessarily be interested in ALL eigenvalues.
Use targeted approaches instead: power iteration and inverse iteration
Setup for power iteration:
Ask what happens to \(\mathbf{A}^k \mathbf{x}\): \[\begin{split} \mathbf{A}^k\mathbf{x} &= \mathbf{V}\mathbf{D}^k \mathbf{z} = \mathbf{V}\begin{bmatrix} \lambda_1^kz_1 \\[0.5ex] \lambda_2^kz_2 \\ \vdots \\ \lambda_n^kz_n \end{bmatrix} \\ &= \lambda_1^k \left[ z_1 \mathbf{v}_{1} + z_2 \left(\frac{\lambda_2}{\lambda_1}\right) ^k \mathbf{v}_{2} + \cdots + z_n \left(\frac{\lambda_n}{\lambda_1}\right)^k \mathbf{v}_{n} \right].\end{split}\]
Key result for power iteration: \[\left\| \frac{ \mathbf{A}^k\mathbf{x}}{\lambda_1^k} - z_1\mathbf{v}_1\right\| \le |z_2|\cdot\left|\frac{\lambda_2}{\lambda_1}\right| ^k \| \mathbf{v}_{2} \| + \cdots + |z_n|\cdot\left|\frac{\lambda_n}{\lambda_1}\right|^k \| \mathbf{v}_{n} \| \rightarrow 0 \text{ as $k\rightarrow \infty$}.\]
Convergence result for power iteration \[\frac{\beta_{k+1} - \lambda_1}{\beta_{k}-\lambda_1} \rightarrow r_2 = \frac{\lambda_2}{\lambda_1} \quad \text{as } k\rightarrow \infty\]
Inverse iteration builds on power iteration.
An extra feature of inverse iteration is the possibility of introducing a user-specified shift \(s\).
# Eigenvalues from Demo 8.3.4
# s=0.7
@. abs((1,-0.75,0.6,-0.4,0)-0.7)^(-1)
## (3.333333333333333, 0.6896551724137931, 10.000000000000002, 0.9090909090909091, 1.4285714285714286)
# s=0.1
@. abs((1,-0.75,0.6,-0.4,0)-0.1)^(-1)
## (1.1111111111111112, 1.1764705882352942, 2.0, 2.0, 10.0)
# s=-0.9
@. abs((1,-0.75,0.6,-0.4,0)-(-0.9))^(-1)
## (0.5263157894736842, 6.666666666666666, 0.6666666666666666, 2.0, 1.1111111111111112)
Convergence is best when \(s\) is close to the target eigenvalue.
The algorithms behind power iteration and inverse iteration are relatively simple.
The only new things from Julia are normalize() and findmax().
The steps for power iteration are:
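A minimal sketch of those steps, written to mirror what FNC.poweriter does (a random starting vector, then repeated multiply-and-normalize; normalize comes from LinearAlgebra):
using LinearAlgebra

function powersketch(A, numiter)
    n = size(A, 1)
    x = normalize(randn(n), Inf)    # random starting vector with ∞-norm 1
    β = zeros(numiter)              # running eigenvalue estimates
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))         # index of the largest-magnitude entry
        β[k] = y[m] / x[m]          # current estimate of the dominant eigenvalue
        x = y / y[m]                # renormalize for the next iteration
    end
    return β, x
end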
If you use inverse iteration, you have to modify the steps as follows:
Strictly speaking, inverse iteration does not require solving for \(\mathbf{y}\) repeatedly as \(\mathbf{A}\) stays fixed! So improve algorithm by storing a factorization of \(\mathbf{A}\).
If you use inverse iteration with a user-specified shift \(s\), you have to modify the steps as follows:
Again, since \(s\) is specified only once, \(\mathbf{A}-s\mathbf{I}\) stays fixed!
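A sketch of those shifted steps, factoring \(\mathbf{A}-s\mathbf{I}\) once outside the loop (this is essentially the invitermod function written for the exercises further below):
using LinearAlgebra

function invitersketch(A, s, numiter)
    n = size(A, 1)
    x = normalize(randn(n), Inf)
    β = zeros(numiter)
    fact = lu(A - s*I)         # factor once; A and s never change
    for k in 1:numiter
        y = fact \ x           # only triangular solves inside the loop
        m = argmax(abs.(y))
        β[k] = x[m]/y[m] + s   # invert and unshift the eigenvalue estimate
        x = y / y[m]
    end
    return β, x
end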
Things are different with dynamic shifting.
A = [1.1 1; 0.1 2.4];
β,x = FNC.poweriter(A, 20);
true1 = 2.472841614740048;
true2 = 1.027158385259952;
for k in 2:20
    @show (β[k] - β[k-1], (β[k-1]-true1)*(true2-true1)/true1)
end
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.032945282730319825, 0.033510261343985916)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.014148863119438637, 0.014249649402727224)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00596024520555849, 0.005977880597941828)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.0024903159116682616, 0.0024933765820342583)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.0010369480833101186, 0.0010374774544289079)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00043116100686280134, 0.0004312524358320103)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00017916946441687287, 0.00017918524595950816)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (7.443570483234296e-5, 7.443842821006927e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (3.092103852297967e-5, 3.092150844084213e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.2844237715636808e-5, 1.2844318796215859e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (5.335251679117192e-6, 5.335265668725579e-6)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (2.216145624345245e-6, 2.2161480383932107e-6)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (9.205350992580463e-7, 9.205355154947661e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (3.8236828103777043e-7, 3.823683529832072e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.588265625684926e-7, 1.5882657493870146e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (6.597270907349184e-8, 6.597271137842712e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (2.7403463942476947e-8, 2.74034643460236e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.138273386658284e-8, 1.138273418071293e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (4.7281116799524625e-9, 4.728111677409651e-9)
Describe what happens during power iteration using the given matrix and starting vector.
A simple but interesting exercise meant to look into a violation of the dominant eigenvalue condition: \[|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n|\]
The given matrix has eigenvalues 1 and -1.
Adjust the code to accept a user-specified starting vector.
The code below foreshadows the basis of Krylov methods (compare to the first part of Section 8.4).
# Modify function to have user-specified starting vector
function poweriterinit(A,numiter,init)
    n = size(A,1)
    x = init
    β = zeros(numiter)
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))
        β[k] = y[m]/x[m]
        x = y/y[m]
    end
    return β,x
end
## poweriterinit (generic function with 1 method)
A = [0 1; 1 0];
x = [0.4; 0.7];
evechist = [x zeros(2,7)];
for m in 1:7
    evechist[:,m+1] = A*evechist[:,m]
end
evechist
## 2×8 Matrix{Float64}:
## 0.4 0.7 0.4 0.7 0.4 0.7 0.4 0.7
## 0.7 0.4 0.7 0.4 0.7 0.4 0.7 0.4
β,fev = poweriterinit(A, 20, x)
## ([1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998], [0.5714285714285715, 1.0])
This exercise concerns the matrix FNC.poisson(n). Make a spy plot of \(\mathbf{A}\). What is the density of \(\mathbf{A}\)? Most rows have the same number of nonzeros; find this number. Then use eigs for \(n=10,15,20,25\).
# Exercise (a)
n = 10;
A = FNC.poisson(n);
spy(A)
nnz(A)/(n^2*n^2)
## 0.046
for k in 1:10
    @show nnz(A[k,:])
end
## nnz(A[k, :]) = 3
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 3
# Exercise (b)
for n in 10:5:25
    A = FNC.poisson(n);
    ev, evec = eigs(A, nev = 1, which=:LM);
    @show (n, ev)
end
## (n, ev) = (10, [96.09246335392518])
## (n, ev) = (15, [205.5122013714175])
## (n, ev) = (20, [355.4648631179063])
## (n, ev) = (25, [545.9473932882664])
# Exercise (c)
plt = plot(layout=(2,2), legend=:none);
for j in 1:4
    n = 5*j+5;
    A = FNC.poisson(n);
    β,x = FNC.poweriter(A,100);
    ev, evec = eigs(A, nev = 1, which=:LM);
    err = @. ev - β
    plot!(0:99,subplot=j,abs.(err),m=:o,xlabel="k",
        yaxis=(L"|\beta_k-\lambda_1|",:log10), title=latexstring("n=$n"))
end
plt
For symmetric matrices, the Rayleigh quotient converts an \(O(\epsilon)\) eigenvector estimate into an \(O(\epsilon^2)\) eigenvalue estimate.
Create a new function powersym that uses the Rayleigh quotient to produce the entries of β. Run FNC.poweriter and the new powersym on the matrix matrixdepot("fiedler",100), and plot convergence curves. Avoid a log scale here. Fix the starting vector, since the starting vectors in FNC.poweriter are random.
function powersym(A,numiter,init)
    n = size(A,1)
    x = init
    β = zeros(numiter)
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))
        β[k] = dot(x, y)/dot(x, x)
        x = y/y[m]
    end
    return β,x
end
## powersym (generic function with 1 method)
A = matrixdepot("fiedler",100);
β,x = poweriterinit(A, 100, normalize(fill(1, 100), Inf));
ev, evec = eigs(A, nev = 1, which=:LM);
err = @. ev - β
## 100-element Vector{Float64}:
## -1476.3155787507021
## 140.68442124929743
## -25.915578750702025
## 4.696079724551964
## -0.8633200561712329
## 0.15863178870995398
## -0.02916308717476568
## 0.005361340315175767
## -0.0009856477900029859
## 0.00018120501181329018
## ⋮
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -1.3642420526593924e-12
## -9.094947017729282e-13
## -9.094947017729282e-13
p1 = plot(0:99,abs.(err),m=:o,xlabel="k",ylabel=(L"|\beta_k-\lambda_1|"), title="Power iteration");
β,x = powersym(A, 100, normalize(fill(1, 100), Inf));
err = @. ev - β
## 100-element Vector{Float64}:
## 140.68442124929788
## 4.696079724553329
## 0.15863178870949923
## 0.005361340314266272
## 0.00018120501272278489
## 6.1244581956998445e-6
## 2.069969013973605e-7
## 6.996742740739137e-9
## 2.346496330574155e-10
## 7.73070496506989e-12
## ⋮
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## 0.0
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
p2 = plot(0:99,abs.(err),m=:o,xlabel="k",ylabel=(L"|\beta_k-\lambda_1|"), title="Rayleigh mod");
plot(p1, p2, layout=(1,2),legend=:none)
# Exercises (a) and (b)
plt = plot(layout=(2,2), legend=:none);
for j in 1:4
    n = 5*j+5;
    A = FNC.poisson(n);
    β,x = FNC.inviter(A, 0, 50);
    ev, evec = eigs(A, nev = 1, which=:SM);
    err = @. ev - β
    plot!(0:49,subplot=j,abs.(err),m=:o,xlabel="k",
        yaxis=(L"|\beta_k-\lambda_1|",:log10), title=latexstring("n=$n"))
end
plt
# Exercise (c)
n = 25; A = FNC.poisson(n);
β,v = FNC.inviter(A, 0, 50);
surface(reshape(v,n,n))
# Exercise (a)
# Modified inviter with fixed initial condition
function invitermod(A,s,numiter)
    n = size(A,1)
    x = normalize(fill(1, n), Inf)
    β = zeros(numiter)
    fact = lu(A - s*I)
    for k in 1:numiter
        y = fact\x
        normy,m = findmax(abs.(y))
        β[k] = x[m]/y[m] + s
        x = y/y[m]
    end
    return β,x
end
## invitermod (generic function with 1 method)
# Modified inviter with fixed initial condition and dynamic shifting
function inviterdyn(A,sinit,numiter)
    n = size(A,1)
    x = normalize(fill(1, n), Inf)
    β = zeros(numiter)
    s = sinit
    for k in 1:numiter
        fact = lu(A - s*I)
        y = fact\x
        normy,m = findmax(abs.(y))
        β[k] = x[m]/y[m] + s
        x = y/y[m]
        s = β[k]
    end
    return β,x,s
end
## inviterdyn (generic function with 1 method)
# Test
n = 25; A = FNC.poisson(n);
β,x = invitermod(A, 0, 50);
β
## 50-element Vector{Float64}:
## 1.3769149936297793
## 1.8356457751526984
## 1.9600789508000607
## 1.9894647227710935
## 1.995877516721149
## 1.9972213065749524
## 1.9974973885430662
## 1.9975535792896675
## 1.9975649648693654
## 1.9975672669286766
## ⋮
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
β,x = inviterdyn(A, 0, 50);
β
## 50-element Vector{Float64}:
## 1.3769149936297793
## 1.9385601558548915
## 1.9970603088150922
## 1.9975678147111608
## 1.9975678494961906
## 1.9975678494961397
## 1.997567849496145
## 1.9975678494961504
## 1.9975678494961557
## 1.997567849496161
## ⋮
## 1.997567849496163
## 1.9975678494961684
## 1.9975678494961175
## 1.9975678494961229
## 1.9975678494961282
## 1.9975678494961335
## 1.9975678494961389
## 1.9975678494961442
## 1.9975678494961495
# Exercise (b)
A = diagm(0=>(1:100).^2, 1=>rand(99));
β,x = invitermod(A, 920, 50);
# Limit to 4 iterations: once the dynamic shift lands exactly on the eigenvalue, the factorization of A - s*I fails
β,x = invitermod(A, 920, 4);
βd,x = inviterdyn(A, 920, 4);
err = @. log10(abs(900-β));
errd = @. log10(abs(900-βd));
FNC.pretty_table((k=1:4,err=err,errd=errd),["Iteration number" "Log10 Err" "Log10 Err Dynamic"])
## ┌──────────────────┬───────────┬───────────────────┐
## │ Iteration number │ Log10 Err │ Log10 Err Dynamic │
## ├──────────────────┼───────────┼───────────────────┤
## │ 1 │ -1.46103 │ -1.46103 │
## │ 2 │ -1.77468 │ -4.70917 │
## │ 3 │ -2.08686 │ -11.2039 │
## │ 4 │ -2.39894 │ -Inf │
## └──────────────────┴───────────┴───────────────────┘
# Exercise (c)
β,x = invitermod(A, 420.5, 4);
βd,x = inviterdyn(A, 420.5, 4);
err = @. log10(abs(441-β));
errd = @. log10(abs(441-βd));
FNC.pretty_table((k=1:4,err=err,errd=errd),["Iteration number" "Log10 Err" "Log10 Err Dynamic"])
## ┌──────────────────┬───────────┬───────────────────┐
## │ Iteration number │ Log10 Err │ Log10 Err Dynamic │
## ├──────────────────┼───────────┼───────────────────┤
## │ 1 │ -1.01805 │ -1.01805 │
## │ 2 │ 1.60219 │ -3.66881 │
## │ 3 │ -2.00074 │ -8.97167 │
## │ 4 │ 1.6022 │ -Inf │
## └──────────────────┴───────────┴───────────────────┘
Setup:
The name Krylov is used in many senses:
For systems of linear equations \(\mathbf{A}\mathbf{x}=\mathbf{b}\), the idea behind the dimension reduction can be seen from: \[\min_{\mathbf{x}\in\mathcal{K}_m} \| \mathbf{A}\mathbf{x}-\mathbf{b} \| = \min_{\mathbf{z}\in\mathbb{C}^m} \| \mathbf{A}(\mathbf{K}_m\mathbf{z})-\mathbf{b} \| = \min_{\mathbf{z}\in\mathbb{C}^m} \| (\mathbf{A}\mathbf{K}_m)\mathbf{z}-\mathbf{b} \|.\]
Krylov matrices tend to have columns that become more and more nearly linearly dependent as \(m\) grows.
Solution to the instability of \(\mathbf{K}_m\)?
The solution leads to the idea behind Arnoldi iteration.
Of course, the iteration has to start somewhere.
Arnoldi iteration also leads to a crucial identity used after Section 8.4. In particular, \[\begin{split} \mathbf{A}\mathbf{Q}_m &= \begin{bmatrix} \mathbf{A}\mathbf{q}_1 & \cdots & \mathbf{A}\mathbf{q}_m \end{bmatrix}\\ & = \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_{m+1} \end{bmatrix}\: \begin{bmatrix} H_{11} & H_{12} & \cdots & H_{1m} \\ H_{21} & H_{22} & \cdots & H_{2m} \\ & H_{32} & \ddots & \vdots \\ & & \ddots & H_{mm} \\ & & & H_{m+1,m} \end{bmatrix} = \mathbf{Q}_{m+1} \mathbf{H}_m, \end{split}\]
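A minimal sketch of the iteration behind that identity (modified Gram–Schmidt applied to each new Krylov direction; FNC.arnoldi is the book's version, used in the exercises below):
using LinearAlgebra

function arnoldisketch(A, u, m)
    n = length(u)
    Q = zeros(n, m+1)
    H = zeros(m+1, m)
    Q[:,1] = u / norm(u)             # normalized seed vector
    for j in 1:m
        v = A*Q[:,j]                 # next Krylov direction
        for i in 1:j
            H[i,j] = dot(Q[:,i], v)  # component along an existing basis vector
            v -= H[i,j]*Q[:,i]       # subtract that component
        end
        H[j+1,j] = norm(v)
        Q[:,j+1] = v / H[j+1,j]      # breakdown if this norm is zero
    end
    return Q, H
end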
Calculate 2-norm condition numbers \(\kappa(\mathbf{K}_m)\) for \(m=1,\ldots,10\). Use a vector of all ones as the Krylov seed.
# (a)
λ = @. 10 + (1:100);
A = triu(rand(100,100),1) + diagm(λ);
e = fill(1, 100);
K = [e zeros(100,9)];
for m in 2:10
    v = A*K[:,m-1];
    K[:,m] = v/norm(v);
    @show (m, cond(K[:,1:m],2))
end
## (m, cond(K[:, 1:m], 2)) = (2, 60.52427909961785)
## (m, cond(K[:, 1:m], 2)) = (3, 537.8062210977873)
## (m, cond(K[:, 1:m], 2)) = (4, 3617.771740613321)
## (m, cond(K[:, 1:m], 2)) = (5, 24604.52268268676)
## (m, cond(K[:, 1:m], 2)) = (6, 166723.84596720053)
## (m, cond(K[:, 1:m], 2)) = (7, 1.0109854318238025e6)
## (m, cond(K[:, 1:m], 2)) = (8, 6.579238635058788e6)
## (m, cond(K[:, 1:m], 2)) = (9, 4.439105993720062e7)
## (m, cond(K[:, 1:m], 2)) = (10, 3.1488560063742316e8)
# (b)
A = spdiagm(0 => fill(-2,100), 1=> fill(1,99), -1 => fill(1, 99));
e = fill(1, 100);
K = [e zeros(100,9)];
for m in 2:10
    v = A*K[:,m-1];
    K[:,m] = v/norm(v);
    @show (m, cond(K[:,1:m],2))
end
## (m, cond(K[:, 1:m], 2)) = (2, 10.103565741473805)
## (m, cond(K[:, 1:m], 2)) = (3, 31.241956559277746)
## (m, cond(K[:, 1:m], 2)) = (4, 118.59280125956582)
## (m, cond(K[:, 1:m], 2)) = (5, 512.6146509071374)
## (m, cond(K[:, 1:m], 2)) = (6, 2376.2500907557546)
## (m, cond(K[:, 1:m], 2)) = (7, 11500.9753932114)
## (m, cond(K[:, 1:m], 2)) = (8, 57324.73114698661)
## (m, cond(K[:, 1:m], 2)) = (9, 291891.3007260457)
## (m, cond(K[:, 1:m], 2)) = (10, 1.5105902195492175e6)
# (c)
# Does not work! The all-ones seed vector lies in the null space of A, so the first Krylov step returns the zero vector.
A = spdiagm(0 => fill(-2,200), 1=> fill(1,199), -1 => fill(1, 199));
A[200,1] = 1; A[1,200]=1;
e = fill(1, 200);
K = [e zeros(200,9)];
norm(A*K[:,1])
## 0.0
Apply Arnoldi iteration to \(\mathbf{A}=\displaystyle \begin{bmatrix} 2& 1& 1& 0\\ 1 &3 &1& 0\\ 0& 1& 3& 1\\ 0& 1& 1& 2 \end{bmatrix}\) to show that the results can depend strongly on the initial vector \(\mathbf{u}\).
A = [2 1 1 0; 1 3 1 0; 0 1 3 1; 0 1 1 2];
Q, H=FNC.arnoldi(A, [1; 0; 0; 0], 10);
Q
## 4×11 Matrix{Float64}:
## 1.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0
## 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
## 0.0 0.0 0.707107 0.707107 -0.743294 0.840297 0.542127 0.813733
## 0.0 0.0 0.707107 -0.707107 -0.668965 0.542127 -0.840297 0.581238
H
## 11×10 Matrix{Float64}:
## 2.0 1.0 0.707107 0.707107 … 0.840297 0.542127
## 1.0 3.0 0.707107 0.707107 0.840297 0.542127
## 0.0 1.41421 3.5 0.5 3.52674 -0.249172
## 0.0 0.0 0.5 1.5 0.805017 1.36086
## 0.0 0.0 0.0 2.9873e-15 5.52899e-16 2.6077e-15
## 0.0 0.0 0.0 0.0 … -4.95133e-17 -2.55819e-16
## 0.0 0.0 0.0 0.0 1.11057e-31 3.89754e-31
## 0.0 0.0 0.0 0.0 -3.24255e-33 -4.53957e-33
## 0.0 0.0 0.0 0.0 -1.59133e-47 -5.51771e-47
## 0.0 0.0 0.0 0.0 1.94743e-48 3.48682e-48
## 0.0 0.0 0.0 0.0 … 0.0 5.22777e-63
Q, H=FNC.arnoldi(A, fill(1,4), 10);
Q
## 4×11 Matrix{Float64}:
## 0.5 -0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 -0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
H
## 11×10 Matrix{Float64}:
## 4.5 1.5 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 1.5 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN
The iteration fails at the second step because \(\mathbf{A}\mathbf{q}_2\) lies in the column space of the first two columns of Q; subtracting the projection leaves the zero vector, and normalizing it produces NaN.
Arnoldi iteration can be used to find eigenvalues approximately
Solve \(\mathbf{A}\mathbf{x}=\lambda\mathbf{x}\) over \(\mathcal{K}_m\).
Show that \[\mathbf{Q}_m^* \mathbf{A} \mathbf{Q}_m = \tilde{\mathbf{H}}_m \] where \(\tilde{\mathbf{H}}_m\) is the upper Hessenberg matrix resulting from deleting the last row of \(\mathbf{H}_m\). What is the size of this matrix?
Show that \(\tilde{\mathbf{H}}_m\mathbf{z} \approx \lambda\mathbf{z}\).
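A quick numerical check of the first identity, assuming FNC.arnoldi returns \(\mathbf{Q}\) of size \(n\times(m+1)\) and \(\mathbf{H}\) of size \((m+1)\times m\) as in the calls below:
using LinearAlgebra
n, m = 100, 10;
A = randn(n, n);
Q, H = FNC.arnoldi(A, randn(n), m);
# Q_m' * A * Q_m should reproduce H_m with its last row deleted
norm(Q[:,1:m]'*A*Q[:,1:m] - H[1:m, :])    # should be near machine epsilon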
Apply Arnoldi iteration to the matrix in Demo 8.4.3. Keep track of the error between the largest of those values (in magnitude) and the largest eigenvalue of \(\mathbf{A}\). Make a log-linear graph of the error as a function of the iteration number.
# eigs on the Hessenberg matrix can return complex values, hence the abs() below
λ = @. 10 + (1:100);
A = triu(rand(100,100),1) + diagm(λ);
u = randn(100);
abserr = [];
for m in 3:42
    Q, H = FNC.arnoldi(A, u, m);
    maxev, evec = eigs(H[1:m, 1:m], nev=1, which=:LM);
    err = abs(110-reduce(vcat, maxev));
    push!(abserr, err);
    @show (m, err)
end
## (m, err) = (3, 10.30476902007355)
## (m, err) = (4, 7.624138284390398)
## (m, err) = (5, 6.241967450768371)
## (m, err) = (6, 4.591295962952813)
## (m, err) = (7, 3.564663615901864)
## (m, err) = (8, 3.077658570265342)
## (m, err) = (9, 2.7246517984362697)
## (m, err) = (10, 2.7264129088965348)
## (m, err) = (11, 2.703391095534073)
## (m, err) = (12, 2.7614422770389524)
## (m, err) = (13, 3.492215696318954)
## (m, err) = (14, 1.997635406108433)
## (m, err) = (15, 0.4304597119436835)
## (m, err) = (16, 0.11525073828454424)
## (m, err) = (17, 0.13904342298243932)
## (m, err) = (18, 0.07172341681720695)
## (m, err) = (19, 0.026237325059994987)
## (m, err) = (20, 0.06009362265702123)
## (m, err) = (21, 0.06540846121528432)
## (m, err) = (22, 0.06146818450545766)
## (m, err) = (23, 0.03744358778089918)
## (m, err) = (24, 0.031755935749473)
## (m, err) = (25, 0.029159286096756887)
## (m, err) = (26, 0.026075217475565182)
## (m, err) = (27, 0.021711300320092164)
## (m, err) = (28, 0.019309287405448572)
## (m, err) = (29, 0.02017844186224238)
## (m, err) = (30, 0.01897714284304186)
## (m, err) = (31, 0.014297987935265155)
## (m, err) = (32, 0.00997514548848244)
## (m, err) = (33, 0.0071441668491729615)
## (m, err) = (34, 0.003748673901995403)
## (m, err) = (35, 0.002990287449904372)
## (m, err) = (36, 0.003324325613931478)
## (m, err) = (37, 0.0027973247574237803)
## (m, err) = (38, 0.0021182900585046127)
## (m, err) = (39, 0.001627673277255326)
## (m, err) = (40, 0.0009376686559363634)
## (m, err) = (41, 0.00029249026630395747)
## (m, err) = (42, 9.171003235053377e-5)
plot(3:42, abserr, m=:o, title="Convergence",
    xlabel="Iteration number", yaxis=(L"|\lambda_m-110|", :log10))