Not so much theory here
Talk about how sparse matrices are stored
Dramatic improvement from using sparse matrices, in both storage and computational time (see the sketch below).
Matrix operations such as matrix multiplication can destroy sparsity.
There are tradeoffs in error and computation time between using a sparse representation of a sparse matrix and a dense representation of the same matrix.
Sparse-aware implementations are needed.
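A minimal sketch of that improvement, using only the standard SparseArrays library (the size and density here are illustrative, not from the book):
using SparseArrays

n = 10_000
A = sprand(n, n, 0.001)    # sparse random matrix, density about 0.1%
F = Matrix(A)              # dense copy of the same matrix

Base.summarysize(A)        # a couple of MB for the sparse representation
Base.summarysize(F)        # 8 bytes * n^2 = 800 MB for the dense one

x = rand(n)
@elapsed A*x               # sparse matvec: cost scales with nnz(A)
@elapsed F*x               # dense matvec: cost scales with n^2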
New in Julia: nnz(), Matrix(), summarysize(), spy(), FNC.sprandsym(), ft_printf(), eigs(), latexstring().
For those with slow internet connections, and those trying to find the jld2 files used in the book, use download() and @load. download("source", "dest") also lets you specify where to save the downloaded file. The path to the jld2 data files has to be completely specified: include the prefix https://tobydriscoll.net/fnc-julia/_static/resources/ before the jld2 file name. The exercise below asks for the lu factorization; similarly do this for qr.
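A minimal sketch, assuming the JLD2 package supplies @load (smallworld.jld2 is the file used in the exercise below):
using JLD2
# fetch the data file from the book's site and save it locally
download("https://tobydriscoll.net/fnc-julia/_static/resources/smallworld.jld2",
    "smallworld.jld2")
@load "smallworld.jld2" A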
# Exercise (a)
@load "smallworld.jld2" A;
m,n = size(A);
plt = plot(layout=(1,3),legend=:none,size=(600,240));
for k in 2:4
    temp = A^(2^(k-1));
    density = nnz(temp) / (m*n);
    # pay attention to the escape characters and how to insert whitespace
    spy!(temp, subplot=k-1, color=:blues,
        title=latexstring("\\textrm{density}\\ \\mathbf{A}^$(2^(k-1)) = $density"))
end
plt
# Exercise (b)
L, U = FNC.lufact(A);
(nnz(sparse(L)), nnz(sparse(U))) ./ (n*m)
## (0.4869, 0.4959)
# Exercise (c)
qr(A)
## SparseArrays.SPQR.QRSparse{Float64, Int64}
## Q factor:
## 100×100 SparseArrays.SPQR.QRSparseQ{Float64, Int64}
## R factor:
## 100×100 SparseMatrixCSC{Float64, Int64} with 2298 stored entries:
## ⎡⠙⢿⣿⣿⣿⡄⠀⡇⡇⡆⠀⡇⡀⢨⠀⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
## ⎢⠀⠀⠙⢿⣿⡧⠀⡇⡇⡇⠀⣧⡇⢸⠀⡇⠀⠀⢠⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⡆⠀⠀⠀⠀⠀⠀⠀⢀⎥
## ⎢⠀⠀⠀⠀⠙⢿⣿⣧⣧⣇⠀⣿⡇⢸⠀⡇⠀⠀⢸⠀⠀⠀⠀⠀⠀⡆⠀⠀⠀⡇⠀⣧⠀⠀⠀⠀⢀⠀⠀⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣷⣿⣧⣼⠀⡇⠀⠀⢸⠀⠀⠀⠀⠀⠀⡇⠀⠀⢠⡇⠀⣿⠀⠀⠀⠀⢸⠀⣴⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⡄⡇⠀⣦⢸⡀⠀⠀⠀⠀⠀⡇⠀⠀⢸⡇⠀⣿⡄⠀⠀⠀⢸⢠⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣇⡇⠀⣿⢸⣷⡄⠀⢠⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⢸⣿⡇⠀⢸⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣧⡀⢸⠀⠀⡇⠀⠀⢸⣿⠀⣿⡇⠀⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣾⢠⢀⡇⠀⠀⢸⣿⠀⣿⡇⡇⠀⠀⢸⢸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⢸⡇⢠⠀⢸⣿⢠⣿⡇⡇⠀⠀⢸⣸⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣷⣾⡀⢸⣿⢸⣿⡇⡇⠀⠀⢸⣿⣿⢸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣾⣿⢸⣿⡇⡇⠀⠀⢸⣿⣿⣸⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣸⣿⡇⡇⠀⢰⣾⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⡇⡇⠀⣾⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⡇⣿⣆⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣿⣿⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⣿⣿⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣿⣿⎥
## ⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⎦
## Row permutation:
## 100-element Vector{Int64}:
## 44
## 45
## 46
## 42
## 47
## 41
## 49
## 54
## 50
## 55
## ⋮
## 4
## 5
## 97
## 98
## 100
## 13
## 1
## 6
## 96
## Column permutation:
## 100-element Vector{Int64}:
## 48
## 49
## 47
## 46
## 45
## 50
## 51
## 44
## 52
## 53
## ⋮
## 8
## 9
## 10
## 92
## 93
## 94
## 99
## 7
## 61
This exercise uses eigs().
n = 50; k = 1;
A = FNC.poisson(n) - k^2*I;
A
## 2500×2500 SparseMatrixCSC{Float64, Int64} with 12300 stored entries:
## ⎡⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
## ⎢⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⠀⠀⎥
## ⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⣄⠀⎥
## ⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢿⣷⎦
# Exercise (a)
m1, m2 = size(A); nnz(A)/(m1*m2)
## 0.001968
# Exercise (b)
ev, evec = eigs(A, nev = 4, which = :LM);
ev
## 4-element Vector{Float64}:
## 2105.291821573482
## 2102.2965622322154
## 2102.2965622322063
## 2099.301302890972
ev, evec = eigs(A, nev = 4, which = :SM);
ev
## 4-element Vector{Float64}:
## 0.9993676562778431
## 3.994626997515675
## 3.9946269975161233
## 6.989886338755049
# Exercise (c)
k = 2.5;
A = FNC.poisson(n) - k^2*I;
# An alternative might be to use the sigma option; it produces a similar answer but one must use which = :LM instead
ev, evec = eigs(A, nev = 100, which = :SM);
ev
## 100-element Vector{Float64}:
## -1.2553730024836676
## -1.255373002484174
## 1.7398863387550316
## 3.724099807219634
## 3.7240998072196883
## -4.250632343722532
## 6.7193591484582855
## 6.719359148458748
## 10.668897239421312
## 10.66889723942192
## ⋮
## 119.1804712065316
## 120.94193849257286
## 120.94193849257296
## 126.10983297559048
## 126.10983297559069
## 126.21965850193233
## 126.21965850193297
## 132.3109839009445
## 132.31098390094473
# Have not seen count() in the book yet.
count(<(0), ev)
## 3
We may not necessarily be interested in ALL eigenvalues.
Use targeted approaches instead: power iteration and inverse iteration
Setup for power iteration:
Ask what happens to \(\mathbf{A}^k \mathbf{x}\): \[\begin{split} \mathbf{A}^k\mathbf{x} &= \mathbf{V}\mathbf{D}^k \mathbf{z} = \mathbf{V}\begin{bmatrix} \lambda_1^kz_1 \\[0.5ex] \lambda_2^kz_2 \\ \vdots \\ \lambda_n^kz_n \end{bmatrix} \\ &= \lambda_1^k \left[ z_1 \mathbf{v}_{1} + z_2 \left(\frac{\lambda_2}{\lambda_1}\right) ^k \mathbf{v}_{2} + \cdots + z_n \left(\frac{\lambda_n}{\lambda_1}\right)^k \mathbf{v}_{n} \right].\end{split}\]
Key result for power iteration: \[\left\| \frac{ \mathbf{A}^k\mathbf{x}}{\lambda_1^k} - z_1\mathbf{v}_1\right\| \le |z_2|\cdot\left|\frac{\lambda_2}{\lambda_1}\right| ^k \| \mathbf{v}_{2} \| + \cdots + |z_n|\cdot\left|\frac{\lambda_n}{\lambda_1}\right|^k \| \mathbf{v}_{n} \| \rightarrow 0 \text{ as $k\rightarrow \infty$}.\]
Convergence result for power iteration \[\frac{\beta_{k+1} - \lambda_1}{\beta_{k}-\lambda_1} \rightarrow r_2 = \frac{\lambda_2}{\lambda_1} \quad \text{as } k\rightarrow \infty\]
Inverse iteration builds on power iteration.
An extra feature of inverse iteration is the possibility of introducing a user-specified shift \(s\).
# Eigenvalues from Demo 8.3.4
# s=0.7
@. abs((1,-0.75,0.6,-0.4,0)-0.7)^(-1)
## (3.333333333333333, 0.6896551724137931, 10.000000000000002, 0.9090909090909091, 1.4285714285714286)
# s=0.1
@. abs((1,-0.75,0.6,-0.4,0)-0.1)^(-1)
## (1.1111111111111112, 1.1764705882352942, 2.0, 2.0, 10.0)
# s=-0.9
@. abs((1,-0.75,0.6,-0.4,0)-(-0.9))^(-1)
## (0.5263157894736842, 6.666666666666666, 0.6666666666666666, 2.0, 1.1111111111111112)
Convergence is best when \(s\) is close to the target eigenvalue.
The algorithms behind power iteration and inverse iteration are relatively simple.
The only new things from Julia are normalize() and findmax().
The steps for power iteration are:
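A minimal sketch of those steps, written to mirror what FNC.poweriter does (a random starting vector, then repeated multiply-and-normalize; normalize comes from LinearAlgebra):
using LinearAlgebra

function powersketch(A, numiter)
    n = size(A, 1)
    x = normalize(randn(n), Inf)    # random starting vector with ∞-norm 1
    β = zeros(numiter)              # running eigenvalue estimates
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))         # index of the largest-magnitude entry
        β[k] = y[m] / x[m]          # current estimate of the dominant eigenvalue
        x = y / y[m]                # renormalize for the next iteration
    end
    return β, x
end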
If you use inverse iteration, you have to modify the steps as follows:
Strictly speaking, inverse iteration does not require solving for \(\mathbf{y}\) repeatedly as \(\mathbf{A}\) stays fixed! So improve algorithm by storing a factorization of \(\mathbf{A}\).
If you use inverse iteration with a user-specified shift \(s\), you have to modify the steps as follows:
Again, since \(s\) is specified only once, \(\mathbf{A}-s\mathbf{I}\) stays fixed!
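A sketch of those shifted steps, factoring \(\mathbf{A}-s\mathbf{I}\) once outside the loop (this is essentially the invitermod function written for the exercises further below):
using LinearAlgebra

function invitersketch(A, s, numiter)
    n = size(A, 1)
    x = normalize(randn(n), Inf)
    β = zeros(numiter)
    fact = lu(A - s*I)         # factor once; A and s never change
    for k in 1:numiter
        y = fact \ x           # only triangular solves inside the loop
        m = argmax(abs.(y))
        β[k] = x[m]/y[m] + s   # invert and unshift the eigenvalue estimate
        x = y / y[m]
    end
    return β, x
end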
Things are different with dynamic shifting.
A = [1.1 1; 0.1 2.4];
β,x = FNC.poweriter(A, 20);
true1 = 2.472841614740048;
true2 = 1.027158385259952;
for k in 2:20
    @show (β[k] - β[k-1], (β[k-1]-true1)*(true2-true1)/true1)
end
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.032945282730319825, 0.033510261343985916)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.014148863119438637, 0.014249649402727224)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00596024520555849, 0.005977880597941828)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.0024903159116682616, 0.0024933765820342583)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.0010369480833101186, 0.0010374774544289079)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00043116100686280134, 0.0004312524358320103)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (0.00017916946441687287, 0.00017918524595950816)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (7.443570483234296e-5, 7.443842821006927e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (3.092103852297967e-5, 3.092150844084213e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.2844237715636808e-5, 1.2844318796215859e-5)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (5.335251679117192e-6, 5.335265668725579e-6)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (2.216145624345245e-6, 2.2161480383932107e-6)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (9.205350992580463e-7, 9.205355154947661e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (3.8236828103777043e-7, 3.823683529832072e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.588265625684926e-7, 1.5882657493870146e-7)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (6.597270907349184e-8, 6.597271137842712e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (2.7403463942476947e-8, 2.74034643460236e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (1.138273386658284e-8, 1.138273418071293e-8)
## (β[k] - β[k - 1], ((β[k - 1] - true1) * (true2 - true1)) / true1) = (4.7281116799524625e-9, 4.728111677409651e-9)
Describe what happens during power iteration using the given matrix and starting vector.
A simple but interesting exercise meant to look into a violation of the dominant eigenvalue condition: \[|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n|\]
The given matrix has eigenvalues 1 and -1.
Adjust the code to accept a user-specified starting vector.
The code below foreshadows the basis of Krylov methods (compare to the first part of Section 8.4).
# Modify function to have user-specified starting vector
function poweriterinit(A,numiter,init)
    n = size(A,1)
    x = init
    β = zeros(numiter)
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))
        β[k] = y[m]/x[m]
        x = y/y[m]
    end
    return β,x
end
## poweriterinit (generic function with 1 method)
A = [0 1; 1 0];
x = [0.4; 0.7];
evechist = [x zeros(2,7)];
for m in 1:7
    evechist[:,m+1] = A*evechist[:,m]
end
evechist
## 2×8 Matrix{Float64}:
## 0.4 0.7 0.4 0.7 0.4 0.7 0.4 0.7
## 0.7 0.4 0.7 0.4 0.7 0.4 0.7 0.4
β,fev = poweriterinit(A, 20, x)
## ([1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998, 1.7499999999999998], [0.5714285714285715, 1.0])
This exercise concerns the matrix FNC.poisson(n). Make a spy plot of \(\mathbf{A}\). What is the density of \(\mathbf{A}\)? Most rows have the same number of nonzeros; find this number. Then use eigs for \(n=10,15,20,25\).
# Exercise (a)
n = 10;
A = FNC.poisson(n);
spy(A)
nnz(A)/(n^2*n^2)
## 0.046
for k in 1:10
    @show nnz(A[k,:])
end
## nnz(A[k, :]) = 3
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 4
## nnz(A[k, :]) = 3
# Exercise (b)
for n in 10:5:25
    A = FNC.poisson(n);
    ev, evec = eigs(A, nev = 1, which=:LM);
    @show (n, ev)
end
## (n, ev) = (10, [96.09246335392518])
## (n, ev) = (15, [205.5122013714175])
## (n, ev) = (20, [355.4648631179063])
## (n, ev) = (25, [545.9473932882664])
# Exercise (c)
plt = plot(layout=(2,2), legend=:none);
for j in 1:4
    n = 5*j+5;
    A = FNC.poisson(n);
    β,x = FNC.poweriter(A,100);
    ev, evec = eigs(A, nev = 1, which=:LM);
    err = @. ev - β
    plot!(0:99,subplot=j,abs.(err),m=:o,xlabel="k",
        yaxis=(L"|\beta_k-\lambda_1|",:log10), title=latexstring("n=$n"))
end
plt
For symmetric matrices, the Rayleigh quotient converts an \(O(\epsilon)\) eigenvector estimate into an \(O(\epsilon^2)\) eigenvalue estimate.
Create a new function powersym that uses the Rayleigh quotient to produce the entries of β. Run FNC.poweriter and the new powersym on the matrix matrixdepot("fiedler",100), and plot convergence curves. Avoid a log scale here. Fix the starting vector, since the starting vectors in FNC.poweriter are random.
function powersym(A,numiter,init)
    n = size(A,1)
    x = init
    β = zeros(numiter)
    for k in 1:numiter
        y = A*x
        m = argmax(abs.(y))
        β[k] = dot(x, y)/dot(x, x)
        x = y/y[m]
    end
    return β,x
end
## powersym (generic function with 1 method)
A = matrixdepot("fiedler",100);
β,x = poweriterinit(A, 100, normalize(fill(1, 100), Inf));
ev, evec = eigs(A, nev = 1, which=:LM);
err = @. ev - β
## 100-element Vector{Float64}:
## -1476.3155787507021
## 140.68442124929743
## -25.915578750702025
## 4.696079724551964
## -0.8633200561712329
## 0.15863178870995398
## -0.02916308717476568
## 0.005361340315175767
## -0.0009856477900029859
## 0.00018120501181329018
## ⋮
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -9.094947017729282e-13
## -1.3642420526593924e-12
## -9.094947017729282e-13
## -9.094947017729282e-13
p1 = plot(0:99,abs.(err),m=:o,xlabel="k",ylabel=(L"|\beta_k-\lambda_1|"), title="Power iteration");
β,x = powersym(A, 100, normalize(fill(1, 100), Inf));
err = @. ev - β
## 100-element Vector{Float64}:
## 140.68442124929788
## 4.696079724553329
## 0.15863178870949923
## 0.005361340314266272
## 0.00018120501272278489
## 6.1244581956998445e-6
## 2.069969013973605e-7
## 6.996742740739137e-9
## 2.346496330574155e-10
## 7.73070496506989e-12
## ⋮
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
## 0.0
## -4.547473508864641e-13
## -4.547473508864641e-13
## -4.547473508864641e-13
p2 = plot(0:99,abs.(err),m=:o,xlabel="k",ylabel=(L"|\beta_k-\lambda_1|"), title="Rayleigh mod");
plot(p1, p2, layout=(1,2),legend=:none)
# Exercises (a) and (b)
plt = plot(layout=(2,2), legend=:none);
for j in 1:4
    n = 5*j+5;
    A = FNC.poisson(n);
    β,x = FNC.inviter(A, 0, 50);
    ev, evec = eigs(A, nev = 1, which=:SM);
    err = @. ev - β
    plot!(0:49,subplot=j,abs.(err),m=:o,xlabel="k",
        yaxis=(L"|\beta_k-\lambda_1|",:log10), title=latexstring("n=$n"))
end
plt
# Exercise (c)
n = 25; A = FNC.poisson(n);
β,v = FNC.inviter(A, 0, 50);
surface(reshape(v,n,n))
# Exercise (a)
# Modified inviter with fixed initial condition
function invitermod(A,s,numiter)
    n = size(A,1)
    x = normalize(fill(1, n), Inf)
    β = zeros(numiter)
    fact = lu(A - s*I)
    for k in 1:numiter
        y = fact\x
        normy,m = findmax(abs.(y))
        β[k] = x[m]/y[m] + s
        x = y/y[m]
    end
    return β,x
end
## invitermod (generic function with 1 method)
# Modified inviter with fixed initial condition and dynamic shifting
function inviterdyn(A,sinit,numiter)
    n = size(A,1)
    x = normalize(fill(1, n), Inf)
    β = zeros(numiter)
    s = sinit
    for k in 1:numiter
        fact = lu(A - s*I)
        y = fact\x
        normy,m = findmax(abs.(y))
        β[k] = x[m]/y[m] + s
        x = y/y[m]
        s = β[k]
    end
    return β,x,s
end
## inviterdyn (generic function with 1 method)
# Test
n = 25; A = FNC.poisson(n);
β,x = invitermod(A, 0, 50);
β
## 50-element Vector{Float64}:
## 1.3769149936297793
## 1.8356457751526984
## 1.9600789508000607
## 1.9894647227710935
## 1.995877516721149
## 1.9972213065749524
## 1.9974973885430662
## 1.9975535792896675
## 1.9975649648693654
## 1.9975672669286766
## ⋮
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
## 1.9975678494961702
β,x = inviterdyn(A, 0, 50);
β
## 50-element Vector{Float64}:
## 1.3769149936297793
## 1.9385601558548915
## 1.9970603088150922
## 1.9975678147111608
## 1.9975678494961906
## 1.9975678494961397
## 1.997567849496145
## 1.9975678494961504
## 1.9975678494961557
## 1.997567849496161
## ⋮
## 1.997567849496163
## 1.9975678494961684
## 1.9975678494961175
## 1.9975678494961229
## 1.9975678494961282
## 1.9975678494961335
## 1.9975678494961389
## 1.9975678494961442
## 1.9975678494961495
# Exercise (b)
A = diagm(0=>(1:100).^2, 1=>rand(99));
β,x = invitermod(A, 920, 50);
# Limit to 4 iterations: once the dynamic shift lands exactly on the eigenvalue, the factorization of A - s*I fails
β,x = invitermod(A, 920, 4);
βd,x = inviterdyn(A, 920, 4);
err = @. log10(abs(900-β));
errd = @. log10(abs(900-βd));
FNC.pretty_table((k=1:4,err=err,errd=errd),["Iteration number" "Log10 Err" "Log10 Err Dynamic"])
## ┌──────────────────┬───────────┬───────────────────┐
## │ Iteration number │ Log10 Err │ Log10 Err Dynamic │
## ├──────────────────┼───────────┼───────────────────┤
## │ 1 │ -1.46103 │ -1.46103 │
## │ 2 │ -1.77468 │ -4.70917 │
## │ 3 │ -2.08686 │ -11.2039 │
## │ 4 │ -2.39894 │ -Inf │
## └──────────────────┴───────────┴───────────────────┘
# Exercise (c)
β,x = invitermod(A, 420.5, 4);
βd,x = inviterdyn(A, 420.5, 4);
err = @. log10(abs(441-β));
errd = @. log10(abs(441-βd));
FNC.pretty_table((k=1:4,err=err,errd=errd),["Iteration number" "Log10 Err" "Log10 Err Dynamic"])
## ┌──────────────────┬───────────┬───────────────────┐
## │ Iteration number │ Log10 Err │ Log10 Err Dynamic │
## ├──────────────────┼───────────┼───────────────────┤
## │ 1 │ -1.01805 │ -1.01805 │
## │ 2 │ 1.60219 │ -3.66881 │
## │ 3 │ -2.00074 │ -8.97167 │
## │ 4 │ 1.6022 │ -Inf │
## └──────────────────┴───────────┴───────────────────┘
Setup:
The name Krylov is used in many senses:
For systems of linear equations \(\mathbf{A}\mathbf{x}=\mathbf{b}\), the idea behind the dimension reduction can be seen from: \[\min_{\mathbf{x}\in\mathcal{K}_m} \| \mathbf{A}\mathbf{x}-\mathbf{b} \| = \min_{\mathbf{z}\in\mathbb{C}^m} \| \mathbf{A}(\mathbf{K}_m\mathbf{z})-\mathbf{b} \| = \min_{\mathbf{z}\in\mathbb{C}^m} \| (\mathbf{A}\mathbf{K}_m)\mathbf{z}-\mathbf{b} \|.\]
Krylov matrices tend to have columns that become more and more nearly linearly dependent as \(m\) grows.
Solution to the instability of \(\mathbf{K}_m\)?
The solution leads to the idea behind Arnoldi iteration.
Of course, the iteration has to start somewhere.
Arnoldi iteration also leads to a crucial identity used after Section 8.4. In particular, \[\begin{split} \mathbf{A}\mathbf{Q}_m &= \begin{bmatrix} \mathbf{A}\mathbf{q}_1 & \cdots & \mathbf{A}\mathbf{q}_m \end{bmatrix}\\ & = \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_{m+1} \end{bmatrix}\: \begin{bmatrix} H_{11} & H_{12} & \cdots & H_{1m} \\ H_{21} & H_{22} & \cdots & H_{2m} \\ & H_{32} & \ddots & \vdots \\ & & \ddots & H_{mm} \\ & & & H_{m+1,m} \end{bmatrix} = \mathbf{Q}_{m+1} \mathbf{H}_m, \end{split}\]
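A minimal sketch of the iteration behind that identity (modified Gram–Schmidt applied to each new Krylov direction; FNC.arnoldi is the book's version, used in the exercises below):
using LinearAlgebra

function arnoldisketch(A, u, m)
    n = length(u)
    Q = zeros(n, m+1)
    H = zeros(m+1, m)
    Q[:,1] = u / norm(u)             # normalized seed vector
    for j in 1:m
        v = A*Q[:,j]                 # next Krylov direction
        for i in 1:j
            H[i,j] = dot(Q[:,i], v)  # component along an existing basis vector
            v -= H[i,j]*Q[:,i]       # subtract that component
        end
        H[j+1,j] = norm(v)
        Q[:,j+1] = v / H[j+1,j]      # breakdown if this norm is zero
    end
    return Q, H
end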
Calculate 2-norm condition numbers \(\kappa(\mathbf{K}_m)\) for \(m=1,\ldots,10\). Use a vector of all ones as the Krylov seed.
# (a)
λ = @. 10 + (1:100);
A = triu(rand(100,100),1) + diagm(λ);
e = fill(1, 100);
K = [e zeros(100,9)];
for m in 2:10
    v = A*K[:,m-1];
    K[:,m] = v/norm(v);
    @show (m, cond(K[:,1:m],2))
end
## (m, cond(K[:, 1:m], 2)) = (2, 60.52427909961785)
## (m, cond(K[:, 1:m], 2)) = (3, 537.8062210977873)
## (m, cond(K[:, 1:m], 2)) = (4, 3617.771740613321)
## (m, cond(K[:, 1:m], 2)) = (5, 24604.52268268676)
## (m, cond(K[:, 1:m], 2)) = (6, 166723.84596720053)
## (m, cond(K[:, 1:m], 2)) = (7, 1.0109854318238025e6)
## (m, cond(K[:, 1:m], 2)) = (8, 6.579238635058788e6)
## (m, cond(K[:, 1:m], 2)) = (9, 4.439105993720062e7)
## (m, cond(K[:, 1:m], 2)) = (10, 3.1488560063742316e8)
# (b)
A = spdiagm(0 => fill(-2,100), 1=> fill(1,99), -1 => fill(1, 99));
e = fill(1, 100);
K = [e zeros(100,9)];
for m in 2:10
    v = A*K[:,m-1];
    K[:,m] = v/norm(v);
    @show (m, cond(K[:,1:m],2))
end
## (m, cond(K[:, 1:m], 2)) = (2, 10.103565741473805)
## (m, cond(K[:, 1:m], 2)) = (3, 31.241956559277746)
## (m, cond(K[:, 1:m], 2)) = (4, 118.59280125956582)
## (m, cond(K[:, 1:m], 2)) = (5, 512.6146509071374)
## (m, cond(K[:, 1:m], 2)) = (6, 2376.2500907557546)
## (m, cond(K[:, 1:m], 2)) = (7, 11500.9753932114)
## (m, cond(K[:, 1:m], 2)) = (8, 57324.73114698661)
## (m, cond(K[:, 1:m], 2)) = (9, 291891.3007260457)
## (m, cond(K[:, 1:m], 2)) = (10, 1.5105902195492175e6)
# (c)
# Does not work! The all-ones seed vector lies in the null space of A, so the first Krylov step returns the zero vector.
A = spdiagm(0 => fill(-2,200), 1=> fill(1,199), -1 => fill(1, 199));
A[200,1] = 1; A[1,200]=1;
e = fill(1, 200);
K = [e zeros(200,9)];
norm(A*K[:,1])
## 0.0
Apply Arnoldi iteration to \(\mathbf{A}=\displaystyle \begin{bmatrix} 2& 1& 1& 0\\ 1 &3 &1& 0\\ 0& 1& 3& 1\\ 0& 1& 1& 2 \end{bmatrix}\) to show that the results can depend strongly on the initial vector \(\mathbf{u}\).
A = [2 1 1 0; 1 3 1 0; 0 1 3 1; 0 1 1 2];
Q, H=FNC.arnoldi(A, [1; 0; 0; 0], 10);
Q
## 4×11 Matrix{Float64}:
## 1.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0
## 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
## 0.0 0.0 0.707107 0.707107 -0.743294 0.840297 0.542127 0.813733
## 0.0 0.0 0.707107 -0.707107 -0.668965 0.542127 -0.840297 0.581238
H
## 11×10 Matrix{Float64}:
## 2.0 1.0 0.707107 0.707107 … 0.840297 0.542127
## 1.0 3.0 0.707107 0.707107 0.840297 0.542127
## 0.0 1.41421 3.5 0.5 3.52674 -0.249172
## 0.0 0.0 0.5 1.5 0.805017 1.36086
## 0.0 0.0 0.0 2.9873e-15 5.52899e-16 2.6077e-15
## 0.0 0.0 0.0 0.0 … -4.95133e-17 -2.55819e-16
## 0.0 0.0 0.0 0.0 1.11057e-31 3.89754e-31
## 0.0 0.0 0.0 0.0 -3.24255e-33 -4.53957e-33
## 0.0 0.0 0.0 0.0 -1.59133e-47 -5.51771e-47
## 0.0 0.0 0.0 0.0 1.94743e-48 3.48682e-48
## 0.0 0.0 0.0 0.0 … 0.0 5.22777e-63
Q, H=FNC.arnoldi(A, fill(1,4), 10);
Q
## 4×11 Matrix{Float64}:
## 0.5 -0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 -0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
H
## 11×10 Matrix{Float64}:
## 4.5 1.5 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.5 1.5 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN
The iteration fails at the second step because \(\mathbf{A}\mathbf{q}_2\) lies in the column space of the first two columns of Q; subtracting the projection leaves the zero vector, and normalizing it produces NaN.
Arnoldi iteration can be used to find eigenvalues approximately
Solve \(\mathbf{A}\mathbf{x}=\lambda\mathbf{x}\) over \(\mathcal{K}_m\).
Show that \[\mathbf{Q}_m^* \mathbf{A} \mathbf{Q}_m = \tilde{\mathbf{H}}_m \] where \(\tilde{\mathbf{H}}_m\) is the upper Hessenberg matrix resulting from deleting the last row of \(\mathbf{H}_m\). What is the size of this matrix?
Show that \(\tilde{\mathbf{H}}_m\mathbf{z} \approx \lambda\mathbf{z}\).
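A quick numerical check of the first identity, assuming FNC.arnoldi returns \(\mathbf{Q}\) of size \(n\times(m+1)\) and \(\mathbf{H}\) of size \((m+1)\times m\) as in the calls below:
using LinearAlgebra
n, m = 100, 10;
A = randn(n, n);
Q, H = FNC.arnoldi(A, randn(n), m);
# Q_m' * A * Q_m should reproduce H_m with its last row deleted
norm(Q[:,1:m]'*A*Q[:,1:m] - H[1:m, :])    # should be near machine epsilon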
Apply Arnoldi iteration to the matrix in Demo 8.4.3. Keep track of the error between the largest of those values (in magnitude) and the largest eigenvalue of \(\mathbf{A}\). Make a log-linear graph of the error as a function of the iteration number.
# eigs on the Hessenberg matrix can return complex values, hence the abs() below
λ = @. 10 + (1:100);
A = triu(rand(100,100),1) + diagm(λ);
u = randn(100);
abserr = [];
for m in 3:42
    Q, H = FNC.arnoldi(A, u, m);
    maxev, evec = eigs(H[1:m, 1:m], nev=1, which=:LM);
    err = abs(110-reduce(vcat, maxev));
    push!(abserr, err);
    @show (m, err)
end
## (m, err) = (3, 10.30476902007355)
## (m, err) = (4, 7.624138284390398)
## (m, err) = (5, 6.241967450768371)
## (m, err) = (6, 4.591295962952813)
## (m, err) = (7, 3.564663615901864)
## (m, err) = (8, 3.077658570265342)
## (m, err) = (9, 2.7246517984362697)
## (m, err) = (10, 2.7264129088965348)
## (m, err) = (11, 2.703391095534073)
## (m, err) = (12, 2.7614422770389524)
## (m, err) = (13, 3.492215696318954)
## (m, err) = (14, 1.997635406108433)
## (m, err) = (15, 0.4304597119436835)
## (m, err) = (16, 0.11525073828454424)
## (m, err) = (17, 0.13904342298243932)
## (m, err) = (18, 0.07172341681720695)
## (m, err) = (19, 0.026237325059994987)
## (m, err) = (20, 0.06009362265702123)
## (m, err) = (21, 0.06540846121528432)
## (m, err) = (22, 0.06146818450545766)
## (m, err) = (23, 0.03744358778089918)
## (m, err) = (24, 0.031755935749473)
## (m, err) = (25, 0.029159286096756887)
## (m, err) = (26, 0.026075217475565182)
## (m, err) = (27, 0.021711300320092164)
## (m, err) = (28, 0.019309287405448572)
## (m, err) = (29, 0.02017844186224238)
## (m, err) = (30, 0.01897714284304186)
## (m, err) = (31, 0.014297987935265155)
## (m, err) = (32, 0.00997514548848244)
## (m, err) = (33, 0.0071441668491729615)
## (m, err) = (34, 0.003748673901995403)
## (m, err) = (35, 0.002990287449904372)
## (m, err) = (36, 0.003324325613931478)
## (m, err) = (37, 0.0027973247574237803)
## (m, err) = (38, 0.0021182900585046127)
## (m, err) = (39, 0.001627673277255326)
## (m, err) = (40, 0.0009376686559363634)
## (m, err) = (41, 0.00029249026630395747)
## (m, err) = (42, 9.171003235053377e-5)
plot(3:42, abserr, m=:o, title="Convergence",
    xlabel="Iteration number", yaxis=(L"|\lambda_m-110|", :log10))