DeprecationWarning: scipy.arange is deprecated and will be removed in SciPy 2.0.0, use numpy.arange instead. Line [3] [17].
DeprecationWarning: scipy.vander is deprecated and will be removed in SciPy 2.0.0, use numpy.vander instead. Line [5].
DeprecationWarning: scipy.linspace is deprecated and will be removed in SciPy 2.0.0, use numpy.linspace instead. Lines [8], [9], [18], [19].
DeprecationWarning: scipy.array is deprecated and will be removed in SciPy 2.0.0, use numpy.array instead. Line [10] [11] [12] [19], [29], [30], [31], [38], [40], [43], [49], [50], [51], [60], [65].
DeprecationWarning: scipy.sqrt, scipy.exp, scipy.log are deprecated and will be removed in SciPy 2.0.0, use numpy. [11]
Also need to update hstack and vstack. [13, [14], [52].
DeprecationWarning: scipy.ones is deprecated and will be removed in SciPy 2.0.0, use numpy.ones instead. [27], [45], [47], [48], [49].
DeprecationWarning: scipy.diag. [32], [47], [49]
DeprecationWarning: scipy.eye. [61]
DeprecationWarning: scipy.cos [39].
DeprecationWarning: scipy.rand and scipy.floor are deprecated. [44] Use np.random.rand.
DeprecationWarning: scipy.outer and scipy.dot. [62], [64]
Up through line [69]. More after that.
from scipy import *
from matplotlib.pyplot import *
from scipy.linalg import *
from numpy.linalg import *
import scipy.sparse as sparse
from scipy.sparse.linalg import splu
from timeit import default_timer as timer
import FNC
# This (optional) block is for improving the display of plots.
from IPython.display import set_matplotlib_formats
set_matplotlib_formats("svg","pdf")
rcParams["figure.figsize"] = [7,4]
rcParams["lines.linewidth"] = 2
rcParams["lines.markersize"] = 4
rcParams['animation.html'] = "jshtml" # or try "html5"
We create two vectors for data about the population of China. The first has the years of census data, the other has the numbers of millions of people.
year = np.arange(1980,2020,10)
pop = [984.736, 1148.364, 1263.638, 1330.141]
It's convenient to measure time in years since 1980.
t = year - 1980
y = pop
Now we have four data points $(t_1,y_1),\dots,(t_4,y_4)$, so $n=4$ and we seek an interpolating cubic polynomial. We construct the associated Vandermonde matrix:
V = np.vander(t)
print(V)
[[ 0 0 0 1] [ 1000 100 10 1] [ 8000 400 20 1] [27000 900 30 1]]
To solve for the vector of polynomial coefficients, we use solve
(from scipy.linalg
):
c = solve(V,y)
print(c)
[-6.95000e-05 -2.39685e-01 1.87666e+01 9.84736e+02]
The algorithms used by the backslash operator are the main topic of this chapter. For now, observe that the coefficients of the cubic polynomial vary over several orders of magnitude, which is typical in this context. By our definitions, these coefficients are given in descending order of power in $t$.
We can use the resulting polynomial to estimate the population of China in 2005:
p = poly1d(c) # construct a polynomial
p(2005-1980) # apply the 1980 time shift
1303.0119375
The official figure is 1297.8, so our result is not bad.
We can visualize the interpolation process. First, we plot the data as points. Then we add a plot of the interpolant, taking care to shift the $t$ variable back to actual years.
scatter(year,y,color="k",label="data")
tt = np.linspace(0,30,300) # 300 times from 1980 to 2010
plot(1980+tt,p(tt),label="interpolant")
xlabel("year");
ylabel("population (millions)");
title("Population of China")
legend();
Let's redo it, this time continuing the curve outside of the original date range.
scatter(year,y,color="k",label="data")
tt = np.linspace(-10,50,300) # 300 times from 1980 to 2010
plot(1980+tt,p(tt),label="interpolant")
xlabel("year");
ylabel("population (millions)");
title("Population of China")
legend();
While the interpolation is plausible, the extrapolation to the future is highly questionable! As a rule, extrapolation more than a short distance beyond the original interval is not reliable.
While Python (specifically NumPy) does have distinct representations for matrices and 2D arrays, use of the explicit matrix class is officially discouraged. We follow this advice here and use arrays to represent matrices and vectors.
A vector is created using square brackets and commas to enclose and separate its entries.
x = np.array([3, 3, 0, 1, 0 ])
x.shape
(5,)
To construct a matrix, you nest the brackets to create a "vector of vectors". The inner vectors are the rows.
A = np.array([ [1,2,3,4,5],
[50,40,30,20,10],
[pi,np.sqrt(2),np.exp(1),(1+np.sqrt(5))/2,np.log(3)]
])
print(A)
A.shape
[[ 1. 2. 3. 4. 5. ] [50. 40. 30. 20. 10. ] [ 3.14159265 1.41421356 2.71828183 1.61803399 1.09861229]]
(3, 5)
In the text we default to all vectors being equivalent to matrices with a single column. That isn't quite true in Python, because even an $n\times1$ array has two dimensions, unlike a vector.
np.array([[3],[1],[2]]).shape
(3, 1)
You can also concatenate arrays with compatible dimensions, not just numbers, using hstack
and vstack
.
print( np.hstack([A,A]) )
[[ 1. 2. 3. 4. 5. 1. 2. 3. 4. 5. ] [50. 40. 30. 20. 10. 50. 40. 30. 20. 10. ] [ 3.14159265 1.41421356 2.71828183 1.61803399 1.09861229 3.14159265 1.41421356 2.71828183 1.61803399 1.09861229]]
print( np.vstack([A,A]) )
[[ 1. 2. 3. 4. 5. ] [50. 40. 30. 20. 10. ] [ 3.14159265 1.41421356 2.71828183 1.61803399 1.09861229] [ 1. 2. 3. 4. 5. ] [50. 40. 30. 20. 10. ] [ 3.14159265 1.41421356 2.71828183 1.61803399 1.09861229]]
Transposing a matrix is done by appending .T
to it. For matrices with complex values, we often want instead the hermitian, which is .conj().T
. They are the same for a real matrix.
A.T
array([[ 1. , 50. , 3.14159265], [ 2. , 40. , 1.41421356], [ 3. , 30. , 2.71828183], [ 4. , 20. , 1.61803399], [ 5. , 10. , 1.09861229]])
x.conj().T
array([3, 3, 0, 1, 0])
There are many convenient shorthand ways of building vectors and matrices other than entering all of their entries directly or in a loop. To get a vector with evenly spaced entries between two endpoints, you have two options.
print(np.arange(1,7,2)) # from 1 to 7 (not inclusive), step by 2
[1 3 5]
print(np.linspace(-1,1,5)) # from -1 to 1 (inclusive), with 5 total values
[-1. -0.5 0. 0.5 1. ]
The practical difference between these is whether you want to specify the step size in arange
or the number of points in linspace
.
Accessing an element is done by giving one (for a vector) or two index values in square brackets. In Python, indexing always starts with zero, not 1.
A = np.array([ [1,2,3,4,5],[50,40,30,20,10], np.linspace(-5,5,5) ])
x = np.array([3, 2, 0, 1, -1 ])
print("A:",A)
print("x:",x)
A: [[ 1. 2. 3. 4. 5. ] [50. 40. 30. 20. 10. ] [-5. -2.5 0. 2.5 5. ]] x: [ 3 2 0 1 -1]
print("row 2, col 3 of A:",A[1,2])
print("first element of x:",x[0])
row 2, col 3 of A: 30.0 first element of x: 3
The indices can be ranges, in which case a slice or block of the matrix is accessed. You build these using a colon in the form start:stop
. However, the last value of this range is stop-1
, not stop
.
print(A[1:3,0:2]) # rows 2 and 3, cols 1 and 2
[[50. 40. ] [-5. -2.5]]
If start
or stop
is omitted, the range extends to the first or last index.
print(x[1:]) # elements 2 through the end
[ 2 0 1 -1]
print(A[:2,0]) # first two rows in column 1
[ 1. 50.]
Notice in the last case above that even when the slice is in the shape of a column vector, the result is just a vector with one dimension and neither row nor column shape.
There are more variations on the colon ranges. A negative value means to count from the end rather than the beginning. And a colon by itself means to include everything from the relevant dimension.
print(A[:-1,:]) # all rows up to the last, all columns
[[ 1. 2. 3. 4. 5.] [50. 40. 30. 20. 10.]]
Finally, start:stop:step
means to step size or stride other than one. You can mix this with the other variations.
print(x[::2]) # all the odd indexes
[ 3 0 -1]
print(A[:,::-1]) # reverse the columns
[[ 5. 4. 3. 2. 1. ] [10. 20. 30. 40. 50. ] [ 5. 2.5 0. -2.5 -5. ]]
The matrix and vector senses of addition, subtraction, and scalar multiplication and division are all handled by the usual symbols. Two matrices of the same size (what Python calls shape
) are operated on elementwise.
A - 2*np.ones([3,5]) # subtract two from each element
array([[-1. , 0. , 1. , 2. , 3. ], [48. , 38. , 28. , 18. , 8. ], [-7. , -4.5, -2. , 0.5, 3. ]])
If one operand has a smaller number of dimensions than the other, Python tries to broadcast it in the "missing" dimension(s), and the operation proceeds if the resulting shapes are identical.
A - 2 # subtract two from each element
array([[-1. , 0. , 1. , 2. , 3. ], [48. , 38. , 28. , 18. , 8. ], [-7. , -4.5, -2. , 0.5, 3. ]])
A - np.array([1,2,3,4,5]) # repeat this row for every row of A
array([[ 0. , 0. , 0. , 0. , 0. ], [49. , 38. , 27. , 16. , 5. ], [-6. , -4.5, -3. , -1.5, 0. ]])
A - np.array([1,2,3]) # broadcasting this would be 3x3, so it's an error
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-30-de6554c3079e> in <module> ----> 1 A - np.array([1,2,3]) # broadcasting this would be 3x3, so it's an error ValueError: operands could not be broadcast together with shapes (3,5) (3,)
A - np.array([1,2,3]).reshape([3,1]) # broadcasts to each column of A
array([[ 0. , 1. , 2. , 3. , 4. ], [48. , 38. , 28. , 18. , 8. ], [-8. , -5.5, -3. , -0.5, 2. ]])
Matrix–matrix and matrix–vector products are computed using @
or dot
.
B = np.diag([-1,0,-5]) # create a diagonal 3x3
print( B@A ) # matrix product
[[ -1. -2. -3. -4. -5. ] [ 0. 0. 0. 0. 0. ] [ 25. 12.5 0. -12.5 -25. ]]
$AB$ is undefined for these matrix sizes.
print( A@B )
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-33-643fffa63ee8> in <module> ----> 1 print( A@B ) ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 5)
A square matrix raised to an integer power is the same as repeated matrix multiplication.
The multiplication operator *
is reserved for elementwise multiplication. Both operands have to be the same size, after any potential broadcasts.
B*A
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-34-55e7d2e10b92> in <module> ----> 1 B*A ValueError: operands could not be broadcast together with shapes (3,3) (3,5)
(A/2)*A
array([[5.000e-01, 2.000e+00, 4.500e+00, 8.000e+00, 1.250e+01], [1.250e+03, 8.000e+02, 4.500e+02, 2.000e+02, 5.000e+01], [1.250e+01, 3.125e+00, 0.000e+00, 3.125e+00, 1.250e+01]])
To raise to a power elementwise, use a double star. This will broadcast as well.
B**3
array([[ -1, 0, 0], [ 0, 0, 0], [ 0, 0, -125]])
2.0**x
array([8. , 4. , 1. , 2. , 0.5])
Because elementwise and matrix multiplication are different, so are the following.
C = np.array([[1,2],[3,4]])
print(C**2)
print(C@C)
[[ 1 4] [ 9 16]] [[ 7 10] [15 22]]
Most of the mathematical functions, such as cos, sin, log, exp and sqrt, expecting scalars as operands will be broadcast to arrays.
print(np.cos(pi*x))
[-1. 1. 1. -1. -1.]
For a square matrix $A$, the command solve(A,B)
is mathematically equivalent to $A^{-1}b$.
A = np.array([[1,0,-1],[2,2,1],[-1,-3,0]])
b = np.array([1,2,3])
A, b
(array([[ 1, 0, -1], [ 2, 2, 1], [-1, -3, 0]]), array([1, 2, 3]))
x = solve(A,b)
print(x)
[ 2.14285714 -1.71428571 1.14285714]
One way to check the answer is to compute a quantity known as the residual. It is (hopefully) close to machine precision, scaled by the size of the entries of the data. Notice that the matrix–vector multiplication here is performed by dot
.
residual = b - A@x
print(residual)
[-4.4408921e-16 -4.4408921e-16 0.0000000e+00]
If the matrix $A$ is singular, you may get an error.
A = np.array([[0,1],[0,0]])
b = np.array([1,-1])
solve(A,b)
--------------------------------------------------------------------------- LinAlgError Traceback (most recent call last) <ipython-input-43-73192002ad0f> in <module> 1 A = np.array([[0,1],[0,0]]) 2 b = np.array([1,-1]) ----> 3 solve(A,b) <__array_function__ internals> in solve(*args, **kwargs) /dsc/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in solve(a, b) 391 signature = 'DD->D' if isComplexType(t) else 'dd->d' 392 extobj = get_linalg_error_extobj(_raise_linalgerror_singular) --> 393 r = gufunc(a, b, signature=signature, extobj=extobj) 394 395 return wrap(r.astype(result_t, copy=False)) /dsc/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag) 86 87 def _raise_linalgerror_singular(err, flag): ---> 88 raise LinAlgError("Singular matrix") 89 90 def _raise_linalgerror_nonposdef(err, flag): LinAlgError: Singular matrix
Detecting singularity is a lot like checking whether two floating point numbers are exactly equal: because of roundoff, it could be missed. We're headed toward a more robust way to fully describe the situation.
It's easy to get just the lower triangular part of any matrix using the tril
command.
A = 1 + np.floor(9*np.random.rand(5,5))
L = tril(A)
print(L)
[[7. 0. 0. 0. 0.] [8. 2. 0. 0. 0.] [9. 3. 2. 0. 0.] [5. 4. 6. 4. 0.] [3. 3. 2. 8. 6.]]
We'll set up and solve a linear system with this matrix.
b = np.ones(5)
x = FNC.forwardsub(L,b)
print(x)
[ 0.14285714 -0.07142857 -0.03571429 0.19642857 -0.11904762]
It's not clear what the error in this answer is. However, we can always check the residual.
b - L@x
array([0., 0., 0., 0., 0.])
Next we'll engineer a problem to which we know the exact answer.
alpha = 0.3;
beta = 2.2;
U = np.diag(np.ones(5)) + np.diag([-1,-1,-1,-1],k=1)
U[0,3:5] = [ alpha-beta, beta ]
print(U)
[[ 1. -1. 0. -1.9 2.2] [ 0. 1. -1. 0. 0. ] [ 0. 0. 1. -1. 0. ] [ 0. 0. 0. 1. -1. ] [ 0. 0. 0. 0. 1. ]]
x_exact = np.ones(5)
b = np.array([alpha,0,0,0,1])
x = FNC.backsub(U,b)
print("error:",x - x_exact)
error: [2.22044605e-16 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
Everything seems OK here. But another example, with a different value for $\beta$, is more troubling.
alpha = 0.3;
beta = 1e12;
U = np.diag(np.ones(5)) + np.diag([-1,-1,-1,-1],k=1)
U[0,[3,4]] = [ alpha-beta, beta ]
b = np.array([alpha,0,0,0,1])
x = FNC.backsub(U,b)
print("error:",x - x_exact)
error: [-4.8828125e-05 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
It's not so good to get four digits of accuracy after starting with sixteen! But the source of the error is not hard to track down. Solving for $x_1$ performs $(\alpha-\beta)+\beta$ in the first row. Since $|\alpha|$ is so much smaller than $|\beta|$, this a recipe for losing digits to subtractive cancellation.
We create a 4-by-4 linear system with the matrix
A = np.array([
[2, 0, 4, 3],
[-4, 5, -7, -10],
[1, 15, 2, -4.5],
[-2, 0, 2, -13]
])
and with the right-hand side
b = np.array([ 4, 9, 29, 40 ])
We define an augmented matrix by tacking $b$ on the end as a new column.
S = np.hstack([A,b.reshape(4,1)])
print(S)
[[ 2. 0. 4. 3. 4. ] [ -4. 5. -7. -10. 9. ] [ 1. 15. 2. -4.5 29. ] [ -2. 0. 2. -13. 40. ]]
The goal is to introduce zeros into the lower triangle of this matrix. By using only elementary row operations, we ensure that the matrix $S$ always represents a linear system that is equivalent to the original. We proceed from left to right and top to bottom. The first step is to put a zero in the (2,1) location using a multiple of row 1:
mult21 = S[1,0]/S[0,0]
S[1,:] -= mult21*S[0,:] # -= means "subtract the following from"
print(S)
[[ 2. 0. 4. 3. 4. ] [ 0. 5. 1. -4. 17. ] [ 1. 15. 2. -4.5 29. ] [ -2. 0. 2. -13. 40. ]]
We repeat the process for the (3,1) and (4,1) entries.
mult31 = S[2,0]/S[0,0];
S[2,:] -= mult31*S[0,:];
mult41 = S[3,0]/S[0,0];
S[3,:] -= mult41*S[0,:];
print(S)
[[ 2. 0. 4. 3. 4.] [ 0. 5. 1. -4. 17.] [ 0. 15. 0. -6. 27.] [ 0. 0. 6. -10. 44.]]
The first column has the zero structure we want. To avoid interfering with that, we no longer add multiples of row 1 to anything. Instead, to handle column 2, we use multiples of row 2. We'll also exploit the highly repetitive nature of the operations to write them as a loop.
for i in range(2,4):
mult = S[i,1]/S[1,1]
S[i,:] -= mult*S[1,:]
print(S)
[[ 2. 0. 4. 3. 4.] [ 0. 5. 1. -4. 17.] [ 0. 0. -3. 6. -24.] [ 0. 0. 6. -10. 44.]]
We finish out the triangularization with a zero in the (4,3) place. It's a little silly to use a loop for just one iteration, but the point is to establish a pattern.
for i in range(3,4):
mult = S[i,2]/S[2,2]
S[i,:] -= mult*S[2,:]
print(S)
[[ 2. 0. 4. 3. 4.] [ 0. 5. 1. -4. 17.] [ 0. 0. -3. 6. -24.] [ 0. 0. 0. 2. -4.]]
Recall that $S$ is an augmented matrix: it represents the system $Ux=z$, where
U = S[:,:-1]
z = S[:,-1:].flatten() # turn it into a proper vector
print("U=",U)
print("z=",z)
U= [[ 2. 0. 4. 3.] [ 0. 5. 1. -4.] [ 0. 0. -3. 6.] [ 0. 0. 0. 2.]] z= [ 4. 17. -24. -4.]
The solutions to this system are identical to those of the original system, but this one can be solved by backward substitution.
x = FNC.backsub(U,z)
print(x)
[-3. 1. 4. -2.]
print(b - A@x)
[0. 0. 0. 0.]
We revisit the previous example using algebra to express the row operations on $A$.
A = np.array([[2, 0, 4, 3],[-4, 5, -7, -10],[1, 15, 2, -4.5],[-2, 0, 2, -13]])
We use the identity and its columns heavily.
I = np.eye(4)
print(I)
[[1. 0. 0. 0.] [0. 1. 0. 0.] [0. 0. 1. 0.] [0. 0. 0. 1.]]
<ipython-input-61-4052b55cfe46>:1: DeprecationWarning: scipy.eye is deprecated and will be removed in SciPy 2.0.0, use numpy.eye instead I = eye(4)
The first step is to put a zero in the (2,1) location using a multiple of row 1:
mult21 = A[1,0]/A[0,0]
L21 = I - mult21*np.outer(I[:,1],I[:,0])
A = np.dot(L21,A)
print(A)
[[ 2. 0. 4. 3. ] [ 0. 5. 1. -4. ] [ 1. 15. 2. -4.5] [ -2. 0. 2. -13. ]]
<ipython-input-62-a0143ed1b32e>:2: DeprecationWarning: scipy.outer is deprecated and will be removed in SciPy 2.0.0, use numpy.outer instead L21 = I - mult21*outer(I[:,1],I[:,0]) <ipython-input-62-a0143ed1b32e>:3: DeprecationWarning: scipy.dot is deprecated and will be removed in SciPy 2.0.0, use numpy.dot instead A = dot(L21,A)
print(L21)
[[1. 0. 0. 0.] [2. 1. 0. 0.] [0. 0. 1. 0.] [0. 0. 0. 1.]]
We repeat the process for the (3,1) and (4,1) entries.
mult31 = A[2,0]/A[0,0]
L31 = I - mult31*np.outer(I[:,2],I[:,0])
A = np.dot(L31,A)
mult41 = A[3,0]/A[0,0];
L41 = I - mult41*np.outer(I[:,3],I[:,0])
A = np.dot(L41,A)
print(A)
[[ 2. 0. 4. 3.] [ 0. 5. 1. -4.] [ 0. 15. 0. -6.] [ 0. 0. 6. -10.]]
And so on, following the pattern as before.
A = np.array([[2, 0, 4, 3],[-4, 5, -7, -10],[1, 15, 2, -4.5],[-2, 0, 2, -13]])
L,U = FNC.lufact(A)
print("L=",L)
print("U=",U)
L= [[ 1. 0. 0. 0. ] [-2. 1. 0. 0. ] [ 0.5 3. 1. 0. ] [-1. 0. -2. 1. ]] U= [[ 2. 0. 4. 3.] [ 0. 5. 1. -4.] [ 0. 0. -3. 6.] [ 0. 0. 0. 2.]]
It's best to compare two floating-point quantities by taking their difference.
LtimesU = L@U
print(LtimesU - A)
[[0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.]]
(Usually we can expect "zero" only up to machine precision. However, all the exact numbers in this example are also floating-point numbers.)
To solve a linear system, we no longer need the matrix $A$.
b = [4,9,29,40]
z = FNC.forwardsub(L,b)
x = FNC.backsub(U,z)
print(x)
[-3. 1. 4. -2.]
print(b - A@x)
[0. 0. 0. 0.]
n = 6
A = np.random.rand(n,n)
x = np.ones(n)
y = np.zeros(n)
for i in range(0,n):
for j in range(0,n):
y[i] += A[i,j]*x[j] # 2 flops
Each of the loops implies a summation of flops. The total flop count for this algorithm is [ \sum{i=1}^n \sum{j=1}^n 2 = \sum_{i=1}^n 2n = 2n^2. ] Since the matrix $A$ has $n^2$ elements, all of which have to be involved in the product, it seems unlikely that we could get a flop count that is smaller than $O(n^2)$.
Let's run an experiment with the built-in matrix-vector multiplication. We assume that flops dominate the computation time and thus measure elapsed time.
n = 400*np.arange(1,11)
print(" n t")
for i,n in enumerate(n):
A = np.random.randn(n,n)
x = np.random.randn(n)
start = timer()
for j in range(1,20): A@x
print(f"{n:5} {timer() - start:10.3e}")
n t 400 1.433e-03 800 3.526e-03 1200 3.633e-03 1600 4.955e-03 2000 7.433e-03 2400 2.957e-02 2800 8.322e-02 3200 1.259e-01 3600 1.539e-01 4000 1.826e-01
The reason for doing multiple repetitions at each value of $n$ above is to avoid having times so short that the resolution of the timer is a factor.
Let's repeat the experiment of the previous example for more, and larger, values of $n$.
N = np.arange(400,6200,200)
t = np.zeros(len(N))
for i,n in enumerate(N):
A = np.random.randn(n,n)
x = np.random.randn(n)
start = timer()
for j in range(1,20): A@x
t[i] = timer() - start
Plotting the time as a function of $n$ on log-log scales is equivalent to plotting the logs of the variables, but is formatted more neatly.
fig,ax = subplots()
ax.loglog(N,t,"-o",label="observed")
ylabel("elapsed time (sec)");
xlabel("$n$");
title("Timing of matrix-vector multiplications");
You can see that while the full story is complicated, the graph is trending to a straight line of positive slope. For comparison, we can plot a line that represents $O(n^2)$ growth exactly. (All such lines have slope equal to 2.)
ax.loglog(N,(N/N[-1])**2*t[-1],"--",label="$O(n^2)$")
ax.legend(); fig
We'll test the conclusion of $O(n^3)$ flops experimentally, using the built-in lu
function instead of the purely instructive lufact
.
N = np.arange(200,2600,200)
t = np.zeros(len(N))
for i,n in enumerate(N):
A = np.random.randn(n,n)
start = timer()
for j in range(1,5): lu(A)
t[i] = timer() - start
We plot the timings on a log-log graph and compare it to $O(n^3)$. The result could vary significantly from machine to machine.
loglog(N,t,"-o",label="obseved")
loglog(N,(N/N[-1])**3*t[-1],"--",label="$O(n^3)$")
legend();
xlabel("$n$");
ylabel("elapsed time (sec)");
title("Timing of LU factorizations");
Here is the previously solved system.
A = np.array([[2, 0, 4, 3],[-4, 5, -7, -10],[1, 15, 2, -4.5],[-2, 0, 2, -13]])
b = np.array([ 4, 9, 29, 40 ])
It has a perfectly good solution, obtainable through LU factorization.
L,U = FNC.lufact(A)
x = FNC.backsub( U, FNC.forwardsub(L,b) )
print(x)
[-3. 1. 4. -2.]
If we swap the second and fourth equations, nothing essential is changed, and the built-in method still finds the solution.
A[[1,3],:] = A[[3,1],:]
b[[1,3]] = b[[3,1]]
x = solve(A,b)
print(x)
[-3. 1. 4. -2.]
However, LU factorization fails.
L,U = FNC.lufact(A)
print(L)
[[ 1. 0. 0. 0. ] [-1. 1. 0. 0. ] [ 0.5 inf 1. 0. ] [-2. inf nan 1. ]]
/home/hermanr/FNCfiles/python/FNC02.py:45: RuntimeWarning: divide by zero encountered in double_scalars L[i,j] = U[i,j] / U[j,j] # row multiplier /home/hermanr/FNCfiles/python/FNC02.py:46: RuntimeWarning: invalid value encountered in multiply U[i,j:] = U[i,j:] - L[i,j]*U[j,j:] /home/hermanr/FNCfiles/python/FNC02.py:45: RuntimeWarning: invalid value encountered in double_scalars L[i,j] = U[i,j] / U[j,j] # row multiplier
Here is the system that "broke" LU factorization for us.
A = np.array([[2, 0, 4, 3],[-4, 5, -7, -10],[1, 15, 2, -4.5],[-2, 0, 2, -13]])
b = np.array([ 4, 9, 29, 40 ])
When we use the lu
function with three outputs, we get the elements of the PLU factorization.
P,L,U = lu(A)
print(L)
[[ 1. 0. 0. 0. ] [-0.25 1. 0. 0. ] [ 0.5 -0.15384615 1. 0. ] [-0.5 0.15384615 0.08333333 1. ]]
print(U)
[[ -4. 5. -7. -10. ] [ 0. 16.25 0.25 -7. ] [ 0. 0. 5.53846154 -9.07692308] [ 0. 0. 0. -0.16666667]]
print(P)
[[0. 0. 0. 1.] [1. 0. 0. 0.] [0. 1. 0. 0.] [0. 0. 1. 0.]]
However, while our notation defines $A=P^TLU$, Python is using $A=PLU$. In order to use the factors to solve the linear system, we have to use $P^T$, which equals $P^{-1}$ for a permutation matrix.
x = FNC.backsub( U, FNC.forwardsub(L,np.dot(P.T,b)) )
print(x)
[-3. 1. 4. -2.]
If you have to solve many different linear systems for the same matrix, you can perform the computationally expensive factorization just once, and repeat only the much faster triangular solves for the different right-hand sides.
The norm
command computes vector norms.
x = np.array([2,-3,1,-1])
print( norm(x) ) # 2-norm by default
3.872983346207417
print( norm(x,Inf) )
3.0
print( norm(x,1) )
7.0
A = np.array([[2,0],[1,-1]])
The default matrix norm is not the 2-norm. Instead you must provide the 2 explicitly.
print( norm(A,2) )
2.2882456112707374
You can get the 1-norm as well.
print( norm(A,1) )
3.0
The 1-norm is equivalent to
max(np.sum(abs(A),0) ) # sum down the rows (dimension=0)
3
Similarly, we can get the $\infty$-norm and check our formula for it.
print( norm(A,Inf) )
2.0
max(np.sum(abs(A),1) ) # sum across columns (dimension=1)
2
Here we illustrate the geometric interpretation of the 2-norm. First, we will sample a lot of vectors on the unit circle in $\mathbb{R}^2$.
theta = np.linspace(0,2*pi,601)
x = np.vstack([np.cos(theta),np.sin(theta)]) # 601 unit columns
We can apply A
to every column of x
simply by using a matrix multiplication.
y = A@x
We superimpose the image of the unit circle with the circle whose radius is $\|A\|_2$, and display the plots side by side.
subplot(1,2,1)
plot(x[0,:],x[1,:])
axis("equal")
title("Unit circle")
xlabel("$x_1$")
ylabel("$x_2$");
subplot(1,2,2)
plot(y[0,:],y[1,:])
plot(norm(A,2)*x[0,:],norm(A,2)*x[1,:],"--")
axis("equal")
title("Image under map")
xlabel("$y_1$")
ylabel("$y_2$");
The function cond
to computes matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $7\times 7$ case.
A = np.array([[ 1/(i+j+2) for j in range(7)] for i in range(7) ])
print(A)
[[0.5 0.33333333 0.25 0.2 0.16666667 0.14285714 0.125 ] [0.33333333 0.25 0.2 0.16666667 0.14285714 0.125 0.11111111] [0.25 0.2 0.16666667 0.14285714 0.125 0.11111111 0.1 ] [0.2 0.16666667 0.14285714 0.125 0.11111111 0.1 0.09090909] [0.16666667 0.14285714 0.125 0.11111111 0.1 0.09090909 0.08333333] [0.14285714 0.125 0.11111111 0.1 0.09090909 0.08333333 0.07692308] [0.125 0.11111111 0.1 0.09090909 0.08333333 0.07692308 0.07142857]]
kappa = cond(A)
print(kappa)
1697836321.7187743
Next we engineer a linear system problem to which we know the exact answer.
x_exact = 1.0 + np.arange(7)
b = A@x_exact
Now we perturb the data randomly with a vector of norm $10^{-12}$.
dA = np.random.randn(7,7)
dA = 1e-12*(dA/norm(dA,2))
db = np.random.randn(7)
db = 1e-12*(db/norm(db,2))
We solve the perturbed problem using built-in pivoted LU and see how the solution was changed.
x = solve(A+dA,b+db)
dx = x - x_exact
Here is the relative error in the solution.
print("relative error:",norm(dx) / norm(x_exact))
relative error: 2.631027165280246e-05
And here are upper bounds predicted using the condition number of the original matrix.
print("b_bound:",kappa*1e-12/norm(b))
print("A_bound:",kappa*1e-12/norm(A,2))
b_bound: 0.00017690558703123912 A_bound: 0.0014498431950500583
Even if we don't make any manual perturbations to the data, machine epsilon does when we solve the linear system numerically.
x = solve(A,b)
print("relative error:",norm(x-x_exact) / norm(x_exact))
print("rounding bound:",kappa/2**52)
relative error: 3.6591498138119105e-08 rounding bound: 3.769953952834136e-07
Because $\kappa\approx 10^8$, it's possible to lose 8 digits of accuracy in the process of passing from $A$ and $b$ to $x$. That's independent of the algorithm; it's inevitable once the data are expressed in double precision.
Larger Hilbert matrices are even more poorly conditioned.
A = np.array([[ 1/(i+j+2) for j in range(14)] for i in range(14) ])
kappa = cond(A)
print(kappa)
2.9943121605195904e+17
Before we compute the solution, note that $\kappa$ exceeds 1/eps
. In principle we therefore might end up with an answer that is completely wrong (i.e., a relative error greater than 100%).
print(kappa/2**52)
66.48708607047894
x_exact = 1.0 + np.arange(14)
b = A@x_exact
x = solve(A,b)
We got an answer. But in fact the error does exceed 100%.
print("relative error:",norm(x-x_exact) / norm(x_exact))
relative error: 3.3016840182370952
Here is a matrix with both lower and upper bandwidth equal to one. Such a matrix is called tridiagonal.
A = np.array([ [2, -1, 0, 0, 0, 0],
[4, 2, -1, 0, 0, 0],
[0, 3, 0, -1, 0, 0],
[0, 0, 2, 2, -1, 0],
[0, 0, 0, 1, 1, -1],
[0, 0, 0, 0, 0, 2 ]])
We can extract the elements on any diagonal using the diag
command. The "main" or central diagonal is numbered zero, above and to the right of that is positive, and below and to the left is negative.
print(np.diag(A) )
[2 2 0 2 1 2]
print(np.diag(A,1) )
[-1 -1 -1 -1 -1]
print(np.diag(A,-1) )
[4 3 2 1 0]
We can also construct matrices by specifying a diagonal with the diag
function.
A = A + np.diag([pi,8,6,7],2)
print(A)
[[ 2. -1. 3.14159265 0. 0. 0. ] [ 4. 2. -1. 8. 0. 0. ] [ 0. 3. 0. -1. 6. 0. ] [ 0. 0. 2. 2. -1. 7. ] [ 0. 0. 0. 1. 1. -1. ] [ 0. 0. 0. 0. 0. 2. ]]
L,U = FNC.lufact(A)
print(L)
[[1. 0. 0. 0. 0. 0. ] [2. 1. 0. 0. 0. 0. ] [0. 0.75 1. 0. 0. 0. ] [0. 0. 0.36614016 1. 0. 0. ] [0. 0. 0. 0.21915497 1. 0. ] [0. 0. 0. 0. 0. 1. ]]
print(U)
[[ 2. -1. 3.14159265 0. 0. 0. ] [ 0. 4. -7.28318531 8. 0. 0. ] [ 0. 0. 5.46238898 -7. 6. 0. ] [ 0. 0. 0. 4.56298115 -3.19684099 7. ] [ 0. 0. 0. 0. 1.70060359 -2.53408479] [ 0. 0. 0. 0. 0. 2. ]]
Observe above that the lower and upper bandwidths of $A$ are preserved in the $L$ and $U$ results.
We'll use a large banded matrix to observe the speedup possible in LU factorization. If we use an ordinary "dense" matrix, though, then there's no way to exploit a banded structure such as tridiagonality.
n = 8000
main = 1 + np.arange(n)
plusone = np.linspace(n-1,1,n-1)
minusone = np.ones(n-1)
A = np.diag(main) + np.diag(plusone,1) + np.diag(minusone,1)
start = timer()
lu(A)
print("time:",timer()-start)
time: 3.740448735654354
If instead we construct a proper "sparse" matrix, though, the speedup can be dramatic.
A = sparse.diags([main,plusone,minusone],[0,1,-1],format="csc")
start = timer()
splu(A)
print("time:",timer()-start)
time: 0.008521122857928276
We begin with a symmetric $A$.
A = np.array([
[2, 4, 4, 2],
[4, 5, 8, -5],
[4, 8, 6, 2],
[2, -5, 2, -26]
])
Carrying out our usual elimination in the first column leads us to
L1 = np.eye(4)
L1[1:,0] = [-2,-2,-1]
A1 = L1@A
print(A1)
[[ 2. 4. 4. 2.] [ 0. -3. 0. -9.] [ 0. 0. -2. -2.] [ 0. -9. -2. -28.]]
But now let's note that if we transpose this result, we have the same first column as before! So we could apply again and then transpose back.
A2 = (L1@A1.T).T
print(A2)
[[ 2. 0. 0. 0.] [ 0. -3. 0. -9.] [ 0. 0. -2. -2.] [ 0. -9. -2. -28.]]
Using transpose identities, this is just
A2 = A1@L1.T
print(A2)
[[ 2. 0. 0. 0.] [ 0. -3. 0. -9.] [ 0. 0. -2. -2.] [ 0. -9. -2. -28.]]
Now you can see how we proceed down and to the right, eliminating in a column and then symmetrically in the corresponding row.
L2 = np.eye(4)
L2[2:,1] = [0,-3]
A3 = L2@A2@L2.T
print(A3)
[[ 2. 0. 0. 0.] [ 0. -3. 0. 0.] [ 0. 0. -2. -2.] [ 0. 0. -2. -1.]]
Finally, we arrive at a diagonal matrix.
L3 = np.eye(4)
L3[3,2] = -1
D = L3@A3@L3.T
print(D)
[[ 2. 0. 0. 0.] [ 0. -3. 0. 0.] [ 0. 0. -2. 0.] [ 0. 0. 0. 1.]]
A randomly chosen matrix is extremely unlikely to be symmetric. However, there is a simple way to symmetrize one.
A = 1.0 + np.floor(9*np.random.rand(4,4))
B = A + A.T
print(B)
[[14. 9. 7. 10.] [ 9. 16. 11. 8.] [ 7. 11. 10. 10.] [10. 8. 10. 8.]]
Similarly, a random symmetric matrix is unlikely to be positive definite. The Cholesky algorithm always detects a non-PD matrix by quitting with an error.
cholesky(B)
--------------------------------------------------------------------------- LinAlgError Traceback (most recent call last) <ipython-input-132-7cafdfc45cb8> in <module> ----> 1 cholesky(B) <__array_function__ internals> in cholesky(*args, **kwargs) /dsc/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in cholesky(a) 761 t, result_t = _commonType(a) 762 signature = 'D->D' if isComplexType(t) else 'd->d' --> 763 r = gufunc(a, signature=signature, extobj=extobj) 764 return wrap(r.astype(result_t, copy=False)) 765 /dsc/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_nonposdef(err, flag) 89 90 def _raise_linalgerror_nonposdef(err, flag): ---> 91 raise LinAlgError("Matrix is not positive definite") 92 93 def _raise_linalgerror_eigenvalues_nonconvergence(err, flag): LinAlgError: Matrix is not positive definite
It's not hard to manufacture an SPD matrix to try out the Cholesky factorization.
B = A.T@A
R = cholesky(B)
print(R)
[[10.24695077 0. 0. 0. ] [10.24695077 7. 0. 0. ] [ 7.12407053 5.14285714 3.714652 0. ] [10.14936076 -0.57142857 -2.25215886 2.93116463]]
norm(R@R.T - B)
3.1776437161565096e-14