Lecture 1: Matrix-vector multiplication
This should primarily be a review of concepts from the first linear algebra class, though there are probably examples you haven’t seen before.
Practice interpreting the summation notation, like in equation 1.1 and in other places in the chapter. This book uses such summation notation a lot.
A linear combination of some vectors is a sum of scalar multiples of the vectors (i.e., \(c_1\vec v_1+c_2\vec v_2+\cdots+c_n\vec v_n\)). A nontrival linear combination is a linear combination with at least one nonzero scalar coefficient.
The span of some vectors is the set of all linear combinations of the vectors (this is a set of vectors). A set of vectors is linearly independent if none of the vectors can be written as a linear combination of the others. Equivalently, a set of vectors \(\{\vec v_1, \vec v_2, \ldots, \vec v_n\}\) is linearly independent if there is any nontrivial linear combination that equals the zero vector (i.e., there is some nonzero coefficients \(c_i\) so that \(c_1\vec v_1+c_2\vec v_2+\cdots +c_n\vec v_n=\vec 0\)).
A vector space is the span of some vectors. Equivalently, a vector space is a set of vectors that is closed under linear combinations (i.e., any linear combinations of any vectors in the set are always also in the set). A basis for a vector space is a linearly independent set of vectors that spans the space. The dimension of the space is the size of a basis (all bases of a space have the same size).Which of the following sets of vectors are vector spaces? If not, give a reason why it is not (i.e., take some vectors in the set, and give a linear combination of those vectors that is not in the set). For each vector space, give two different bases and the dimension of the vector space.
- \(\{ (1,2,3)\}\)
- \(\{c(-1,0,1)+(2,1,3)\mid c\in\mathbb{R}\}\)
- \(\{(x,y) \mid x\geq 0 \text{ and } y\geq 0\}\): all vectors in \(\mathbb{R}^2\) that have only positive components
- \(\{(x,y)\mid x=0 \text{ or } y=0\}\): all vectors in \(\mathbb{R}^2\) that are on either the \(x\) or \(y\) axes
- \(\{(x,y,z)\mid \sqrt{x^2+y^2+z^2}\leq 1\}\): all vectors in \(\mathbb{R}^3\) that have length less than or equal to 1
- \(\{c(1,2,3) \mid c\in\mathbb{R}\}\): all multiples (i.e., linear combinations) of the vector \((1,2,3)\). The endpoints of these vectors form a line through the origin.
- \(\{(0,0,0)\}\): just the zero vector (i.e., all linear combinations of the zero vector). This vector space has only one vector, as opposed to the others which have an infinite number of vectors.
- \(\{c_1(-1,0,1)+c_2(2,1,2) \mid c_1,c_2\in\mathbb{R}\}\): all linear combinations of \((-1,0,1)\) and \((2,1,2)\). The endpoints of these vectors form a plane through the origin.
- \(\{c_1(-1,0,1)+c_2(2,1,2)+c_3(-4,-1,0) \mid c_1,c_2,c_3\in\mathbb{R}\}\): all linear combinations of \((-1,0,1)\), \((2,1,2)\), and \((-4,-1,0)\). Hint: Since the third vector \((-4,-1,0)\) is actually a linear combination of \((-1,0,1)\) and \((2,1,2)\), \((-4,-1,0)=2(-1,0,1)-(2,1,2)\), any linear combination of the three vectors can actually be written as a linear combination of just the first two vectors. For example, \[\begin{align*} 3(-1,0,1)-2(2,1,2)+2(-4,-1,0)&=3(-1,0,1)-2(2,1,2)+2(2(-1,0,1)-(2,1,2))\\ &=7(-1,0,1)-4(2,1,2). \end{align*}\]
- The set of all polynomials with degree at most 3.
- The set of all polynomials with degree equal to 3.
- The set of all polynomials.
- The set of all 2 by 2 matrices.
- The set of all invertible 3 by 3 matrices.
- The set of all diagonal 3 by 3 matrices.
- The set of all 3 by 3 matrices that are upper triangular.
- The set of all 3 by 3 symmetric matrices.
- The set of all continuous functions.
- The set of all continuous functions such that \(f(0)=0\).
- The set of all functions with continuous first derivatives.
A linear transformation \(T\colon V\to W\) is a function from one vector space to another that:
- preserves vector addition: \(T(\vec x+\vec y) = T(\vec x)+T(\vec y)\)
- preserves scalar multiplication: \(T(c\vec x) = cT(\vec x)\).
You can check your work with Sage (and also see how Sage creates vectors and matrices) by playing with the example below. Go ahead and change the matrix to your matrix.
When you get to this point, ask me to help you draw a diagram connecting the column space and null space of a matrix \(A\) and the linear transformation \(T(\vec x)=A\vec x\).
Suppose \(T\colon \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation that does the following operations in order:
- rotates vectors by 90 degrees clockwise, then
- flips the vectors over the \(y\) axis, then
- stretches things horizontally by a factor of 3.
Answer the following questions:
- What is \(T((1,0))\)?
- What is \(T((0,1))\)?
- What is \(T(3(1,0)+2(0,1))\)?
- What is \(T((a,b))\)?
- What is the matrix \(A\) representing \(T\), so that \(T(\vec x)=A\vec x\)?
Check your work to the above exercise by modifying \(A\) and \(\vec v\) below and confirming your answers in the first parts of the problem.
Let \(\mathcal{B}=\{1,x,x^2\}\) be a basis for \(P_2\), the set of all polynomials with degree at most 2.
- Let \(p=2x-x^2\) be a vector in \(P_3\). What is the coordinate vector \([p]_\mathcal{B}\)?
- What is the polynomial represented by the coordinate vector \((2,3,-4)_\mathcal{B}\)?
One of the huge advantages of using coordinate vectors is that linear transformations on finite dimensional vector spaces can be computed by multiplying a matrix and a coordinate vector.
Do the following.
- Express the columns of \(AB\) as a linear combination of columns of \(A\)
- Express the rows of \(AB\) as a linear combination of rows of \(B\).
- Express the entries of \(AB\) as dot products of rows of \(A\) and columns of \(B\).
- What if \(B\) is a matrix of all ones? What does \(AB\) compute then? What does \(BA\) compute?
Check your answer to the last part of the question above about an all-ones matrix.
Let \[A=\begin{bmatrix}1&3\\2&5\end{bmatrix},\quad A^{-1}=\begin{bmatrix}-5&3\\2&-1\end{bmatrix}.\] Let \(\mathcal{B}=\{(1,2),\, (3,5)\}\) be a basis for \(\mathbb{R}^2\).
Use \(A\) to compute the vectors with coordinate vectors \((1,0)_\mathcal{B}\), \((0,1)_\mathcal{B}\), and \((2,3)_\mathcal{B}\).
Use \(A^{-1}\) to compute the coordinate vectors \([(-3,-4)]_\mathcal{B}\) and \([(a,b)]_\mathcal{B}\).
When you get to this point, ask me to help you modify the diagram we drew above in order to deal with linear transformations that correspond to invertible matrices.
Lecture 2: Orthogonal Vectors and Matrices
The set of complex numbers is denoted \(\mathbb{C}\). The conjugate of the complex number \(z=a+bi\) is \(\bar z = a-bi\).
Answer the following.
- Let \(A=\begin{bmatrix}4&4+i&2+i\\1-i&2-i&5\end{bmatrix}\). What is \(A^*\)? Is \(A\) hermitian?
- Let \(B=\begin{bmatrix}2&4+2i\\4-2i&5\end{bmatrix}\). What is \(B^*\)? Is \(B\) hermitian?
Answer the following.
- What must be true about diagonal entries of hermitian matrices?
- Prove that \(AA^*\) is always hermitian.
Answer the following.
- Prove that \((AB)^{-1}=B^{-1}A^{-1}\).
- Prove that \((A^*)^{-1}=(A^{-1})^*\).
Answer the following.
- What is the angle between \((1,2)\) and \((2,-1)\)? Are these vectors orthogonal? [Hint: use equation (2.3) for the first question.]
- Is the set \(\{(1/\sqrt{2}, i/\sqrt{2}), (1+i,-1)\}\) orthonormal? Is it linearly independent? Remember that now, scalars can be complex numbers.
Lecture 3: Norms
You can use Sage examples to explore unit balls and induced matrix norms.
Let \(\vec v = (1,\,2+3i,\, -4i)\). Compute \(\norm{\vec v}_1\), \(\norm{\vec v}_2\), \(\norm{\vec v}_4\), and \(\norm{\vec v}_\infty\). Check your work in Sage.
Norms give us a way to associate a number with a vector. The number may represent something other than physical distances.
A manufacturing robot has costs associated with moving the tip of its arm. Moving east or west costs $2/meter, moving north or south costs $3/meter, and moving up or down costs $5/meter. A vector \(\vec v=(a,b,c)\) represents moving \(a\) meters east, \(b\) meters north, and \(c\) meters up (each number can be negative to move the opposite direction).
Come up with a norm formula that gives the cost for moving along a vector \((a,b,c)\).
What is the norm of \((1,2,3)\) and \((2,-1,3)\)?
Text exercise 3.1.
The text claims in paragraph 3 of page 19 that condition (3) of equation (3.1) implies that the action of \(A\) is determined by its action on the unit vectors of the domain. In other words, to find the maximum stretch that \(A\) causes, you only have to look at the unit vectors of the domain. Explain why this is true.
Let \(A=[a_1\,a_2\,\cdots\,a_n]\) be an \(n\) by \(n\) matrix with columns \(a_1,a_2,\ldots,a_n\). Let \(x\in \mathbb{C}^n\) be a vector inside the diamond-shaped 1-norm unit ball in \(\mathbb{C}^n\) (i.e., \(\sum_{j=1}^n\abs{x_j}\leq 1\)). Explain why each of the following inequalities is true. [Hint: these inequalities are from the line between equations (3.8) and (3.9) in the text.]
\[\norm{Ax}_1 = \norms{\sum_{j=1}^n x_ja_j}_1 \leq \sum_{j=1}^n\abs{x_j}\norm{a_j}_1\leq \max_{1\leq j\leq n}\norm{a_j}_1\]
Explain why these inequalities that occur just before equation (3.14) are true:
\[\norm{ABx}_{(\ell)}\leq \norm{A}_{(\ell,m)} \norm{Bx}_{(m)}\leq \norm{A}_{(\ell,m)}\norm{B}_{(m,n)}\norm{x}_{(n)}.\]Do the following.
- Either find matrices \(A\) and \(B\) so that \(\norm{AB}_F<\norm{A}_F\norm{B}_F\) or show that such an example does not exist.
- Either find matrices \(A\) and \(B\) so that \(\norm{AB}_F=\norm{A}_F\norm{B}_F\) or show that such an example does not exist.
You can calculate matrix norms in Sage. You might use this to check some of your examples.
Lecture 4: The Singular Value Decomposition
You can use a Sage interact example to explore the SVD for invertible 2 by 2 matrices. You can also compute the SVD in Sage.
Lecture 5: More on the SVD
As mentioned in chapter 4, you can use a Sage interact example to explore the SVD for invertible 2 by 2 matrices. You can also compute the SVD in Sage.
Let \(A\) be a matrix and \(A_k=\sum_{j=1}^k \sigma_j u_j v_j^*\), as defined in equation (5.4). Explain why \(\norm{A-A_k}_2=\sigma_{k+1}\) if \(k\leq\rank(A)\). [Hint: see Theorem 5.3.]
Let \[A=\begin{bmatrix} 1 & 2 & 3\\4 & 5 & 6\\-1 & 2 & 1\end{bmatrix}\].
Write \(A\) as the sum of rank 1 matrices given in equation (5.3). Show explicitly the elements of the sum. You might use Sage to compute the SVD matrices.
Calculate the low-rank approximations \(A_1\) and \(A_2\) according to equation (5.4). Find the distance between \(A\) and each of \(A_1\) and \(A_2\), where distance is measured in the matrix 2-norm (i.e., find \(\norm{A-A_1}_2\) and \(\norm{A-A_2}_2\)). Compare these distances to the singular values of \(A\).
Explain why \(A_1\) and \(A_2\) are so special (i.e., tell us what Theorem 5.8 says is significant about \(A_1\) and \(A_2\) compared to \(A\)).
If you want a challenge, do text exercise 5.2 and think about the consequences. Remind me to discuss this.
Lecture 6: Projections
In class, we worked through much of the mathematics of this chapter already. Make sure you understand why and how the calculations work in the rest of that worksheet. Also, make sure you read the chapter.
Lecture 7: QR
We didn’t talk about the “When Vectors Become Continuous Functions” section in class. If you plan to do things with physics, signal processing, or differential equations, you should really read this section, though.
Lecture 8: Gram-Schmidt
We’ll have a review question asking about modified Gram-Schmidt later. Also, a question involving numerical stability was given in class.
Lecture 9: Matlab
We didn’t talk about this chapter in class. You should read through Experiments 2 and 3 in the chapter, though, to see the results. You can do some of the calculations in Octave. Plotting doesn’t quite work yet in the demo below, though.
x = (-128:128)'/128
A=[x.^0 x.^1 x.^2 x.^3];
[Q, R] = qr(A, 0);
scale = Q(257, :);
Q = Q*diag(1 ./scale);
Q
Lecture 10: Householder
Here is an example of using Sage to compute a QR decomposition. For matrices over RDF
, Sage uses Householder reflections to compute the decomposition (internally using Scipy and the LAPACK functions dgeqrf and dorgqr).
Let \[A=\begin{bmatrix}2&4&-1\\3&2&3\\3&-2&1\\5&2&0\end{bmatrix}.\]
Calculate a full QR decomposition for \(A\) in the following ways (I’ve used the book notation for the intermediate quantities). Make sure to show all of your work for each step. You can use a computer or calculator to do the arithmetic, and you can round off to 4 decimals in intermediate calculations.
- Modified Gram-Schmidt:
- Calculate \(q_1\) and the projector \(P_{\perp q_1}\)
- Calculate the new vectors \(v^{(2)}_2\) and \(v^{(2)}_3\)
- Calculate \(q_2\) and the projector matrix \(P_{\perp q_2}\)
- Calculate the new vector \(v^{(3)}_3\)
- Calculate \(q_3\)
- Calculate a 4 by 4 \(Q\) and a 4 by 3 \(R\) (show how you get them from your calculations above) and double-check that \(Q\) is unitary and that \(A=QR\).
- Householder reflections. Use the strategy mentioned in the book to calculate the appropriate direction for \(v\) in each case below to avoid numerical error.
- Calculate \(Q_1\) and the \(v\) corresponding to \(Q_1\), as well as \(Q_1A\).
- Calculate \(Q_2\) and the \(v\) corresponding to \(Q_2\).
- Calculate \(Q\) and \(R\) (show how you get them from your calculations above) and double-check that \(Q\) is unitary and that \(A=QR\).
Lecture 11: Least Squares
Use least squares to find the best-fitting line for the points \((2,3)\), \((-1,2)\), \((5,4)\), \((3,1)\), \((1,3)\). Use QR decomposition (Algorithm 11.2) to do the least squares. Your final answer should be a line of the form \(y=a_0+a_1x\) (find \(a_0\) and \(a_1\)). You can use Sage or other software to compute \(\hat Q\) and \(\hat R\) (in Sage, compute the full QR decomposition and then delete the necessary rows and columns to get a reduced QR decomposition).
Lecture 12: Conditioning and Condition numbers
Show that if \(A\) is square and nonsingular, then \[\frac{\norm{x}}{\norm{Ax}}\leq \norm{A^{-1}}.\]
Show that if \(A\in \mathbb{C}^{m\times m}\) is nonsingular and has smallest singular value \(\sigma_m\), then \(\norm{A^{-1}}_2=1/\sigma_m\).
Lecture 20-23: LU and Cholesky factorizations
Compute by hand \(P\), \(L\), and \(U\) using the \(LU\) factorization with partial pivoting so that \(PA=LU\) for \[A=\left(\begin{array}{rrrr} 12 & 12 & 24 & -12 \\ 6 & 12 & -21 & -48 \\ 3 & -5 & -34 & -35 \\ -4 & 20 & 4 & 28 \end{array}\right).\]
Show each step.
Text exercise 21.1
Compute by hand the Cholesky factorization of \[\begin{bmatrix} 1 & 5 & 3 & 2 \\ 5 & 34 & 18 & 16 \\ 3 & 18 & 14 & 6 \\ 2 & 16 & 6 & 25 \end{bmatrix}.\] Show each step (i.e., each \(R_i\)).
Text exercise 23.1