Lecture 1: Matrix-vector multiplication

This should primarily be a review of concepts from the first linear algebra class, though there are probably examples you haven’t seen before.

Practice interpreting the summation notation, like in equation 1.1 and in other places in the chapter. This book uses such summation notation a lot.

A linear combination of some vectors is a sum of scalar multiples of the vectors (i.e., $c_1\vec v_1+c_2\vec v_2+\cdots+c_n\vec v_n$). A nontrival linear combination is a linear combination with at least one nonzero scalar coefficient.

The span of some vectors is the set of all linear combinations of the vectors (this is a set of vectors). A set of vectors is linearly independent if none of the vectors can be written as a linear combination of the others. Equivalently, a set of vectors $\{\vec v_1, \vec v_2, \ldots, \vec v_n\}$ is linearly independent if there is any nontrivial linear combination that equals the zero vector (i.e., there is some nonzero coefficients $c_i$ so that $c_1\vec v_1+c_2\vec v_2+\cdots +c_n\vec v_n=\vec 0$).

A vector space is the span of some vectors. Equivalently, a vector space is a set of vectors that is closed under linear combinations (i.e., any linear combinations of any vectors in the set are always also in the set). A basis for a vector space is a linearly independent set of vectors that spans the space. The dimension of the space is the size of a basis (all bases of a space have the same size).

Which of the following sets of vectors are vector spaces? If not, give a reason why it is not (i.e., take some vectors in the set, and give a linear combination of those vectors that is not in the set). For each vector space, give two different bases and the dimension of the vector space.

$\{ (1,2,3)\}$
$\{c(-1,0,1)+(2,1,3)\mid c\in\mathbb{R}\}$
$\{(x,y) \mid x\geq 0 \text{ and } y\geq 0\}$: all vectors in $\mathbb{R}^2$ that have only positive components
$\{(x,y)\mid x=0 \text{ or } y=0\}$: all vectors in $\mathbb{R}^2$ that are on either the $x$ or $y$ axes
$\{(x,y,z)\mid \sqrt{x^2+y^2+z^2}\leq 1\}$: all vectors in $\mathbb{R}^3$ that have length less than or equal to 1
$\{c(1,2,3) \mid c\in\mathbb{R}\}$: all multiples (i.e., linear combinations) of the vector $(1,2,3)$. The endpoints of these vectors form a line through the origin.
$\{(0,0,0)\}$: just the zero vector (i.e., all linear combinations of the zero vector). This vector space has only one vector, as opposed to the others which have an infinite number of vectors.
$\{c_1(-1,0,1)+c_2(2,1,2) \mid c_1,c_2\in\mathbb{R}\}$: all linear combinations of $(-1,0,1)$ and $(2,1,2)$. The endpoints of these vectors form a plane through the origin.
$\{c_1(-1,0,1)+c_2(2,1,2)+c_3(-4,-1,0) \mid c_1,c_2,c_3\in\mathbb{R}\}$: all linear combinations of $(-1,0,1)$, $(2,1,2)$, and $(-4,-1,0)$. Hint: Since the third vector $(-4,-1,0)$ is actually a linear combination of $(-1,0,1)$ and $(2,1,2)$, $(-4,-1,0)=2(-1,0,1)-(2,1,2)$, any linear combination of the three vectors can actually be written as a linear combination of just the first two vectors. For example, \[\begin{align*} 3(-1,0,1)-2(2,1,2)+2(-4,-1,0)&=3(-1,0,1)-2(2,1,2)+2(2(-1,0,1)-(2,1,2))\\ &=7(-1,0,1)-4(2,1,2). \end{align*}\]
The set of all polynomials with degree at most 3.
The set of all polynomials with degree equal to 3.
The set of all polynomials.
The set of all 2 by 2 matrices.
The set of all invertible 3 by 3 matrices.
The set of all diagonal 3 by 3 matrices.
The set of all 3 by 3 matrices that are upper triangular.
The set of all 3 by 3 symmetric matrices.
The set of all continuous functions.
The set of all continuous functions such that $f(0)=0$.
The set of all functions with continuous first derivatives.

A linear transformation $T\colon V\to W$ is a function from one vector space to another that:

preserves vector addition: $T(\vec x+\vec y) = T(\vec x)+T(\vec y)$
preserves scalar multiplication: $T(c\vec x) = cT(\vec x)$.

Let $A$ be a matrix. Prove that $T(\vec x)=A\vec x$ is a linear transformation. [Hint: show that it satisfies the two criteria for a linear transformation at the bottom of page 3 of the text.]

Find the matrix $A$ for the linear transformation $T((a,b)) = (2a-3b, 5a)$ so that $T(\vec x)=A\vec x$.

You can check your work with Sage (and also see how Sage creates vectors and matrices) by playing with the example below. Go ahead and change the matrix to your matrix.

var('a,b') v=vector([a,b]) A=matrix(QQ,[[1,2],[3,4]]) print A print A*v

When you get to this point, ask me to help you draw a diagram connecting the column space and null space of a matrix $A$ and the linear transformation $T(\vec x)=A\vec x$.

Show that linear transformations preserve linear combinations. In other words, show that if $T$ is a linear transformation, then \[T(c_1\vec v_1+c_2\vec v_2+\cdots+c_n\vec v_n) = c_1T(\vec v_1)+c_2 T(\vec v_2)+\cdots+c_n T(\vec v_n).\]

Suppose $T\colon \mathbb{R}^2\to \mathbb{R}^2$ is a linear transformation. Suppose also that $\vec u,\vec v\in\mathbb{R}^2$. Suppose also that $T(\vec u)=(1,2)$ and $T(\vec v)=(-3,4)$. What is $T(3\vec u+2\vec v)$?

Suppose $T\colon \mathbb{R}^2\to \mathbb{R}^2$ is a linear transformation that does the following operations in order:

rotates vectors by 90 degrees clockwise, then
flips the vectors over the $y$ axis, then
stretches things horizontally by a factor of 3.

Answer the following questions:

What is $T((1,0))$?
What is $T((0,1))$?
What is $T(3(1,0)+2(0,1))$?
What is $T((a,b))$?
What is the matrix $A$ representing $T$, so that $T(\vec x)=A\vec x$?

Check your work to the above exercise by modifying $A$ and $\vec v$ below and confirming your answers in the first parts of the problem.

v=vector([1,2]) A=matrix(QQ,[[1,2],[3,4]]) print A*v

A coordinate vector $[\vec v]_\mathcal{B}$ relative to an ordered basis $\mathcal{B}$ is a vector of coefficients of the linear combination of basis elements equaling $\vec v$. In other words, if $\mathcal{B}=\{b_1,b_2,\ldots,b_n\}$ is a basis, then the coordinate vector for a vector $\vec v=c_1\vec b_1+c_2\vec b_2+\cdots+c_n\vec b_n$ is $[\vec v]_\mathcal{B}=(c_1,c_2,\ldots,c_n)_\mathcal{B}$.

Let $\mathcal{B}=\{1,x,x^2\}$ be a basis for $P_2$, the set of all polynomials with degree at most 2.

Let $p=2x-x^2$ be a vector in $P_3$. What is the coordinate vector $[p]_\mathcal{B}$?
What is the polynomial represented by the coordinate vector $(2,3,-4)_\mathcal{B}$?

One of the huge advantages of using coordinate vectors is that linear transformations on finite dimensional vector spaces can be computed by multiplying a matrix and a coordinate vector.

$P_3$ is the vector space of polynomials with degree at most 3. Let $T\colon P_3\to\mathbb{R}^3$ be the linear transformation $T(p)=(p(x=-1), p(x=0), p(x=1))$ (i.e., evaluate $p$ at $x=-1$, $x=0$, and $x=1$). For example, $T(3+x-2x^2+5x^3)=(3-1-2-5, 3, 3+1-2+5)=(-5,3,7)$. Let $\mathcal{B}=\{1,x,x^2,x^3\}$ be a basis for $P_3$. Find a matrix representing $T$ relative to $\mathcal{B}$ and the standard basis on $\mathbb{R}^3$.

v=vector([3,1,-2,5]) #fill in A A=matrix(QQ, [[],[],[]]) A*v # should be (-5,3,7), according to our example above.

Write $A\begin{pmatrix} 1\\2\\3\end{pmatrix}$ as a linear combination of the columns of $A$.

Do the following.

Express the columns of $AB$ as a linear combination of columns of $A$
Express the rows of $AB$ as a linear combination of rows of $B$.
Express the entries of $AB$ as dot products of rows of $A$ and columns of $B$.
What if $B$ is a matrix of all ones? What does $AB$ compute then? What does $BA$ compute?

Check your answer to the last part of the question above about an all-ones matrix.

A=random_matrix(QQ, 3); B=ones_matrix(QQ,3) print A print print A*B

Give short reasons why the text’s Theorem 1.3 parts (a), (b), (c), and (d) are equivalent.

Let \[A=\begin{bmatrix}1&3\\2&5\end{bmatrix},\quad A^{-1}=\begin{bmatrix}-5&3\\2&-1\end{bmatrix}.\] Let $\mathcal{B}=\{(1,2),\, (3,5)\}$ be a basis for $\mathbb{R}^2$.

Use $A$ to compute the vectors with coordinate vectors $(1,0)_\mathcal{B}$, $(0,1)_\mathcal{B}$, and $(2,3)_\mathcal{B}$.
Use $A^{-1}$ to compute the coordinate vectors $[(-3,-4)]_\mathcal{B}$ and $[(a,b)]_\mathcal{B}$.

When you get to this point, ask me to help you modify the diagram we drew above in order to deal with linear transformations that correspond to invertible matrices.

Text exercise 1.1

Text exercise 1.3

Text exercise 1.4

Lecture 2: Orthogonal Vectors and Matrices

The set of complex numbers is denoted $\mathbb{C}$. The conjugate of the complex number $z=a+bi$ is $\bar z = a-bi$.

Show that if $z\bar z = z^2$, then $z$ must be a real number (i.e., the imaginary part of $z$ is 0).

Answer the following.

Let $A=\begin{bmatrix}4&4+i&2+i\\1-i&2-i&5\end{bmatrix}$. What is $A^*$? Is $A$ hermitian?
Let $B=\begin{bmatrix}2&4+2i\\4-2i&5\end{bmatrix}$. What is $B^*$? Is $B$ hermitian?

Answer the following.

What must be true about diagonal entries of hermitian matrices?
Prove that $AA^*$ is always hermitian.

# Use CDF for complex matrices A=matrix(CDF, [[4,4+I,2+I],[1-I,2-I,5]]) print A.H # this is A^* print A.is_hermitian() print A.H==A

Give an example showing that the inner product on $\mathbb{C}^2$ is bilinear. [Hint: compute explicitly each side of the 3 defining equations defining bilinearity.]

Give an example showing that the (2,1) element of $(AB)^*$ is the same as the (2,1) element of $B^*A^*$.

Answer the following.

Prove that $(AB)^{-1}=B^{-1}A^{-1}$.
Prove that $(A^*)^{-1}=(A^{-1})^*$.

Answer the following.

What is the angle between $(1,2)$ and $(2,-1)$? Are these vectors orthogonal? [Hint: use equation (2.3) for the first question.]
Is the set $\{(1/\sqrt{2}, i/\sqrt{2}), (1+i,-1)\}$ orthonormal? Is it linearly independent? Remember that now, scalars can be complex numbers.

Give the reasons why the second equality in equation (2.7) in the text is true.

Use the Sage cell below, or the Sage notebook, to compute the two decompositions of $v$ given in equation (2.7).

A=matrix(RDF, [[1,4,3],[2,3,6],[-1,-4,3]]) Q,R=A.QR() # check that Q is unitary, or in other words, that the columns of Q are # an orthonormal basis. (q1,q2,q3) = Q.columns() # nicer columns q1=vector(QQ,[2,-2,1])/3 q2=vector(QQ,[2,1,-2])/3 q1=vector(QQ,[1,2,2])/3

Text exercise 2.1

Text exercise 2.2

Text exercise 2.3

Text exercise 2.4. Additionally, what can be said about the determinant of a unitary matrix?

Prove that if $\lambda$ is an eigenvalue of $A$ with eigenvector $\vec x$, then $\lambda^2$ is an eigenvalue of $A^2$ with the same eigenvector $\vec x$. [Hint: in mathematical notation, if $A\vec x=\lambda x$ for some nonzero vector $\vec x$, then show that $A^2\vec x = \lambda^2 x$].

Text exercise 2.5. [Hint: For any matrix $A$ , $(I+A)(I-A)=(I-A)(I+A)$. Prove this hint if you use it.]

Text exercise 2.6

Text exercise 2.7

Lecture 3: Norms

You can use Sage examples to explore unit balls and induced matrix norms.

Let $\vec v = (1,\,2+3i,\, -4i)$. Compute $\norm{\vec v}_1$, $\norm{\vec v}_2$, $\norm{\vec v}_4$, and $\norm{\vec v}_\infty$. Check your work in Sage.

# An example of how to compute norms of a vector in Sage. v=vector(CDF, [1,-2*I]) print v.norm(p=3) print v.norm(p=Infinity)

Norms give us a way to associate a number with a vector. The number may represent something other than physical distances.

A manufacturing robot has costs associated with moving the tip of its arm. Moving east or west costs $2/meter, moving north or south costs $3/meter, and moving up or down costs $5/meter. A vector $\vec v=(a,b,c)$ represents moving $a$ meters east, $b$ meters north, and $c$ meters up (each number can be negative to move the opposite direction).

Come up with a norm formula that gives the cost for moving along a vector $(a,b,c)$.
What is the norm of $(1,2,3)$ and $(2,-1,3)$?

Text exercise 3.1.

The text claims in paragraph 3 of page 19 that condition (3) of equation (3.1) implies that the action of $A$ is determined by its action on the unit vectors of the domain. In other words, to find the maximum stretch that $A$ causes, you only have to look at the unit vectors of the domain. Explain why this is true.

Let $A=[a_1\,a_2\,\cdots\,a_n]$ be an $n$ by $n$ matrix with columns $a_1,a_2,\ldots,a_n$. Let $x\in \mathbb{C}^n$ be a vector inside the diamond-shaped 1-norm unit ball in $\mathbb{C}^n$ (i.e., $\sum_{j=1}^n\abs{x_j}\leq 1$). Explain why each of the following inequalities is true. [Hint: these inequalities are from the line between equations (3.8) and (3.9) in the text.]

\[\norm{Ax}_1 = \norms{\sum_{j=1}^n x_ja_j}_1 \leq \sum_{j=1}^n\abs{x_j}\norm{a_j}_1\leq \max_{1\leq j\leq n}\norm{a_j}_1\]

Prove equation (3.10) in the text is true.

Explain why (3.12) is true, given that equation (2.3) is true.

Explain why each equality or inequality in equation (3.13) is true.

Explain why these inequalities that occur just before equation (3.14) are true:

\[\norm{ABx}_{(\ell)}\leq \norm{A}_{(\ell,m)} \norm{Bx}_{(m)}\leq \norm{A}_{(\ell,m)}\norm{B}_{(m,n)}\norm{x}_{(n)}.\]

Give an example showing that the inequality in equation (3.14) is not tight. In other words, give an induced matrix norm and explicit matrices $A$ and $B$ so that $\norm{AB}_{(\ell,n)}\neq \norm{A}_{(\ell,m)}\norm{B}_{(m,n)}$.

Prove equation (3.18).

Do the following.

Either find matrices $A$ and $B$ so that $\norm{AB}_F<\norm{A}_F\norm{B}_F$ or show that such an example does not exist.
Either find matrices $A$ and $B$ so that $\norm{AB}_F=\norm{A}_F\norm{B}_F$ or show that such an example does not exist.

Show that if $Q$ is a unitary matrix, then $\norm{QA}_F=\norm{A}_F$. [Hint: see Theorem 3.1.]

Text exercise 3.2

Text exercise 3.3

You can calculate matrix norms in Sage. You might use this to check some of your examples.

# An example of how to compute norms of a matrix in Sage. A=matrix(CDF, [[1,-2*I], [-I, 3]]) print A.norm(p=1) print A.norm(p=2) print A.norm(p=Infinity) print A.norm(p='frob')

Lecture 4: The Singular Value Decomposition

You can use a Sage interact example to explore the SVD for invertible 2 by 2 matrices. You can also compute the SVD in Sage.

# An example of how to compute the SVD of a matrix A=matrix(CDF, [[1,-2*I], [-I, 3]]) U,S,V = A.SVD() print U print print S print print V print print A-U*S*V.H # should be very close to the zero matrix

Using the definition of the SVD given in equation (4.4), show that the largest singular value $\sigma_1$ of a matrix $A$ is equal to $\norm{A}_2$.

Text exercise 4.1 (by hand—so show your work and reasoning). [Hint: You might first determine a vectors that are stretched the most and second-most by the matrix. Then figure out the matrices that will transform those vectors to $(1,0)$ and $(0,1)$, then stretch appropriately, then rotate them to the final positions. Remember that singular values must be nonnegative. Another way to approach the problem is to use equation (4.1)—figure out which vectors are scaled the most and second-most, and where they go. Check your answer by computing $U\Sigma V^*$.]

Show that the product of two unitary matrices $U$ and $V$ is also unitary.

Text exercise 4.2. [Hint: Find a way to express the relationship between $A$ and $B$ as matrix multiplication and other matrix operations.]

Text exercise 4.4

Lecture 5: More on the SVD

As mentioned in chapter 4, you can use a Sage interact example to explore the SVD for invertible 2 by 2 matrices. You can also compute the SVD in Sage.

Show that if $B$ is an invertible matrix, then $\rank(AB)=\rank(A)$ and $\rank(BC)=\rank(C)$ for any matrices $A$ and $C$ that are the right size for multiplication to be defined. This is used in the proof of Theorem 5.1 (and a few other places as well).

Let $A$ be a matrix and $A_k=\sum_{j=1}^k \sigma_j u_j v_j^*$, as defined in equation (5.4). Explain why $\norm{A-A_k}_2=\sigma_{k+1}$ if $k\leq\rank(A)$. [Hint: see Theorem 5.3.]

Let \[A=\begin{bmatrix} 1 & 2 & 3\\4 & 5 & 6\\-1 & 2 & 1\end{bmatrix}\].

Write $A$ as the sum of rank 1 matrices given in equation (5.3). Show explicitly the elements of the sum. You might use Sage to compute the SVD matrices.
Calculate the low-rank approximations $A_1$ and $A_2$ according to equation (5.4). Find the distance between $A$ and each of $A_1$ and $A_2$, where distance is measured in the matrix 2-norm (i.e., find $\norm{A-A_1}_2$ and $\norm{A-A_2}_2$). Compare these distances to the singular values of $A$.
Explain why $A_1$ and $A_2$ are so special (i.e., tell us what Theorem 5.8 says is significant about $A_1$ and $A_2$ compared to $A$).

Text exercise 5.1

Text exercise 5.3

If you want a challenge, do text exercise 5.2 and think about the consequences. Remind me to discuss this.

Lecture 6: Projections

In class, we worked through much of the mathematics of this chapter already. Make sure you understand why and how the calculations work in the rest of that worksheet. Also, make sure you read the chapter.

Text exercise 6.1

Text exercise 6.3

Text exercise 6.4

Text exercise 6.5

Lecture 7: QR

We didn’t talk about the “When Vectors Become Continuous Functions” section in class. If you plan to do things with physics, signal processing, or differential equations, you should really read this section, though.

Text exercise 7.1

Text exercise 7.5

Lecture 8: Gram-Schmidt

We’ll have a review question asking about modified Gram-Schmidt later. Also, a question involving numerical stability was given in class.

Lecture 9: Matlab

We didn’t talk about this chapter in class. You should read through Experiments 2 and 3 in the chapter, though, to see the results. You can do some of the calculations in Octave. Plotting doesn’t quite work yet in the demo below, though.

x = (-128:128)'/128
A=[x.^0 x.^1 x.^2 x.^3];
[Q, R] = qr(A, 0);
scale = Q(257, :);
Q = Q*diag(1 ./scale);
Q

Lecture 10: Householder

Here is an example of using Sage to compute a QR decomposition. For matrices over RDF, Sage uses Householder reflections to compute the decomposition (internally using Scipy and the LAPACK functions dgeqrf and dorgqr).

A=matrix(RDF,3,3,[3,2,-1, 2,5,4, 3,1,0]) Q,R=A.QR() html.table([ ["$A$", A], ["$Q$", Q], ["$R$", R], ["$QQ^*$", Q*Q.H], # should be close to the identity since Q is unitary ["$A-QR$", A-Q*R], # should be close to zero ])

Text exercise 10.1

Let \[A=\begin{bmatrix}2&4&-1\\3&2&3\\3&-2&1\\5&2&0\end{bmatrix}.\]

Calculate a full QR decomposition for $A$ in the following ways (I’ve used the book notation for the intermediate quantities). Make sure to show all of your work for each step. You can use a computer or calculator to do the arithmetic, and you can round off to 4 decimals in intermediate calculations.

Modified Gram-Schmidt:
1. Calculate $q_1$ and the projector $P_{\perp q_1}$
2. Calculate the new vectors $v^{(2)}_2$ and $v^{(2)}_3$
3. Calculate $q_2$ and the projector matrix $P_{\perp q_2}$
4. Calculate the new vector $v^{(3)}_3$
5. Calculate $q_3$
6. Calculate a 4 by 4 $Q$ and a 4 by 3 $R$ (show how you get them from your calculations above) and double-check that $Q$ is unitary and that $A=QR$.
Householder reflections. Use the strategy mentioned in the book to calculate the appropriate direction for $v$ in each case below to avoid numerical error.
1. Calculate $Q_1$ and the $v$ corresponding to $Q_1$, as well as $Q_1A$.
2. Calculate $Q_2$ and the $v$ corresponding to $Q_2$.
3. Calculate $Q$ and $R$ (show how you get them from your calculations above) and double-check that $Q$ is unitary and that $A=QR$.

Lecture 11: Least Squares

Use least squares to find the best-fitting line for the points $(2,3)$, $(-1,2)$, $(5,4)$, $(3,1)$, $(1,3)$. Use QR decomposition (Algorithm 11.2) to do the least squares. Your final answer should be a line of the form $y=a_0+a_1x$ (find $a_0$ and $a_1$). You can use Sage or other software to compute $\hat Q$ and $\hat R$ (in Sage, compute the full QR decomposition and then delete the necessary rows and columns to get a reduced QR decomposition).

Lecture 12: Conditioning and Condition numbers

Show that if $A$ is square and nonsingular, then \[\frac{\norm{x}}{\norm{Ax}}\leq \norm{A^{-1}}.\]

Show that if $A\in \mathbb{C}^{m\times m}$ is nonsingular and has smallest singular value $\sigma_m$, then $\norm{A^{-1}}_2=1/\sigma_m$.

Lecture 20-23: LU and Cholesky factorizations

Compute by hand $P$, $L$, and $U$ using the $LU$ factorization with partial pivoting so that $PA=LU$ for \[A=\left(\begin{array}{rrrr} 12 & 12 & 24 & -12 \\ 6 & 12 & -21 & -48 \\ 3 & -5 & -34 & -35 \\ -4 & 20 & 4 & 28 \end{array}\right).\]

Show each step.

Text exercise 21.1

Compute by hand the Cholesky factorization of \[\begin{bmatrix} 1 & 5 & 3 & 2 \\ 5 & 34 & 18 & 16 \\ 3 & 18 & 14 & 6 \\ 2 & 16 & 6 & 25 \end{bmatrix}.\] Show each step (i.e., each $R_i$).

Text exercise 23.1

Applied Linear Algebra Notes

Jason Grout

Fall 2012

Preface

Class Plans

27 Aug 2012

29 Aug 2012

05 Sep 2012

10 Sep 2012

Prepare

Class

12 Sep 2012

Prepare

Class

Lecture 1: Matrix-vector multiplication

Lecture 2: Orthogonal Vectors and Matrices

Lecture 3: Norms

Lecture 4: The Singular Value Decomposition

Lecture 5: More on the SVD

Lecture 6: Projections

Lecture 7: QR

Lecture 8: Gram-Schmidt

Lecture 9: Matlab

Lecture 10: Householder

Lecture 11: Least Squares

Lecture 12: Conditioning and Condition numbers

Lecture 20-23: LU and Cholesky factorizations