Danny Siu's Personal Blog

SVD and basis

April 19, 2020 | 3 min read

Suppose a matrix

$$A_{m \times n} = \sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T$$

where $\sigma_i \ne 0$, $u_i \in \mathbb{R}^m$, $v_i \in \mathbb{R}^n$ for all $i = 1, \dots, r$, the vectors $u_1, \dots, u_r$ are orthogonal, and $v_1, \dots, v_r$ are orthogonal. Then the rank of $A$ is $r$, $\{u_1, \dots, u_r\}$ is a basis for $C(A)$, and $\{v_1, \dots, v_r\}$ is a basis for $C(A^T)$.
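Before the proof, here is a small numerical sketch of the claim using NumPy (the variable names and dimensions below are illustrative, not from the post): build orthogonal $u_i$ and $v_i$, form $A$ as the sum of rank-one terms, and check that the rank is $r$ and that the columns of $A$ lie in the span of the $u_i$.

```python
import numpy as np

# Sketch of the claim: A = sum_i sigma_i * u_i v_i^T with orthogonal u_i, v_i
# and nonzero sigma_i has rank r, and span{u_i} contains every column of A.
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
U = np.linalg.qr(rng.standard_normal((m, r)))[0]  # columns are orthonormal u_i
V = np.linalg.qr(rng.standard_normal((n, r)))[0]  # columns are orthonormal v_i
sigma = np.array([3.0, 1.5])                      # all sigma_i != 0

A = sum(sigma[i] * np.outer(U[:, i], V[:, i]) for i in range(r))

print(np.linalg.matrix_rank(A))  # 2, i.e. r

# Every column of A lies in span{u_1, ..., u_r}:
# projecting A onto that span changes nothing.
P = U @ U.T                      # orthogonal projector onto span{u_i}
print(np.allclose(P @ A, A))     # True
```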

Proof:

All columns of $A$ are linear combinations of $u_1, \dots, u_r$. For example, take the first column of matrix $A$:

$$A \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix} = \sigma_1 u_1 v_1^T \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix} + \dots + \sigma_r u_r v_r^T \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix}$$

which is a linear combination of $u_1, \dots, u_r$.

So the column space of $A$, $C(A)$, is spanned by $u_1, \dots, u_r$. But this alone does not prove that they form a basis: perhaps some $k \leq r-1$ of $u_1, \dots, u_r$ already suffice to represent the columns of $A$.

Suppose that were the case. Without loss of generality, assume $u_1, \dots, u_{r-1}$ suffice to represent the columns of $A$ (i.e., $u_r$ is not needed). Then we can write $A$ as

$$A = \begin{bmatrix} u_1 & \dots & u_{r-1} \end{bmatrix} B = \begin{bmatrix} u_1 & \dots & u_{r-1} \end{bmatrix} \begin{bmatrix} b_1^T \\ \vdots \\ b_{r-1}^T \end{bmatrix}$$

for some matrix $B$ (here $B$ is written in terms of its rows $b_1^T, \dots, b_{r-1}^T$).

$$\therefore A = u_1 b_1^T + \dots + u_{r-1} b_{r-1}^T = \sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T$$

Multiplying both sides by $u_r^T$ from the left and using the orthogonality of $u_1, \dots, u_r$,

$$\begin{aligned} u_r^T A = u_r^T(u_1 b_1^T + \dots + u_{r-1} b_{r-1}^T) &= u_r^T(\sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T) \\ 0 &= \sigma_r \|u_r\|^2 v_r^T \ne 0 \end{aligned}$$

(every cross term $u_r^T u_i$ with $i \ne r$ vanishes, and $u_r^T u_r = \|u_r\|^2 > 0$)

which is a contradiction. Hence exactly $r$ vectors $u_1, \dots, u_r$ (not one fewer) are needed to represent the columns of $A$ as linear combinations.
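The key cancellation above can be checked numerically (again with illustrative names; the column scaling makes the $u_i$ orthogonal but not unit length, so the $\|u_r\|^2$ factor matters):

```python
import numpy as np

# Check: for orthogonal u_i, left-multiplying A by u_r^T kills every term
# except the r-th, leaving sigma_r * ||u_r||^2 * v_r^T (which is nonzero).
rng = np.random.default_rng(1)
m, n, r = 5, 4, 3
Q = np.linalg.qr(rng.standard_normal((m, r)))[0]
U = Q * np.array([1.0, 2.0, 0.5])  # orthogonal columns, NOT unit length
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
sigma = np.array([2.0, -1.0, 0.7])
A = sum(sigma[i] * np.outer(U[:, i], V[:, i]) for i in range(r))

ur = U[:, r - 1]
lhs = ur @ A                                      # u_r^T A
rhs = sigma[r - 1] * (ur @ ur) * V[:, r - 1]      # sigma_r ||u_r||^2 v_r^T
print(np.allclose(lhs, rhs))  # True
```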

Also, $u_1, \dots, u_r \in C(A)$: by the orthogonality of $v_1, \dots, v_r$ we have $A v_i = \sigma_i \|v_i\|^2 u_i$, so each $u_i$ is a multiple of $A v_i$. So we have orthogonal vectors $u_1, \dots, u_r$ in $C(A)$, and exactly all $r$ of them (not one fewer) are needed to represent the columns of $A$. Hence $\dim(C(A)) = r$ and $\{u_1, \dots, u_r\}$ is a basis for $C(A)$. Applying the same argument to $A^T = \sigma_1 v_1 u_1^T + \dots + \sigma_r v_r u_r^T$ shows that $\{v_1, \dots, v_r\}$ is a basis for $C(A^T)$.
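As a final cross-check, an actual SVD (via `np.linalg.svd`) gives orthonormal singular vectors, and the first $r$ left and right singular vectors span $C(A)$ and $C(A^T)$ respectively:

```python
import numpy as np

# Cross-check with a computed SVD: the first r left/right singular vectors
# span the column space C(A) and the row space C(A^T).
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3 (generically)
Usvd, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))   # numerical rank

Ur = Usvd[:, :r]             # orthonormal basis candidates for C(A)
Vr = Vt[:r].T                # orthonormal basis candidates for C(A^T)

# Projecting onto span{u_1..u_r} (columns) or span{v_1..v_r} (rows) preserves A.
print(np.allclose(Ur @ Ur.T @ A, A))  # True
print(np.allclose(A @ Vr @ Vr.T, A))  # True
```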


Written by Danny Siu who lives in Hong Kong. You should follow him on Twitter