Danny Siu's Personal Blog

SVD and basis

April 19, 2020 | 3 min read

Suppose a matrix

$$A_{m \times n} = \sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T$$

where $\sigma_i \ne 0$, $u_i \in \mathbb{R}^m$, $v_i \in \mathbb{R}^n$ for all $i = 1, \dots, r$, the vectors $u_1, \dots, u_r$ are orthogonal, and $v_1, \dots, v_r$ are orthogonal. Then the rank of $A$ is $r$, $\{u_1, \dots, u_r\}$ is a basis for $C(A)$, and $\{v_1, \dots, v_r\}$ is a basis for $C(A^T)$.
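Before the proof, here is a small numerical sketch of the claim using NumPy (the variable names and dimensions below are illustrative, not from the post): build orthogonal $u_i$ and $v_i$, form $A$ as the sum of rank-one terms, and check that the rank is $r$ and that the columns of $A$ lie in the span of the $u_i$.

```python
import numpy as np

# Sketch of the claim: A = sum_i sigma_i * u_i v_i^T with orthogonal u_i, v_i
# and nonzero sigma_i has rank r, and span{u_i} contains every column of A.
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
U = np.linalg.qr(rng.standard_normal((m, r)))[0]  # columns are orthonormal u_i
V = np.linalg.qr(rng.standard_normal((n, r)))[0]  # columns are orthonormal v_i
sigma = np.array([3.0, 1.5])                      # all sigma_i != 0

A = sum(sigma[i] * np.outer(U[:, i], V[:, i]) for i in range(r))

print(np.linalg.matrix_rank(A))  # 2, i.e. r

# Every column of A lies in span{u_1, ..., u_r}:
# projecting A onto that span changes nothing.
P = U @ U.T                      # orthogonal projector onto span{u_i}
print(np.allclose(P @ A, A))     # True
```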

Proof:

All columns of $A$ are linear combinations of $u_1, \dots, u_r$. For example, take the first column of matrix $A$:

$$A \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix} = \sigma_1 u_1 v_1^T \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix} + \dots + \sigma_r u_r v_r^T \begin{bmatrix} 1 \\ \vdots \\ 0 \end{bmatrix}$$

which is a linear combination of $u_1, \dots, u_r$.

So the column space of $A$, $C(A)$, is spanned by $u_1, \dots, u_r$. But this alone does not prove that they form a basis: perhaps some $k \leq r-1$ of $u_1, \dots, u_r$ already suffice to represent the columns of $A$.

Suppose that were the case. Without loss of generality, assume $u_1, \dots, u_{r-1}$ suffice to represent the columns of $A$ (i.e., $u_r$ is not needed). Then we can write $A$ as

$$A = \begin{bmatrix} u_1 & \dots & u_{r-1} \end{bmatrix} B = \begin{bmatrix} u_1 & \dots & u_{r-1} \end{bmatrix} \begin{bmatrix} b_1^T \\ \vdots \\ b_{r-1}^T \end{bmatrix}$$

for some matrix $B$ (here $B$ is written in terms of its rows $b_1^T, \dots, b_{r-1}^T$).

$$\therefore A = u_1 b_1^T + \dots + u_{r-1} b_{r-1}^T = \sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T$$

Multiplying both sides by $u_r^T$ from the left and using the orthogonality of $u_1, \dots, u_r$,

$$\begin{aligned} u_r^T A = u_r^T(u_1 b_1^T + \dots + u_{r-1} b_{r-1}^T) &= u_r^T(\sigma_1 u_1 v_1^T + \dots + \sigma_r u_r v_r^T) \\ 0 &= \sigma_r \|u_r\|^2 v_r^T \ne 0 \end{aligned}$$

(every cross term $u_r^T u_i$ with $i \ne r$ vanishes, and $u_r^T u_r = \|u_r\|^2 > 0$)

which is a contradiction. Hence exactly $r$ vectors $u_1, \dots, u_r$ (not one fewer) are needed to represent the columns of $A$ as linear combinations.
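The key cancellation above can be checked numerically (again with illustrative names; the column scaling makes the $u_i$ orthogonal but not unit length, so the $\|u_r\|^2$ factor matters):

```python
import numpy as np

# Check: for orthogonal u_i, left-multiplying A by u_r^T kills every term
# except the r-th, leaving sigma_r * ||u_r||^2 * v_r^T (which is nonzero).
rng = np.random.default_rng(1)
m, n, r = 5, 4, 3
Q = np.linalg.qr(rng.standard_normal((m, r)))[0]
U = Q * np.array([1.0, 2.0, 0.5])  # orthogonal columns, NOT unit length
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
sigma = np.array([2.0, -1.0, 0.7])
A = sum(sigma[i] * np.outer(U[:, i], V[:, i]) for i in range(r))

ur = U[:, r - 1]
lhs = ur @ A                                      # u_r^T A
rhs = sigma[r - 1] * (ur @ ur) * V[:, r - 1]      # sigma_r ||u_r||^2 v_r^T
print(np.allclose(lhs, rhs))  # True
```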

Also, $u_1, \dots, u_r \in C(A)$: by the orthogonality of $v_1, \dots, v_r$ we have $A v_i = \sigma_i \|v_i\|^2 u_i$, so each $u_i$ is a multiple of $A v_i$. So we have orthogonal vectors $u_1, \dots, u_r$ in $C(A)$, and exactly all $r$ of them (not one fewer) are needed to represent the columns of $A$. Hence $\dim(C(A)) = r$ and $\{u_1, \dots, u_r\}$ is a basis for $C(A)$. Applying the same argument to $A^T = \sigma_1 v_1 u_1^T + \dots + \sigma_r v_r u_r^T$ shows that $\{v_1, \dots, v_r\}$ is a basis for $C(A^T)$.
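As a final cross-check, an actual SVD (via `np.linalg.svd`) gives orthonormal singular vectors, and the first $r$ left and right singular vectors span $C(A)$ and $C(A^T)$ respectively:

```python
import numpy as np

# Cross-check with a computed SVD: the first r left/right singular vectors
# span the column space C(A) and the row space C(A^T).
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3 (generically)
Usvd, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))   # numerical rank

Ur = Usvd[:, :r]             # orthonormal basis candidates for C(A)
Vr = Vt[:r].T                # orthonormal basis candidates for C(A^T)

# Projecting onto span{u_1..u_r} (columns) or span{v_1..v_r} (rows) preserves A.
print(np.allclose(Ur @ Ur.T @ A, A))  # True
print(np.allclose(A @ Vr @ Vr.T, A))  # True
```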


Written by Danny Siu who lives in Hong Kong. You should follow him on Twitter