The factorization is a matrix version of the Gram-Schmidt process. Suppose
\begin{equation*}
A = \left[ \begin{array}{cccc}
|\amp |\amp \amp | \\
\mathbf{c}_{1} \amp \mathbf{c}_{2} \amp \cdots \amp \mathbf{c}_{n}\\
|\amp |\amp \amp |
\end{array}\right]
\end{equation*}
is an \(m \times n\) matrix with linearly independent columns \(\mathbf{c}_{1}, \mathbf{c}_{2}, \dots, \mathbf{c}_{n}\text{.}\) The Gram-Schmidt algorithm can be applied to these columns to provide orthogonal columns \(\mathbf{f}_{1}, \mathbf{f}_{2}, \dots, \mathbf{f}_{n}\) where \(\mathbf{f}_{1} = \mathbf{c}_{1}\) and
\begin{equation*}
\mathbf{f}_{k} = \mathbf{c}_{k} - \frac{\mathbf{c}_{k} \cdot \mathbf{f}_{1}}{\norm{ \mathbf{f}_{1} }^2}\mathbf{f}_{1} + \frac{\mathbf{c}_{k} \cdot \mathbf{f}_{2}}{\norm{ \mathbf{f}_{2} }^2}\mathbf{f}_{2} - \dots - \frac{\mathbf{c}_{k} \cdot \mathbf{f}_{k-1}}{\norm{ \mathbf{f}_{k-1} }^2}\mathbf{f}_{k-1}
\end{equation*}
for each \(k = 2, 3, \dots, n\text{.}\) Now write \(\mathbf{q}_{k} = \frac{1}{\norm{ \mathbf{f}_{k} }}\mathbf{f}_{k}\) for each \(k\text{.}\) Then \(\mathbf{q}_{1}, \mathbf{q}_{2}, \dots, \mathbf{q}_{n}\) are orthonormal columns, and the above equation becomes
\begin{equation*}
\norm{ \mathbf{f}_{k} } \mathbf{q}_{k} = \mathbf{c}_{k} - (\mathbf{c}_{k} \cdot \mathbf{q}_{1})\mathbf{q}_{1} - (\mathbf{c}_{k} \cdot \mathbf{q}_{2})\mathbf{q}_{2} - \dots - (\mathbf{c}_{k} \cdot \mathbf{q}_{k-1})\mathbf{q}_{k-1}.
\end{equation*}
Using these equations, express each \(\mathbf{c}_{k}\) as a linear combination of the \(\mathbf{q}_{i}\text{:}\)
\begin{equation*}
\begin{array}{ccl}
\mathbf{c}_{1} \amp =\amp \norm{ \mathbf{f}_{1} } \mathbf{q}_{1} \\
\mathbf{c}_{2} \amp =\amp (\mathbf{c}_{2} \cdot \mathbf{q}_{1})\mathbf{q}_{1} + \norm{ \mathbf{f}_{2} } \mathbf{q}_{2} \\
\mathbf{c}_{3} \amp =\amp (\mathbf{c}_{3} \cdot \mathbf{q}_{1})\mathbf{q}_{1} + (\mathbf{c}_{3} \cdot \mathbf{q}_{2})\mathbf{q}_{2} + \norm{ \mathbf{f}_{3} } \mathbf{q}_{3} \\
\vdots \amp \amp \vdots \\
\mathbf{c}_{n} \amp =\amp (\mathbf{c}_{n} \cdot \mathbf{q}_{1})\mathbf{q}_{1} + (\mathbf{c}_{n} \cdot \mathbf{q}_{2})\mathbf{q}_{2} + (\mathbf{c}_{n} \cdot \mathbf{q}_{3})\mathbf{q}_{3} + \dots + \norm{ \mathbf{f}_{n} } \mathbf{q}_{n}
\end{array}
\end{equation*}
These equations have a matrix form that gives the required factorization:
\begin{align}
A \amp = \left[ \begin{array}{ccccc}
|\amp |\amp |\amp \amp | \\
\mathbf{c}_{1} \amp \mathbf{c}_{2} \amp \mathbf{c}_{3} \amp \cdots \amp \mathbf{c}_{n}\\
|\amp |\amp |\amp \amp |
\end{array}\right] \tag{7.6.1}\\
\amp = \left[ \begin{array}{ccccc}
|\amp |\amp |\amp \amp | \\
\mathbf{q}_{1} \amp \mathbf{q}_{2} \amp \mathbf{q}_{3} \amp \cdots \amp \mathbf{q}_{n}\\
|\amp |\amp |\amp \amp |
\end{array}\right] \left[ \begin{array}{ccccc}
\norm{ \mathbf{f}_{1} } \amp \mathbf{c}_{2} \cdot \mathbf{q}_{1} \amp \mathbf{c}_{3} \cdot \mathbf{q}_{1} \amp \cdots \amp \mathbf{c}_{n} \cdot \mathbf{q}_{1} \\
0 \amp \norm{ \mathbf{f}_{2} } \amp \mathbf{c}_{3} \cdot \mathbf{q}_{2} \amp \cdots \amp \mathbf{c}_{n} \cdot \mathbf{q}_{2} \\
0 \amp 0 \amp \norm{ \mathbf{f}_{3} } \amp \cdots \amp \mathbf{c}_{n} \cdot \mathbf{q}_{3} \\
\vdots \amp \vdots \amp \vdots \amp \ddots \amp \vdots \\
0 \amp 0 \amp 0 \amp \cdots \amp \norm{ \mathbf{f}_{n} }
\end{array} \right]. \notag
\end{align}
\begin{equation*}
Q = \left[ \begin{array}{ccccc}
|\amp |\amp |\amp \amp | \\
\mathbf{q}_{1} \amp \mathbf{q}_{2} \amp \mathbf{q}_{3} \amp \cdots \amp \mathbf{q}_{n}\\
|\amp |\amp |\amp \amp |
\end{array}\right]
\end{equation*}
has orthonormal columns, and the second factor is an \(n \times n\) upper triangular matrix \(R\) with positive diagonal entries (and so is invertible). We record this in the following theorem.