
Null Space of the Transpose (The Left Null Space)

  • The left null space of a matrix $A$ is the null space of its transpose, $A^T$, and consists of all vectors orthogonal to the column space of $A$.
  • Its dimension is determined by the formula $\dim(N(A^T)) = m - r$, where $m$ is the number of rows and $r$ is the rank of the matrix.
  • A linear system $A\mathbf{x} = \mathbf{b}$ has a solution if and only if the vector $\mathbf{b}$ is orthogonal to every vector in the left null space.
  • The left null space is fundamental to least-squares problems, as the error vector of the best approximation must belong to this space.

Introduction

In the study of linear algebra, a matrix is often conceptualized as a linear transformation, a process that maps input vectors to output vectors. The set of vectors that this transformation maps to zero, known as the null space, reveals crucial information about the transformation's kernel. However, this perspective provides only part of the picture. Every matrix has a transpose, which represents a related but distinct transformation. The critical question then arises: what happens when we consider the null space of this transposed matrix? This inquiry leads us to the concept of the **left null space**, a fundamental subspace with profound implications.

This article addresses the role and significance of the left null space, a concept often seen as more abstract than its counterparts. We will demystify this essential component of a matrix's structure and illustrate its power. Over the following chapters, you will gain a comprehensive understanding of this topic. The "Principles and Mechanisms" chapter will define the left null space, explore its properties as a vector subspace, establish its dimensional relationship with a matrix's rank, and uncover its beautiful geometric orthogonality to the column space. Subsequently, the "Applications and Interdisciplinary Connections" chapter will demonstrate how this seemingly theoretical idea is the ultimate arbiter of solvability for linear systems, the foundation for least-squares approximations in data science, and a unifying concept in fields ranging from network analysis to numerical computation.

Principles and Mechanisms

In our journey through linear algebra, we often think of a matrix $A$ as an operator, a machine that takes an input vector $\mathbf{x}$ and transforms it into an output vector $A\mathbf{x}$. A particularly interesting question is: which vectors are completely "crushed" by this machine, transformed into the zero vector? This collection of vectors forms the **null space**, $N(A)$. It's a fundamental concept, telling us about the kernel of the transformation, the vectors that lose their identity in the process.

But this is only half the story. The matrix $A$ has an alter ego, its transpose $A^T$, which you can picture as the same machine but with its internal wiring reversed. What happens if we ask the same question about this transposed matrix? What vectors does it crush to zero? The answer leads us to a second, equally important space: the **left null space** of the original matrix $A$.

A Tale of Two Null Spaces

The left null space of a matrix $A$ is, by definition, the null space of its transpose, $A^T$. It is the set of all vectors $\mathbf{y}$ for which $A^T \mathbf{y} = \mathbf{0}$. You might wonder about the name "left null space". If we take the transpose of the entire equation, we get $(A^T \mathbf{y})^T = \mathbf{0}^T$, which simplifies to $\mathbf{y}^T A = \mathbf{0}^T$. This reveals the origin of the name: it's the space of vectors that, when placed on the left of $A$ (as a row vector), annihilate the matrix, yielding a row of zeros.

Before we dive deeper, let's get a feel for where these vectors live. If our original matrix $A$ has $m$ rows and $n$ columns (an $m \times n$ matrix), its transpose $A^T$ will have $n$ rows and $m$ columns. For the multiplication $A^T \mathbf{y}$ to be defined, the vector $\mathbf{y}$ must have a number of components equal to the number of columns of $A^T$, which is $m$. So, for an $m \times n$ matrix $A$, its left null space $N(A^T)$ is always a subspace of $\mathbb{R}^m$. The familiar null space $N(A)$, on the other hand, lives in $\mathbb{R}^n$. This is a crucial distinction: the two null spaces generally live in entirely different universes, unless the matrix happens to be square ($m = n$).

Let's make this concrete. Consider the matrix:

$$A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 3 & 2 \end{pmatrix}$$

This is a $3 \times 2$ matrix, so its left null space will be a subspace of $\mathbb{R}^3$. To find it, we first form the transpose:

$$A^T = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{pmatrix}$$

Now we seek all vectors $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$ such that $A^T \mathbf{y} = \mathbf{0}$. This gives us a system of linear equations:

$$\begin{cases} y_1 + 2y_2 + 3y_3 = 0 \\ y_2 + 2y_3 = 0 \end{cases}$$

From the second equation, we find $y_2 = -2y_3$. Substituting this into the first equation gives $y_1 + 2(-2y_3) + 3y_3 = 0$, which simplifies to $y_1 - y_3 = 0$, or $y_1 = y_3$. So, any vector in this space must look like $\begin{pmatrix} y_3 \\ -2y_3 \\ y_3 \end{pmatrix}$. We can factor out the free variable $y_3$ to see the underlying structure: $y_3 \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$. This means the left null space of $A$ is a one-dimensional line in $\mathbb{R}^3$ spanned by the single basis vector $\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$.
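As a quick check on the hand computation, here is a short sketch (assuming SymPy is available) that asks for the null space of $A^T$ directly:

```python
from sympy import Matrix

# The 3x2 matrix from the worked example.
A = Matrix([[1, 0],
            [2, 1],
            [3, 2]])

# The left null space is, by definition, the null space of A^T.
left_null_basis = A.T.nullspace()

# Exactly one basis vector, proportional to (1, -2, 1).
print(left_null_basis[0].T)
```

SymPy performs the same elimination we did by hand, setting the free variable to 1, so it recovers the basis vector $(1, -2, 1)$ exactly.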

The Rules of the Club: Why It's a "Space"

We call it a "space" for a very good reason. It's not just a random collection of vectors; it has a robust structure. If you find two vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in the left null space, then any linear combination of them, like $c_1\mathbf{v}_1 + c_2\mathbf{v}_2$, will also be in the left null space.

Why? The logic is simple and elegant. If $\mathbf{v}_1$ and $\mathbf{v}_2$ are in $N(A^T)$, it means $A^T \mathbf{v}_1 = \mathbf{0}$ and $A^T \mathbf{v}_2 = \mathbf{0}$. Let's test their sum:

$$A^T(\mathbf{v}_1 + \mathbf{v}_2) = A^T\mathbf{v}_1 + A^T\mathbf{v}_2 = \mathbf{0} + \mathbf{0} = \mathbf{0}$$

It works! The sum is also in the space. The same holds for scalar multiples. This property of closure under addition and scalar multiplication is the very definition of a vector subspace. This predictable structure is what makes these spaces so powerful in our analysis.

The Cosmic Balance Sheet: Dimensions and Duality

So, how large is this left null space? Its dimension isn't arbitrary. It's intimately tied to the other fundamental properties of the matrix through a beautiful relationship.

You may recall the **Rank-Nullity Theorem**, which for any $m \times n$ matrix $A$ states:

$$\operatorname{rank}(A) + \dim(N(A)) = n \quad (\text{the number of columns})$$

We can apply this very same theorem to the transpose, $A^T$. Since $A^T$ is an $n \times m$ matrix, the theorem tells us:

$$\operatorname{rank}(A^T) + \dim(N(A^T)) = m \quad (\text{the number of columns of } A^T)$$

Here comes the magic ingredient: a cornerstone of linear algebra is that a matrix and its transpose have the same rank. That is, $\operatorname{rank}(A) = \operatorname{rank}(A^T)$. This deep fact connects the span of the columns to the span of the rows. Substituting it into our equation for $A^T$, we get the master equation for the left null space:

$$\operatorname{rank}(A) + \dim(N(A^T)) = m$$

This is a profound statement. It means the dimension of the left null space is completely determined by the rank of the matrix and the number of rows. If a $5 \times 7$ matrix $A$ has a column space of dimension 4 (meaning its rank is 4), we immediately know the dimension of its left null space must be $5 - 4 = 1$. If a $5 \times 3$ matrix has a row space of dimension 2 (rank 2), its left null space must have dimension $5 - 2 = 3$.

These dimensional relationships are locked together in a kind of cosmic balance sheet. For any $m \times n$ matrix with rank $r$, we have:

  • $\dim(N(A)) = n - r$
  • $\dim(N(A^T)) = m - r$

If you're told that for a $7 \times 10$ matrix the dimensions of the two null spaces add up to 9, you can deduce from $(10 - r) + (7 - r) = 9$ that the rank must be $r = 4$, and hence the dimension of the column space is also 4.
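This bookkeeping is easy to verify numerically. Here is a minimal NumPy sketch using a randomly generated rank-4 matrix of the stated shape (the construction, seed, and use of `matrix_rank` are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# A 7x10 matrix of rank 4, built as a product of random factors:
# a 7x4 times a 4x10 Gaussian matrix has rank 4 with probability 1.
m, n, r = 7, 10, 4
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

rank = np.linalg.matrix_rank(A)
dim_null = n - rank        # dim N(A)   = 10 - 4 = 6
dim_left_null = m - rank   # dim N(A^T) =  7 - 4 = 3

# The two nullities add up to 9, exactly as in the example above.
print(rank, dim_null + dim_left_null)
```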

The Perpendicular Universe: Orthogonality and Solvability

There is a beautiful geometric interpretation of the left null space. Remember that a vector $\mathbf{y}$ is in $N(A^T)$ if $\mathbf{y}^T A = \mathbf{0}^T$. Let's write $A$ in terms of its columns: $A = \begin{pmatrix} \mathbf{c}_1 & \cdots & \mathbf{c}_n \end{pmatrix}$. The equation $\mathbf{y}^T A = \mathbf{0}^T$ means that the dot product of $\mathbf{y}$ with every single column of $A$ is zero: $\mathbf{y} \cdot \mathbf{c}_i = 0$ for all $i$.

This means every vector in the left null space is **orthogonal** to every column of $A$. And since the columns of $A$ span the **column space** $C(A)$, it follows that every vector in the left null space is orthogonal to the entire column space.

So, here is the grand picture: in the universe of $\mathbb{R}^m$, the column space $C(A)$ and the left null space $N(A^T)$ are orthogonal complements. They meet only at the origin, are perpendicular to each other in every possible direction, and together they account for the whole space. This is why their dimensions must sum to the dimension of the ambient space: $\dim(C(A)) + \dim(N(A^T)) = m$.

This orthogonality is not just an abstract curiosity; it is the key to understanding when systems of linear equations have solutions. The system $A\mathbf{x} = \mathbf{b}$ has a solution if and only if the vector $\mathbf{b}$ lies in the column space of $A$. The orthogonality condition gives us a perfect test for this: $\mathbf{b}$ is in the column space if and only if it is orthogonal to every vector in the left null space. This powerful idea is known as the **Fredholm Alternative**. If you have a system of equations that seems unsolvable, you can check whether the right-hand side vector $\mathbf{b}$ is "perpendicular" to the "problem-causing" directions defined by the left null space.
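Using the $3 \times 2$ example from earlier, whose left null space is spanned by $(1, -2, 1)$, the test is a single dot product. This NumPy sketch checks two right-hand sides (both chosen here purely for illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 2.0]])
y = np.array([1.0, -2.0, 1.0])   # spans N(A^T) for this A

# b1 is the first column of A, so it certainly lies in the column space.
b1 = A[:, 0]
# b2 is a made-up target with y . b2 != 0, so Ax = b2 is unsolvable.
b2 = np.array([1.0, 0.0, 0.0])

def solvable(b):
    """Fredholm test: b is reachable iff it is orthogonal to N(A^T)."""
    return np.isclose(y @ b, 0.0)

print(solvable(b1), solvable(b2))   # True False
```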

Blueprints for Matrices: Subspaces as Constraints

The connections we've uncovered also work in reverse. Knowing the properties of a matrix's fundamental subspaces allows us to deduce properties of the matrix itself. If you're told that the left null space of a $3 \times 2$ matrix $A$ is spanned by the vector $\mathbf{v} = [1, 0, -1]^T$, this isn't just abstract information. It's a concrete blueprint.

The condition means that $\mathbf{v}^T A = \mathbf{0}^T$. If the rows of $A$ are $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$, this translates to:

$$[1, 0, -1] \begin{pmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \end{pmatrix} = (1)\mathbf{r}_1 + (0)\mathbf{r}_2 + (-1)\mathbf{r}_3 = \mathbf{0}^T$$

This gives us a structural constraint on the matrix: its first row must be identical to its third row, $\mathbf{r}_1 = \mathbf{r}_3$. Abstract properties of a subspace dictate concrete algebraic relationships among the entries of the matrix.
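A tiny numerical illustration of the blueprint (the particular row entries below are made up; only the constraint $\mathbf{r}_1 = \mathbf{r}_3$ matters):

```python
import numpy as np

v = np.array([1.0, 0.0, -1.0])

# Any 3x2 matrix whose first and third rows coincide satisfies
# v^T A = r1 - r3 = 0; the rows themselves are arbitrary.
A = np.array([[2.0, 5.0],
              [7.0, 1.0],
              [2.0, 5.0]])

print(v @ A)   # [0. 0.]
```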

And as a final thought, what happens when things get weird? In the familiar world of real vectors, a non-zero vector can never be orthogonal to itself. But in more exotic settings, like vector spaces over the complex numbers, it's possible for a non-zero vector $\mathbf{v}$ to satisfy $\mathbf{v}^T \mathbf{v} = 0$. In such a scenario, if we construct a matrix $A = \mathbf{v}\mathbf{u}^T$, its column space is simply the line spanned by $\mathbf{v}$. The left null space is the set of all vectors orthogonal to $\mathbf{v}$. But since $\mathbf{v}$ is orthogonal to itself, the entire column space lies inside the left null space! This is a beautiful, counter-intuitive result that reminds us that the principles we've discussed are gateways to even richer and more fascinating mathematical structures. The humble left null space is not just a calculation to be performed; it is a key that unlocks a deeper understanding of the symmetry and geometry hidden within matrices.

Applications and Interdisciplinary Connections

In our journey so far, we have become acquainted with the cast of characters that populate the world of a matrix—the four fundamental subspaces. One of these, the left null space, might have seemed a bit more mysterious than the others. It is the set of all vectors $\mathbf{y}$ that, when multiplied by our matrix $A$ from the left as $\mathbf{y}^T A$, produce nothing but a row of zeros. What is the point of such a thing? It turns out this seemingly obscure space is not a minor character at all; it is the ultimate arbiter, the supreme judge that decides some of the most fundamental questions in science and engineering. Its properties echo in fields as diverse as data analysis, computer graphics, and even the study of complex networks. Let us now see this powerful idea in action.

The Ultimate Arbiter of Solvability

Imagine you are trying to achieve a certain outcome, represented by a vector $\mathbf{b}$. Your tools for getting there are a set of linear processes, encapsulated in a matrix $A$. The question "Can I achieve outcome $\mathbf{b}$?" is mathematically phrased as "Does the system $A\mathbf{x} = \mathbf{b}$ have a solution?" You can think of the columns of $A$ as your available ingredients, and the column space, $\text{Col}(A)$, as the collection of all possible dishes you can prepare by mixing them. The system has a solution if and only if your desired dish, $\mathbf{b}$, is on the menu—that is, if $\mathbf{b}$ lies in $\text{Col}(A)$.

But how do you check this without trying every possible combination? This is where the left null space, $N(A^T)$, plays its decisive role. The left null space is the orthogonal complement of the column space. It represents a set of "anti-recipes"—directions fundamentally incompatible with your ingredients. If your target vector $\mathbf{b}$ has any projection onto this "anti-recipe" space, it is impossible to create. The test is beautifully simple: a solution exists if and only if $\mathbf{b}$ is orthogonal to every vector in the left null space.

Therefore, to prove a system is inconsistent, you don't need to check every vector in $N(A^T)$. You just need to find one vector $\mathbf{y}$ in the left null space for which the dot product $\mathbf{y}^T \mathbf{b}$ is not zero. If you find such a vector, the case is closed: no solution exists. This principle, a form of the Fredholm alternative, gives us a concrete condition for solvability. Instead of an exhaustive search, we can characterize all impossible outcomes by finding a basis for $N(A^T)$. Any target $\mathbf{b}$ that is not perpendicular to these basis vectors is unreachable. From a higher viewpoint, this tells us that a linear transformation is surjective (it reaches every point of its target space) precisely when the only vector orthogonal to its image is the zero vector—that is, when its left null space is trivial.
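A concrete sketch of such a one-vector certificate, using a deliberately inconsistent toy system (both the system and the witness vector are illustrative):

```python
import numpy as np

# The system x1 + x2 = 1, x1 + x2 = 2 is plainly inconsistent.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0])

# y = (1, -1) lies in the left null space of A: y^T A = 0.
y = np.array([1.0, -1.0])

# Since y^T b != 0, y is a one-line certificate that no x solves Ax = b.
print(y @ A, y @ b)   # [0. 0.] -1.0
```

Subtracting the two equations (which is exactly what $\mathbf{y}^T$ does) yields $0 = -1$, so no exhaustive search over $\mathbf{x}$ is ever needed.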

The Art of the Best Guess: Least Squares

So, what do we do when the judge declares our system "unsolvable"? Do we simply give up? In the real world, this happens all the time. Our measurements are noisy, our models are imperfect, and we often have more data points than parameters in our model. This leads to overdetermined systems $A\mathbf{x} = \mathbf{b}$ that have no exact solution. The vector $\mathbf{b}$ of our measurements simply does not lie in the column space of our model matrix $A$.

Here, linear algebra offers not a surrender but a beautiful compromise: the least-squares solution. If we can't land exactly on the target $\mathbf{b}$, we can find the point $\mathbf{p}$ inside the column space $\text{Col}(A)$ that is closest to $\mathbf{b}$. This point $\mathbf{p}$ is our best possible approximation, and it has the form $\mathbf{p} = A\hat{\mathbf{x}}$ for some vector $\hat{\mathbf{x}}$, which we call the least-squares solution.

What does "closest" mean geometrically? It means that the error vector, the difference between our data and our best approximation, $\mathbf{e} = \mathbf{b} - \mathbf{p}$, must be as short as possible. This happens when $\mathbf{e}$ is perpendicular to the space we are projecting onto, $\text{Col}(A)$. But we have just seen that the space of all vectors perpendicular to $\text{Col}(A)$ is none other than the left null space, $N(A^T)$! So, the profound condition that defines the best possible approximation is that the error vector must live in the left null space: $\mathbf{e} \in N(A^T)$. This simple geometric fact is the heart of regression analysis, data fitting, and countless optimization problems. And should we be so lucky that our least-squares error turns out to be zero, our error vector is the zero vector; this implies our data $\mathbf{b}$ was in the column space all along, and our system had a perfect solution waiting to be discovered.
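A short NumPy sketch of this condition, fitting a line to four made-up noisy points and confirming that the error satisfies $A^T\mathbf{e} = \mathbf{0}$:

```python
import numpy as np

# Overdetermined line fit: four data points, two parameters.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])   # noisy, not exactly on a line

x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
e = b - A @ x_hat                     # error of the best approximation

# The defining property of least squares: the error lives in N(A^T),
# i.e. it is orthogonal to every column of A.
print(np.allclose(A.T @ e, 0.0))      # True
```

The check $A^T \mathbf{e} = \mathbf{0}$ is just a restatement of the normal equations $A^T A \hat{\mathbf{x}} = A^T \mathbf{b}$.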

The Machinery of Computation

It is one thing to appreciate these beautiful geometric relationships, but quite another to compute these subspaces for a giant matrix with millions of entries. Fortunately, the architects of numerical linear algebra have given us powerful tools that act like X-rays for matrices, revealing their internal structure, including the left null space.

Two of the most important tools are the QR factorization and the Singular Value Decomposition (SVD).

When we perform a full QR factorization of an $m \times n$ matrix $A$ (with $m > n$), we decompose it as $A = QR$. Here, $Q$ is an $m \times m$ orthogonal matrix whose columns form an orthonormal basis for the entire space $\mathbb{R}^m$, and $R$ is an $m \times n$ upper trapezoidal matrix. Provided $A$ has full column rank, the first $n$ columns of $Q$ form a pristine orthonormal basis for the column space of $A$. What about the remaining $m - n$ columns of $Q$? By the very nature of an orthogonal matrix, they are orthogonal to the first $n$ columns. They therefore form a perfect orthonormal basis for the orthogonal complement of the column space—that is, for the left null space, $N(A^T)$.
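In NumPy this is a single call, sketched here for the small full-column-rank matrix used earlier (the `mode='complete'` argument requests the full $m \times m$ factor $Q$):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 2.0]])
m, n = A.shape

# mode='complete' returns the full m x m orthogonal factor Q.
Q, R = np.linalg.qr(A, mode='complete')

# A has full column rank, so the first n columns of Q span Col(A)
# and the trailing m - n columns span the left null space N(A^T).
left_null_basis = Q[:, n:]

print(np.allclose(A.T @ left_null_basis, 0.0))   # True
```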

The Singular Value Decomposition, $A = U\Sigma V^T$, is even more revealing. It simultaneously provides orthonormal bases for all four fundamental subspaces. For our purposes, the key is the $m \times m$ orthogonal matrix $U$. The first $r$ columns of $U$ (where $r$ is the rank of $A$) span the column space. The remaining $m - r$ columns of $U$ give us an orthonormal basis for the left null space. The SVD even tells us the dimension of this space directly: it is simply the number of all-zero rows in the central matrix $\Sigma$. These decompositions are not mere theoretical curiosities; they are the robust, high-performance engines running inside the software we use for everything from weather prediction to designing aircraft.
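A sketch with NumPy's SVD, this time on a deliberately rank-deficient matrix (chosen for illustration) where the simple "first $n$ columns" QR split would not apply; the rank tolerance is likewise an illustrative choice:

```python
import numpy as np

# A rank-deficient 3x2 matrix: the second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

U, s, Vt = np.linalg.svd(A)        # full SVD: U is 3x3
r = int(np.sum(s > 1e-10))         # numerical rank (here r = 1)

col_basis = U[:, :r]               # orthonormal basis for Col(A)
left_null_basis = U[:, r:]         # orthonormal basis for N(A^T)

print(r, left_null_basis.shape)    # 1 (3, 2)
```

Because the rank is read off from the singular values, the SVD keeps working even when columns are linearly dependent.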

A Unifying Thread: Networks, Functions, and Beyond

The true beauty of a deep mathematical concept is that it refuses to be confined to its original context. The left null space is a prime example, appearing in surprising places with profound physical and structural interpretations.

Consider a directed graph, like a network of one-way streets or an electrical circuit. We can describe its topology with a vertex-edge "incidence matrix" $M$. Let's assign a scalar value—a "potential," like voltage or altitude—to each vertex in the graph, forming a vector $\mathbf{p}$. What does it mean if this vector $\mathbf{p}$ lies in the left null space of the incidence matrix, $M^T \mathbf{p} = \mathbf{0}$? The equation unpacks into a simple condition for every single edge in the graph: if an edge runs from vertex $v_i$ to vertex $v_j$, then the potentials must be equal, $p_i = p_j$. This implies that the potential must be constant across any connected component of the graph. The dimension of the left null space therefore counts something tangible: the number of separate, weakly connected components in our network! This single algebraic idea unifies concepts from circuit theory (Kirchhoff's Voltage Law) and graph theory.
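Here is a small sketch: a five-vertex graph with two components (the graph and the sign convention are illustrative), whose incidence matrix has a two-dimensional left null space:

```python
import numpy as np

# Vertex-edge incidence matrix M: one row per vertex, one column per
# edge, with -1 at the tail and +1 at the head of each edge.
# Vertices 0-1-2 form one component; vertices 3-4 form another.
#             e01  e12  e34
M = np.array([[-1,   0,   0],    # v0
              [ 1,  -1,   0],    # v1
              [ 0,   1,   0],    # v2
              [ 0,   0,  -1],    # v3
              [ 0,   0,   1]])   # v4

dim_left_null = M.shape[0] - np.linalg.matrix_rank(M)
print(dim_left_null)             # 2 -- one per weakly connected component

# Any potential that is constant on each component lies in N(M^T).
p = np.array([1, 1, 1, 5, 5])
print(np.allclose(M.T @ p, 0))   # True
```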

This principle of generality doesn't stop with vectors of numbers. The concepts of linear algebra apply just as well to vector spaces of functions, such as the space of polynomials $\mathcal{P}_2(\mathbb{R})$. We can define linear operators on this space, for example an operator that involves differentiation. Such an operator can be represented by a matrix $A$ with respect to a basis (like $\{1, x, x^2\}$). Finding the left null space of this matrix $A$ reveals fundamental properties of the operator itself—it tells us about the "constraints" on the outputs it can produce.
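As a sketch, take the derivative operator on $\mathcal{P}_2(\mathbb{R})$ in the basis $\{1, x, x^2\}$ (assuming SymPy; the column-as-image matrix convention below is the standard one, not taken from the original text):

```python
from sympy import Matrix

# Matrix of d/dx on P_2 in the basis {1, x, x^2}: column j holds the
# coordinates of the derivative of the j-th basis polynomial.
#   d/dx 1 = 0,   d/dx x = 1,   d/dx x^2 = 2x
D = Matrix([[0, 1, 0],
            [0, 0, 2],
            [0, 0, 0]])

left_null = D.T.nullspace()
print(left_null[0].T)
```

The single basis vector is proportional to $(0, 0, 1)^T$: it annihilates every output of the operator, which encodes the constraint that the derivative of a quadratic never contains an $x^2$ term.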

From a simple question of solvability, we have journeyed to the heart of approximation theory, peered into the machinery of modern computation, and found echoes of the same idea in the structure of networks and functions. The left null space is a testament to the remarkable unity of mathematics, where a single, elegant concept can provide the key to understanding and solving a vast array of problems across the scientific landscape.