
Generalized Eigenvector

SciencePedia
Key Takeaways
  • Generalized eigenvectors are required for "defective" matrices, where an eigenvalue's geometric multiplicity (number of eigenvectors) is less than its algebraic multiplicity.
  • These vectors form "Jordan chains" that reveal the complete structure of a linear transformation, leading to the nearly diagonal Jordan Canonical Form.
  • In dynamical systems, generalized eigenvectors correspond to resonant behaviors that involve terms like $t e^{\lambda t}$, signifying growth beyond simple exponential trends.
  • The concept extends to quantum physics, where non-normalizable states (like plane waves) are rigorously defined as generalized eigenvectors in a Rigged Hilbert Space.

Introduction

In linear algebra, eigenvectors represent the fundamental directions of a transformation—directions that are simply scaled without being rotated. For many "diagonalizable" matrices, these eigenvectors form a complete basis, simplifying complex operations into straightforward scaling. However, a significant class of matrices, known as "defective" matrices, do not possess enough eigenvectors to span their entire space. This shortage of eigenvectors is not a mere mathematical edge case; it describes critical behaviors in numerous physical and engineered systems. This article addresses the gap by introducing the powerful concept of the generalized eigenvector.

To build a complete picture, we will embark on a journey through two key areas. First, under "Principles and Mechanisms," we will explore the algebraic origins of generalized eigenvectors, see how they form "Jordan chains," and uncover how these chains lead to the Jordan Canonical Form—the ultimate structural map of any linear transformation. Following this, the section "Applications and Interdisciplinary Connections" will demonstrate how this abstract theory finds profound and practical relevance in fields like dynamical systems, control theory, and even the fundamental description of reality in quantum mechanics.

Principles and Mechanisms

In our previous discussions, we celebrated the eigenvector. These remarkable vectors mark the fixed directions of a linear transformation, the directions that remain pure and unrotated, merely stretched or shrunk by a matrix. A matrix's action on its eigenvectors is beautifully simple: $Av = \lambda v$. For many matrices, the "diagonalizable" ones, we can find a complete set of these eigenvectors that span the entire space. This is a physicist's dream! It means we can describe any vector as a combination of these special basis vectors, and the matrix's complicated action dissolves into simple scaling along each basis direction. The world, from the perspective of such a matrix, is an orderly grid of straight avenues.

But what happens when the world isn't so simple? What if a matrix doesn't have enough of these clean, straight avenues to map out its entire space? This isn't a rare or pathological case; it happens all the time in physics and engineering, from mechanical vibrations to quantum mechanics. These are the "defective" matrices, and they force us to broaden our search and look for a bigger family of special vectors.

The Defective Matrix: A Shortage of Directions

The heart of the issue lies in a mismatch between two fundamental quantities. For a given eigenvalue $\lambda$, its algebraic multiplicity is the number of times it appears as a root of the characteristic polynomial. You can think of this as the "expected" number of dimensions associated with that eigenvalue. On the other hand, its geometric multiplicity is the actual number of linearly independent eigenvectors we can find for it—the dimension of the eigenspace $\ker(A - \lambda I)$.

For a diagonalizable matrix, these two multiplicities are always equal for every eigenvalue. But for a defective matrix, the geometric multiplicity of at least one eigenvalue is less than its algebraic multiplicity. We have fewer eigenvector directions than we "should."

Imagine a $2 \times 2$ matrix with a single, repeated eigenvalue $\lambda = 3$. We expect two special directions, but we find only one. This happens precisely when the matrix is not just a simple scaling matrix like $\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$, but has some "twist" or "shear" to it, like the matrix $A_C = \begin{pmatrix} 2 & -1 \\ 1 & 4 \end{pmatrix}$. This matrix has a repeated eigenvalue $\lambda = 3$, but its eigenspace is only one-dimensional. We are missing a basis vector. We cannot describe the full action of the matrix using eigenvectors alone. We need something more.
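The defect is easy to verify by hand. Here is a minimal pure-Python sketch (no libraries assumed; the $2 \times 2$ rank test via the determinant is our own shortcut) confirming that $A_C$ has the repeated eigenvalue 3 but only one independent eigenvector direction:

```python
# Check that A_C = [[2, -1], [1, 4]] is defective: repeated eigenvalue,
# but a one-dimensional eigenspace.

A = [[2, -1], [1, 4]]

# Characteristic polynomial of a 2x2 matrix: x^2 - (tr A) x + det A.
tr = A[0][0] + A[1][1]                        # 6
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # 9
disc = tr * tr - 4 * det                      # 0 => repeated root

lam = tr / 2  # the repeated eigenvalue, 3.0

# Geometric multiplicity = dim ker(A - lam*I) = 2 - rank(A - lam*I).
M = [[A[0][0] - lam, A[0][1]],
     [A[1][0], A[1][1] - lam]]
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
is_zero = all(abs(M[i][j]) < 1e-12 for i in range(2) for j in range(2))
rank = 0 if is_zero else (1 if abs(det_M) < 1e-12 else 2)
geo_mult = 2 - rank

print("repeated eigenvalue:", disc == 0)   # True
print("eigenvalue:", lam)                  # 3.0
print("geometric multiplicity:", geo_mult) # 1, less than algebraic 2
```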

The Jordan Chain: A Ladder out of the Eigenspace

This is where the genius of Camille Jordan comes into play. If a vector $v$ isn't an eigenvector, then $(A - \lambda I)v$ is not the zero vector. But what if it's the next best thing? What if $(A - \lambda I)v$ lands us on an actual eigenvector?

Let's call our true eigenvector $v_1$, which by definition satisfies $(A - \lambda I)v_1 = \mathbf{0}$. Now let's hunt for a new vector $v_2$, not itself an eigenvector, but related to $v_1$ in a special way:

$$(A - \lambda I)v_2 = v_1$$

If we find such a vector, we have discovered the first two rungs of a Jordan chain. And we can keep going! Perhaps there is a $v_3$ such that $(A - \lambda I)v_3 = v_2$, and so on. This creates a beautiful hierarchy of vectors:

$$(A - \lambda I)v_k = v_{k-1}, \quad \dots, \quad (A - \lambda I)v_2 = v_1, \quad (A - \lambda I)v_1 = \mathbf{0}$$

This set of vectors $\{v_1, v_2, \dots, v_k\}$ is a Jordan chain of length $k$. The vector $v_1$ is a standard eigenvector, while $v_2, \dots, v_k$ are called generalized eigenvectors. Notice the structure: if you apply the operator $(A - \lambda I)$ to any vector in the chain, you simply move one step down the ladder. Applying it to $v_k$ gives $v_{k-1}$, applying it again gives $v_{k-2}$, and so on, until you finally reach $v_1$ and the next step takes you to the "ground," the zero vector.

This gives us a precise definition of rank. A generalized eigenvector $v$ has rank $k$ if it takes $k$ applications of $(A - \lambda I)$ to annihilate it, but no fewer: $(A - \lambda I)^k v = \mathbf{0}$ but $(A - \lambda I)^{k-1} v \neq \mathbf{0}$. From this, you can see that the vector we called $v_k$ is a generalized eigenvector of rank $k$. And as a direct consequence, the vector $u = (A - \lambda I)v_k$ (which we know is $v_{k-1}$) must be a generalized eigenvector of rank $k-1$. The operator $(A - \lambda I)$ is a rank-reducing machine!

Let's see this in action. Consider the matrix $A = \begin{pmatrix} 3 & -1 \\ 1 & 1 \end{pmatrix}$, which has a single eigenvalue $\lambda = 2$ but only a one-dimensional eigenspace. If we propose that a generalized eigenvector of rank 2 is $v_2 = \begin{pmatrix} c \\ 0 \end{pmatrix}$, we can find its partner eigenvector $v_1$ just by following the rule:

$$v_1 = (A - 2I)v_2 = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} c \\ 0 \end{pmatrix} = \begin{pmatrix} c \\ c \end{pmatrix}$$

And you can easily check that this $v_1$ is indeed a true eigenvector: $(A - 2I)v_1 = \mathbf{0}$. We've found our missing direction! The pair $\{v_1, v_2\}$ now forms a basis for the whole 2D space. This same principle applies beautifully to larger matrices, allowing us to generate the "missing" vectors from a single generalized eigenvector of the highest rank in a chain.
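The chain computation above (with $c = 1$) takes only a few lines to replicate; this pure-Python sketch walks one step down the ladder and checks both the eigenvector property of $v_1$ and the rank-2 property of $v_2$:

```python
# Build the Jordan chain for A = [[3, -1], [1, 1]] with eigenvalue 2.
# v2 is a proposed rank-2 generalized eigenvector (c = 1);
# v1 = (A - 2I) v2 should then be a true eigenvector.

A = [[3, -1], [1, 1]]
lam = 2

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

N = [[A[0][0] - lam, A[0][1]],
     [A[1][0], A[1][1] - lam]]   # A - 2I = [[1, -1], [1, -1]]

v2 = [1, 0]          # rung two of the chain
v1 = matvec(N, v2)   # one application of (A - 2I): rung one

print("v1 =", v1)                             # [1, 1]
print("(A-2I) v1 =", matvec(N, v1))           # [0, 0]: true eigenvector
print("(A-2I)^2 v2 =", matvec(N, v1))         # [0, 0]: v2 has rank 2
```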

The Grand Structure: Invariant Subspaces

This idea of chains is not just a clever trick; it reveals a profound underlying structure of the vector space. For each distinct eigenvalue $\lambda_j$ of our matrix $A$, we can group together all of its associated vectors—the true eigenvectors and all the generalized eigenvectors across all its chains. This collection, along with the zero vector, forms a subspace called the generalized eigenspace, denoted $G_{\lambda_j}$.

A generalized eigenspace $G_{\lambda_j}$ is the set of all vectors that are eventually sent to zero by repeated application of $(A - \lambda_j I)$. These subspaces are special because they are $A$-invariant. This means that if you take any vector $v$ from within a generalized eigenspace $G_{\lambda_j}$ and apply the matrix $A$ to it, the resulting vector $Av$ is guaranteed to still be inside $G_{\lambda_j}$. The transformation $A$ never throws a vector out of its own generalized eigenspace.

The most beautiful result of all is the Primary Decomposition Theorem. It states that the entire vector space $V$ can be written as a direct sum of these invariant generalized eigenspaces:

$$V = G_{\lambda_1} \oplus G_{\lambda_2} \oplus \cdots \oplus G_{\lambda_r}$$

This is a powerful statement. It tells us that the matrix $A$, which might seem to be mixing all the vectors in a horribly complicated way, is actually behaving in a very block-like manner. It operates on each generalized eigenspace completely independently of the others. The space decomposes into a set of smaller, non-interacting "universes." For instance, a 6-dimensional problem might decompose into a 4-dimensional universe for $\lambda = 2$ and a separate 2-dimensional universe for $\lambda = -1$, with no cross-talk between them.
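We can watch this decomposition happen on a small, hypothetical $3 \times 3$ example of our own choosing: a matrix whose space splits into $G_2 = \operatorname{span}\{e_1, e_2\}$ (a length-2 Jordan chain) and $G_{-1} = \operatorname{span}\{e_3\}$:

```python
# Hypothetical 3x3 example: (A - 2I)^2 annihilates G_2 = span{e1, e2},
# (A + I) annihilates G_{-1} = span{e3}, and A maps each generalized
# eigenspace into itself (A-invariance).

A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, -1]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def shift(M, lam):  # M - lam*I
    return [[M[i][j] - (lam if i == j else 0) for j in range(3)]
            for i in range(3)]

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

S2 = shift(A, 2)
print(matvec(S2, matvec(S2, e1)))  # [0, 0, 0]: e1 is in G_2
print(matvec(S2, matvec(S2, e2)))  # [0, 0, 0]: e2 is in G_2 (rank 2)

Sm1 = shift(A, -1)
print(matvec(Sm1, e3))             # [0, 0, 0]: e3 is in G_{-1}

# Invariance: A*e2 stays inside span{e1, e2} (third component is 0).
print(matvec(A, e2))               # [1, 2, 0]
```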

The Jordan Canonical Form: The True Map of a Transformation

We have arrived at the final destination. Within each invariant subspace $G_\lambda$, we now have a basis composed of one or more Jordan chains. What does the matrix $A$ look like when we use this special basis for the whole space? The result is the Jordan Canonical Form, the simplest and most transparent representation of any linear transformation.

In this basis, the matrix $J = V^{-1}AV$ (where $V$'s columns are the basis vectors from the Jordan chains) becomes almost diagonal. It is a block diagonal matrix, where each block corresponds to one of the invariant subspaces $G_\lambda$. And within each of those blocks, the structure of the chains is laid bare.

Each Jordan chain of length $k$ for an eigenvalue $\lambda$ produces a $k \times k$ Jordan block:

$$J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ 0 & 0 & \lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda \end{pmatrix}$$

The diagonal entries $\lambda$ represent the familiar stretching action of an eigenvalue. The 1's on the superdiagonal (the line just above the main diagonal) represent the "mixing" action that links the vectors in a chain: $Av_i = \lambda v_i + v_{i-1}$. They are the mathematical signature of the shearing, twisting part of the transformation that eigenvectors alone couldn't capture.
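This mixing rule can be checked directly on the chain we built earlier for $A = \begin{pmatrix} 3 & -1 \\ 1 & 1 \end{pmatrix}$: applying $A$ to $v_2$ should give $\lambda v_2 + v_1$, exactly the action of a Jordan block in the chain basis.

```python
# Verify A v2 = lam*v2 + v1 for the chain from the earlier example:
# A = [[3, -1], [1, 1]], lam = 2, v1 = [1, 1], v2 = [1, 0].

A = [[3, -1], [1, 1]]
lam = 2
v1, v2 = [1, 1], [1, 0]

Av2 = [A[0][0] * v2[0] + A[0][1] * v2[1],
       A[1][0] * v2[0] + A[1][1] * v2[1]]

expected = [lam * v2[0] + v1[0],
            lam * v2[1] + v1[1]]

print(Av2, expected)  # [3, 1] [3, 1] -- the superdiagonal 1 at work
```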

The complete Jordan form of a matrix tells you everything:

  • The eigenvalues $\lambda$ are on the diagonal.
  • The number of Jordan blocks for a given $\lambda$ is equal to its geometric multiplicity—the number of independent eigenvectors, which is also the number of Jordan chains.
  • The size of the largest Jordan block for $\lambda$ tells you the length of the longest chain. This number is also the exponent of the factor $(x - \lambda)$ in the minimal polynomial of the matrix. This polynomial, unlike the characteristic polynomial, captures the most persistent part of the transformation's structure.

So, generalized eigenvectors are not just a patch for defective matrices. They are the key that unlocks the true, deep structure of any linear transformation. They show us how a complex space decomposes into simpler, invariant subspaces, and within each, how vectors are linked together in elegant chains, revealing the fundamental actions of stretching and shearing that govern the system. They provide the ultimate, canonical map for navigating the world of linear algebra.

Applications and Interdisciplinary Connections

Having journeyed through the algebraic machinery of generalized eigenvectors and Jordan chains, you might be left with a nagging question: Is this just a mathematical curiosity? A formal exercise for the corner cases where our neat theory of diagonalization breaks down? The answer, you will be delighted to find, is a resounding no. The world, it turns out, is full of "defective" systems, and understanding them is not just an academic pursuit but a practical necessity. The appearance of a generalized eigenvector is nature's way of telling us that we are at a special, critical point—a place of resonance, instability, or structural change. In these situations, the system's behavior transcends simple exponential growth or decay and acquires a richer, more complex character. Let's explore some of these fascinating arenas where generalized eigenvectors take center stage.

The Rhythms of Evolution: Dynamical Systems

Perhaps the most direct and intuitive application of generalized eigenvectors is in the study of dynamical systems, both continuous and discrete. These are systems that evolve in time, from the orbits of planets to the fluctuations of the stock market.

Consider a system described by a set of linear differential equations, $\frac{d\mathbf{x}}{dt} = A\mathbf{x}$. This could model anything from a network of chemical reactions to the flow of heat in a solid. If the matrix $A$ is diagonalizable, the solution is a clean superposition of pure exponential terms, $e^{\lambda_i t}$. Each mode of the system evolves independently with its own characteristic timescale. But what happens when $A$ is defective? The Jordan form reveals that a new kind of behavior emerges. The solution will contain terms of the form $t e^{\lambda t}$. This is not just a mathematical artifact; it is the signature of a resonance. The factor $t$ represents a "secular growth" that amplifies the exponential behavior. A classic physical example is a critically damped oscillator—think of a car's suspension or a closing door damper. It is poised precisely at the boundary between oscillating and slowly decaying. This critical behavior, which returns to equilibrium in the fastest possible way without overshooting, is governed by a defective system matrix, and its time evolution is described by these polynomial-exponential functions that arise from generalized eigenvectors. The general machinery to describe this involves computing the matrix exponential $e^{tA}$, which, for a defective matrix, will explicitly contain these $t e^{\lambda t}$ terms in its entries.
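For a $2 \times 2$ Jordan block the closed form is $e^{tJ} = e^{\lambda t}\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$, whose off-diagonal entry is exactly the $t e^{\lambda t}$ secular term. A minimal sketch (pure Python; the choice $\lambda = -1$, $t = 0.5$ is ours, suggestive of a critically damped mode) compares a truncated Taylor series for $e^{tJ}$ against that closed form:

```python
import math

# e^{tJ} for a 2x2 Jordan block J = [[lam, 1], [0, lam]]:
# Taylor series sum of (tJ)^n / n! versus the closed form
# e^{lam t} * [[1, t], [0, 1]].

lam, t = -1.0, 0.5

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

J = [[lam, 1.0], [0.0, lam]]
tJ = [[t * J[i][j] for j in range(2)] for i in range(2)]

expJ = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at I
term = [[1.0, 0.0], [0.0, 1.0]]   # current (tJ)^n / n!
for n in range(1, 30):
    term = [[x / n for x in row] for row in matmul(term, tJ)]
    expJ = [[expJ[i][j] + term[i][j] for j in range(2)] for i in range(2)]

E = math.exp(lam * t)
closed = [[E, t * E], [0.0, E]]   # off-diagonal entry is t*e^{lam t}

err = max(abs(expJ[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print("max deviation:", err)  # tiny: the series matches the closed form
```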

The same principle holds for discrete systems, which evolve in steps rather than continuously, described by recurrence relations like $\mathbf{x}_{k+1} = A\mathbf{x}_k$. Such models are ubiquitous in economics, computer science, and population biology. When the matrix $A$ is defective, the solution for the state $\mathbf{x}_k$ will include terms like $k\lambda^k$. Again, we see that the system's evolution is not a simple geometric progression. There is an additional layer of linear growth with each step, a tell-tale sign that the underlying structure contains a Jordan chain.
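The discrete analogue is visible in the powers of a Jordan block, $J^k = \begin{pmatrix} \lambda^k & k\lambda^{k-1} \\ 0 & \lambda^k \end{pmatrix}$; a short sketch (our own toy value $\lambda = 0.5$) confirms this by repeated multiplication:

```python
# Powers of a 2x2 Jordan block show the discrete k*lam^(k-1) growth:
# J^k = [[lam^k, k*lam^(k-1)], [0, lam^k]].

lam = 0.5
J = [[lam, 1.0], [0.0, lam]]

P = [[1.0, 0.0], [0.0, 1.0]]
k = 8
for _ in range(k):  # compute J^8 by repeated multiplication
    P = [[sum(P[i][m] * J[m][j] for m in range(2)) for j in range(2)]
         for i in range(2)]

print(P[0][0], lam**k)             # diagonal: lam^k
print(P[0][1], k * lam**(k - 1))   # off-diagonal: k * lam^(k-1)
```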

The Art of Control: Engineering and Systems Theory

In the world of engineering, we don't just want to observe systems; we want to control them. Whether launching a satellite, managing a power grid, or designing a self-driving car, the principles of control theory are paramount. Here, generalized eigenvectors are not a nuisance but a central concept that dictates the very limits of what is possible.

A cornerstone of modern control theory is the state-space representation, where a system's evolution is described by $\dot{x} = Ax + Bu$. The change of basis to the Jordan form, $x(t) = Vz(t)$, transforms the system into its "natural coordinates," revealing its fundamental dynamics in the simplest possible way. This transformation is the key that unlocks deeper questions.

One such question is controllability. Can we, by applying an external input $u$, steer the system from any initial state to any desired final state? Intuitively, you might think so, but the answer depends critically on the Jordan structure of $A$ and its relationship to the input matrix $B$. Imagine a Jordan chain of generalized eigenvectors $\{v_1, v_2, \dots, v_m\}$. This chain represents a cascade of interconnected modes. The input $u$ can only influence the first mode $v_1$ through its effect on $v_2$, which is influenced by $v_3$, and so on. To control the entire chain, the input must be able to "push" the last vector in the chain, $v_m$. If the input vector happens to be structured in a way that it has no component along the direction needed to influence this last mode, the entire chain of states becomes uncontrollable. It's like trying to move a long train of coupled cars by pushing on a middle car that has a broken link to the cars behind it—the back of the train will never move. The system is fundamentally uncontrollable, a fact revealed entirely by the interplay of the input and the generalized eigenvector structure.
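The Kalman rank condition makes this concrete: the pair $(A, B)$ is controllable exactly when the controllability matrix $[B, AB, \dots, A^{n-1}B]$ has full rank. A minimal sketch for a single $2 \times 2$ Jordan block (our own toy values) shows that an input entering at the end of the chain controls everything, while an input entering at the eigenvector end controls nothing beyond it:

```python
# Controllability of x' = Ax + Bu for a 2x2 Jordan block, via the
# Kalman rank test on [B, AB]. B = (0, 1) pushes the end of the chain
# and is controllable; B = (1, 0) only touches the eigenvector and is not.

lam = 2.0
A = [[lam, 1.0], [0.0, lam]]

def controllable(B):
    AB = [A[0][0] * B[0] + A[0][1] * B[1],
          A[1][0] * B[0] + A[1][1] * B[1]]
    detC = B[0] * AB[1] - B[1] * AB[0]  # det of the 2x2 matrix [B, AB]
    return abs(detC) > 1e-12            # full rank <=> nonzero det

print(controllable([0.0, 1.0]))  # True: input reaches the last chain vector
print(controllable([1.0, 0.0]))  # False: the second mode is unreachable
```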

The dual concept to controllability is observability. Can we determine the complete internal state of a system just by observing its outputs? This is the problem faced by a doctor trying to diagnose an illness from symptoms, or an engineer monitoring a complex machine through a few sensors. To solve this, we often build an "observer," which is a software model of the system that uses the real system's output to correct its own state estimate. The dynamics of the estimation error are governed by a matrix we can design, $A_e = A - LC$. A remarkable result from control theory is that for a single-output system, if we want to make the observer's error decay as quickly as possible by assigning the same eigenvalue multiple times, we are forced to create a Jordan block structure in our observer's error dynamics. The limitation of our "view" (a single output) imposes a defective structure on the very tool we build to see.

Of course, these theoretical ideas must be connected to practice. The task of actually computing these chains of generalized eigenvectors for large, complex systems is a significant challenge in computational engineering. Because the defining equation $(A - \lambda I)v_k = v_{k-1}$ involves a singular matrix $(A - \lambda I)$, naive solution methods fail. Sophisticated numerical algorithms, such as deflation and bordering techniques, are required to navigate this singularity and extract the Jordan chains one vector at a time.

The Fabric of Reality: Quantum Physics

The final stop on our tour takes us from the engineered world to the very fabric of physical reality. In quantum mechanics, the state of a particle is described by a wavefunction, which is a vector in an infinite-dimensional Hilbert space $\mathcal{H}$, typically the space $L^2(\mathbb{R}^3)$ of square-integrable functions. Physical observables like energy or momentum are represented by self-adjoint operators acting on this space. The spectral theorem for these operators is the mathematical backbone of quantum theory.

However, a serious problem arises. Some of the most important states in quantum mechanics, such as a particle with a definite momentum (described by a plane wave like $e^{i\mathbf{k}\cdot\mathbf{r}}$) or a particle scattering off a potential, are not square-integrable. Their wavefunctions extend to infinity and do not belong to the Hilbert space $\mathcal{H}$. So how can they be eigenvectors of the momentum or energy operator? Are they unphysical?

The beautiful and profound answer is that these are generalized eigenvectors. They do not live in the Hilbert space $\mathcal{H}$ itself, but in a larger space that contains it. This concept is formalized by the Rigged Hilbert Space (RHS), also known as a Gel'fand triple: $\Phi \subset \mathcal{H} \subset \Phi'$. Here, $\Phi$ is a smaller, "nicer" space of very well-behaved (e.g., rapidly decreasing) functions, which is dense in $\mathcal{H}$. The space $\Phi'$ is the "dual" of $\Phi$, consisting of continuous linear functionals—mathematical objects that take a function from $\Phi$ and return a number.

Within this framework, a scattering state like a plane wave is realized not as a vector in $\mathcal{H}$, but as a functional in $\Phi'$. It is a generalized eigenvector of the Hamiltonian operator. This elegant construction provides a rigorous mathematical home for the continuous spectrum. It allows physicists to justify the common practice of writing completeness relations that mix sums over discrete, normalizable bound states (which are true eigenvectors in $\mathcal{H}$) with integrals over a continuum of non-normalizable scattering states (which are generalized eigenvectors in $\Phi'$).

In a way, the Hilbert space $\mathcal{H}$ is like a library that can only hold books of finite length. The bound states are these finite books, sitting neatly on the shelves. The scattering states are like infinite scrolls that cannot possibly fit. The rigged Hilbert space formalism provides the card catalog, $\Phi'$, which contains a precise description and location for every scroll, even the infinite ones, allowing us to work with them in a perfectly well-defined way. This shows that the concept of generalized eigenvectors is not merely a tool for solving engineering problems, but a deep and essential component in our fundamental description of the universe.