Sum of Eigenvalues and the Trace: A Fundamental Invariant

SciencePedia
Key Takeaways
  • The sum of the eigenvalues of any square matrix is always equal to its trace, which is the sum of its main diagonal elements.
  • This equality is an invariant property, holding true regardless of whether the matrix is diagonalizable or has repeated or complex eigenvalues.
  • The relationship between trace and eigenvalues is a powerful tool for consistency checks and gaining quick insights in diverse fields like quantum mechanics, network analysis, and numerical algorithms.
  • Understanding this principle reveals a fundamental link between a matrix's simple arithmetic properties and its deep geometric and dynamic characteristics.

Introduction

In the world of mathematics, a matrix is more than just a grid of numbers; it's a powerful operator that transforms vectors, describing everything from simple rotations to the complex evolution of quantum systems. To understand a matrix's core behavior, we seek its eigenvalues—special values that represent the fundamental scaling factors of the transformation. However, finding these eigenvalues can be a complex algebraic task. What if there was a shortcut, a clue to the matrix's inner workings hidden in its most obvious feature? This article explores a remarkable and elegant truth: the sum of a matrix's eigenvalues is always equal to its trace, the simple sum of its diagonal elements. This principle acts as a bridge between a matrix's surface-level appearance and its profound geometric soul. In the sections that follow, we will first unravel the "Principles and Mechanisms" behind this theorem, exploring why it holds true for all square matrices. Then, we will journey through its "Applications and Interdisciplinary Connections" to see how this simple identity becomes an indispensable tool across physics, chemistry, computer science, and beyond.

Principles and Mechanisms

Imagine you are given a complicated machine, a black box that takes any vector in space and transforms it, stretching, shrinking, or rotating it into a new vector. This "machine" is what mathematicians call a matrix. To truly understand this machine, you'd want to find its most fundamental operational characteristics. The most important of these are its eigenvalues: special "stretching factors" that describe directions in space that are left unchanged by the transformation, only scaled.

Finding these eigenvalues, however, can be a rather tedious affair. It involves setting up a characteristic polynomial and then embarking on the often-tricky quest of finding its roots. But what if there was a shortcut? What if a deep secret about the machine's inner workings was hiding in plain sight?

An Unexpected Shortcut

Let's look at a matrix. Any square matrix. There's an incredibly simple number you can calculate from it in seconds: the trace, written as $\text{Tr}(A)$. It's just the sum of the numbers sitting on its main diagonal, from top-left to bottom-right. A child could calculate it.

Could this simple number, the trace, possibly have any connection to the profound, hard-won eigenvalues? It seems unlikely. One is about the "skin" of the matrix, its most obvious numbers; the other is about its deep geometric "soul."

Let's be good scientists and experiment. Consider the matrix $A = \begin{pmatrix} 4 & -2 \\ 1 & 1 \end{pmatrix}$. A quick calculation shows its trace is $\text{Tr}(A) = 4 + 1 = 5$. If you go through the work of finding its eigenvalues, you'll discover they are $\lambda_1 = 2$ and $\lambda_2 = 3$. Now, let's sum them: $2 + 3 = 5$. They match!

A coincidence? Let's try a bigger one from another problem, $A = \begin{pmatrix} -2 & 1 & 4 \\ -4 & 3 & 4 \\ -1 & 1 & 3 \end{pmatrix}$. Its trace is easy: $\text{Tr}(A) = -2 + 3 + 3 = 4$. After a bit of algebraic heavy lifting to solve its characteristic equation, we find its eigenvalues are $\lambda_1 = -1$, $\lambda_2 = 2$, and $\lambda_3 = 3$. And their sum? $-1 + 2 + 3 = 4$. It matches again!
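Both experiments take seconds to reproduce numerically. Here is a minimal NumPy sketch (an illustrative check, not part of the derivation):

```python
import numpy as np

# The two example matrices from the text
A2 = np.array([[4.0, -2.0],
               [1.0,  1.0]])
A3 = np.array([[-2.0, 1.0, 4.0],
               [-4.0, 3.0, 4.0],
               [-1.0, 1.0, 3.0]])

for A in (A2, A3):
    eigs = np.linalg.eigvals(A)
    # the sum of the eigenvalues matches the trace (up to rounding)
    print(np.trace(A), eigs.sum().real)
```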

This is no coincidence. It is a fundamental and beautiful truth of linear algebra:

The sum of the eigenvalues of a matrix is always equal to its trace.

This relationship holds regardless of how messy the matrix looks. It's a hidden bridge between the simplest arithmetic you can do on a matrix and its most profound geometric properties.

The Invariant Sum: A Deeper Look

Why should this be true? The secret lies in the very polynomial we use to find the eigenvalues. The characteristic polynomial, $p(\lambda) = \det(A - \lambda I)$, is constructed in such a way that its roots are the eigenvalues.

Let's peek under the hood. For a general $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the characteristic polynomial is:

$$p(\lambda) = \det \begin{pmatrix} a-\lambda & b \\ c & d-\lambda \end{pmatrix} = (a-\lambda)(d-\lambda) - bc = \lambda^2 - (a+d)\lambda + (ad-bc)$$

By a well-known result for polynomials (Viète's formulas), the sum of the roots $\lambda_1 + \lambda_2$ is equal to the negative of the coefficient of the $\lambda^{n-1}$ term (here, the $\lambda^1$ term). That coefficient is $-(a+d)$. So, the sum of the eigenvalues is $-(-(a+d)) = a+d$. And what is $a+d$? It's precisely the trace of the matrix!

This pattern is not a special feature of $2 \times 2$ matrices. For any $n \times n$ matrix, the characteristic polynomial (written in the monic convention $p(\lambda) = \det(\lambda I - A)$, which has exactly the same roots) will always begin like this:

$$p(\lambda) = \lambda^n - \text{Tr}(A)\,\lambda^{n-1} + \dots$$

The sum of the roots of this polynomial—the eigenvalues—will therefore always be $\text{Tr}(A)$. This proof is powerful because it depends only on the definition of the characteristic polynomial, not on whether the matrix is simple or complicated, real or complex, or even whether it's "well-behaved" (diagonalizable).
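The coefficient can be inspected directly: NumPy's `np.poly` returns the characteristic polynomial coefficients of a square matrix, highest degree first (a sketch on an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# coefficients of det(lambda*I - A): [1, c_{n-1}, ..., c_0]
coeffs = np.poly(A)

# the lambda^{n-1} coefficient is minus the trace
print(coeffs[1], -np.trace(A))
```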

There's another, wonderfully intuitive way to see this for a special class of matrices—the ones that are diagonalizable. A matrix is diagonalizable if it can be written as $A = PDP^{-1}$, where $D$ is a diagonal matrix containing the eigenvalues of $A$ on its diagonal, and $P$ is some invertible matrix. This is like saying we found the perfect coordinate system where the transformation $A$ is just a simple scaling.

Now, we use a magical property of the trace: it is "cyclic." This means that for any compatible matrices, $\text{Tr}(XYZ) = \text{Tr}(YZX) = \text{Tr}(ZXY)$. You can cycle the order of the matrices inside the trace without changing the result. Applying this to our diagonalizable matrix:

$$\text{Tr}(A) = \text{Tr}(PDP^{-1}) = \text{Tr}(P^{-1}PD)$$

But $P^{-1}P$ is just the identity matrix $I$. So, we get:

$$\text{Tr}(A) = \text{Tr}(ID) = \text{Tr}(D)$$

And what is the trace of the diagonal matrix $D$? It's just the sum of its diagonal elements, which, by definition, are the eigenvalues of $A$! This elegant argument shows that changing the basis (the $P$ and $P^{-1}$ part) just shuffles the numbers around inside the matrix, but it cannot change the sum of the diagonal elements. The trace is an invariant.
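The invariance is easy to confirm numerically; the sketch below builds $A = PDP^{-1}$ from an assumed diagonal $D$ and a random change of basis $P$:

```python
import numpy as np

rng = np.random.default_rng(1)
D = np.diag([2.0, -1.0, 5.0])        # the eigenvalues sit on the diagonal
P = rng.standard_normal((3, 3))      # a generic random P is invertible
A = P @ D @ np.linalg.inv(P)

# the change of basis scrambles the individual entries of A,
# but the sum of the diagonal is unchanged
print(np.trace(A), np.trace(D))
```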

Complications and Curiosities

What happens when things get more complicated? The beauty of this law is its robustness.

Repeated Eigenvalues: What if an eigenvalue appears more than once? The rule is simple: you must count each eigenvalue according to its algebraic multiplicity—the number of times it appears as a root of the characteristic polynomial. For instance, if a $5 \times 5$ matrix has an eigenvalue of $2$ with algebraic multiplicity $3$ and an eigenvalue of $5$ with multiplicity $2$, its trace isn't $2+5=7$. It's $(2+2+2) + (5+5) = 3 \times 2 + 2 \times 5 = 16$.
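The multiplicity counting is easy to see concretely: a triangular matrix has its eigenvalues on the diagonal, so we can build the $5 \times 5$ example directly (the off-diagonal entries below are arbitrary filler):

```python
import numpy as np

T = np.triu(np.ones((5, 5)))             # arbitrary upper-triangular filler
np.fill_diagonal(T, [2, 2, 2, 5, 5])     # eigenvalue 2 (mult. 3) and 5 (mult. 2)

print(np.trace(T))                        # 3*2 + 2*5 = 16
eigs = sorted(np.linalg.eigvals(T).real)  # [2, 2, 2, 5, 5]
```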

Non-Diagonalizable Matrices: What if a matrix isn't diagonalizable? This happens when a matrix is "defective," lacking enough distinct directions to form a full basis of eigenvectors. Our first proof using the characteristic polynomial didn't care about diagonalizability, so the rule must still hold. Indeed it does. For example, if you are told a $2 \times 2$ matrix is not diagonalizable and its trace is $14$, you immediately know something profound. A non-diagonalizable $2 \times 2$ matrix must have a repeated eigenvalue. Let's call it $\lambda$. Then the sum of eigenvalues is $\lambda + \lambda = 2\lambda$. We know this sum equals the trace, so $2\lambda = 14$, which means the single, repeated eigenvalue must be $7$. The theorem holds perfectly.
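A concrete defective matrix with trace 14 is the Jordan block below (an illustrative choice, not a matrix from the text):

```python
import numpy as np

J = np.array([[7.0, 1.0],
              [0.0, 7.0]])          # Jordan block: repeated eigenvalue 7

eigs = np.linalg.eigvals(J)         # both eigenvalues equal 7
# (J - 7I) has rank 1, so the eigenspace is only one-dimensional:
# geometric multiplicity 1 < algebraic multiplicity 2 -> not diagonalizable
defect_rank = np.linalg.matrix_rank(J - 7 * np.eye(2))
print(np.trace(J), eigs, defect_rank)
```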

Complex Eigenvalues: A matrix with only real numbers can describe a transformation like a rotation. A pure rotation doesn't stretch any vector in real space, so how can it have a real eigenvalue? It doesn't. Its eigenvalues are complex numbers. But nature is elegant. For any real matrix, if a complex number $a + bi$ is an eigenvalue, its complex conjugate $a - bi$ must also be an eigenvalue. They always appear in pairs. When you sum a conjugate pair, the imaginary parts cancel out: $(a+bi) + (a-bi) = 2a$. This guarantees that the trace of a real matrix is always a real number, as it must be. If you're told a real $2 \times 2$ matrix from a circuit model has an eigenvalue of $-0.15 + 2.5i$, you don't need to find the matrix itself to know its trace. You know the other eigenvalue must be $-0.15 - 2.5i$. The trace is their sum: $-0.3$.
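A real $2 \times 2$ matrix realizing exactly this conjugate pair is easy to write down; the scaled-rotation form below (an assumed illustration) has eigenvalues $a \pm bi$:

```python
import numpy as np

a, b = -0.15, 2.5
A = np.array([[a, -b],
              [b,  a]])             # real matrix with eigenvalues a +/- b*i

eigs = np.linalg.eigvals(A)
print(np.trace(A))                  # imaginary parts cancel: 2a = -0.3
```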

A Detective's Tool

This theorem is far more than a mathematical party trick; it's a powerful detective's tool. It provides a fundamental constraint, a clue you get for free just by looking at the matrix.

Suppose a $3 \times 3$ matrix has a trace of $6$. You've done some hard work and found two of its eigenvalues are $1$ and $2$. Do you need to go back to the drawing board to find the third? Absolutely not. The "conservation of trace" tells you that $1 + 2 + \lambda_3 = 6$. A trivial bit of arithmetic reveals $\lambda_3 = 3$.

The connections can be even deeper, linking disparate concepts in linear algebra. Imagine a $3 \times 3$ diagonalizable matrix that has a rank of 1. You're told its only non-zero eigenvalue is $5$. What is its trace? This seems like too little information, but it's not.

  • Rank tells you the dimension of the output space. A rank of 1 means the matrix squashes all of 3D space onto a single line.
  • This implies there must be a whole plane of vectors that get mapped to the origin, $\mathbf{0}$. If a vector $\mathbf{v}$ gets mapped to the origin, it means $A\mathbf{v} = \mathbf{0}$. We can write this as $A\mathbf{v} = 0 \cdot \mathbf{v}$.
  • This is the very definition of an eigenvector with an eigenvalue of $0$! The dimension of this plane of squashed vectors (the nullity) is 2, which means the eigenvalue $0$ has a geometric multiplicity of 2.
  • Since the matrix is diagonalizable, algebraic multiplicity equals geometric multiplicity. So, $0$ is an eigenvalue counted twice.
  • Our full set of three eigenvalues is therefore $\{5, 0, 0\}$. The trace, their sum, is simply $5$. By chasing a chain of logic from rank to nullity to eigenvalues, the trace was revealed.
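Such a matrix is easy to construct as an outer product, since $A = \mathbf{u}\mathbf{v}^T$ has rank 1 and its only non-zero eigenvalue is $\mathbf{v}^T\mathbf{u}$ (the vectors below are arbitrary illustrations chosen so that $\mathbf{v}^T\mathbf{u} = 5$):

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])
v = np.array([1.0, 2.0, 3.0])       # chosen so that v @ u = 5
A = np.outer(u, v)                  # rank-1, eigenvalues {5, 0, 0}

print(np.linalg.matrix_rank(A), np.trace(A))
eigs = sorted(np.linalg.eigvals(A).real)
```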

The trace, that simple sum of diagonal numbers, is not so simple after all. It carries within it a deep truth about the matrix's behavior. It is an invariant—a quantity that remains fixed even when we change our point of view (our coordinate system). In physics and all of science, the search for such invariants is a search for the fundamental laws of nature. The relationship between trace and the sum of eigenvalues is a beautiful, self-contained example of such a profound principle, accessible to anyone who dares to look.

Applications and Interdisciplinary Connections

We have uncovered a remarkable fact, a sort of hidden bridge between the immediately obvious and the deeply profound. On one side, we have the trace of a matrix—a quantity so simple you can calculate it in seconds, just by summing the numbers on the diagonal. On the other side, we have the eigenvalues—the secret stretching factors of the transformation, the characteristic "notes" a system can play, which can be devilishly hard to find. The statement that these two quantities are equal, $\text{Tr}(A) = \sum_i \lambda_i$, is one of those wonderfully surprising truths in mathematics. It feels like a magic trick. But it is far more than a trick; it is a fundamental tool that allows us to peer into the heart of complex systems across science and engineering. Let us now take a journey and see where this simple idea leads us.

The Algebra of Transformations

Before we venture into the physical world, let's play with the idea in its native home: the world of abstract transformations. If a matrix $A$ represents some action, what can the trace tell us about related actions, like applying the action multiple times, or undoing it, or letting it evolve continuously?

Suppose we apply a transformation over and over again. What is the character of $A^2$, or $A^3$? The eigenvalues of $A^k$ are simply $\lambda_i^k$, the original eigenvalues raised to the same power. This means the trace of $A^k$ is just the sum of the powered eigenvalues: $\text{Tr}(A^k) = \sum_i \lambda_i^k$. So, without knowing the full matrix $A^k$, we can still find the sum of its diagonal elements just by knowing the original eigenvalues. This provides a powerful shortcut in understanding the cumulative effect of a repeated process.
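A quick numerical sketch of the power identity (the matrix and the exponent are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)

k = 3
lhs = np.trace(np.linalg.matrix_power(A, k))   # Tr(A^k) from the matrix itself
rhs = (lam ** k).sum()                          # sum of the powered eigenvalues
print(lhs, rhs.real)
```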

The same logic applies to other functions of a matrix. What about the inverse transformation, $A^{-1}$? Its eigenvalues are $1/\lambda_i$. Therefore, the trace of the inverse matrix is simply the sum of the reciprocals of the original eigenvalues, $\text{Tr}(A^{-1}) = \sum_i \frac{1}{\lambda_i}$. This gives us a quick measure of the "total retracting power" of the inverse transformation, again without the fuss of actually computing the inverse matrix itself.

Perhaps most beautifully, this extends to the matrix exponential, $e^A$. This object is not just a mathematical curiosity; it is the mathematical engine that drives continuous evolution in countless physical systems, from the decay of radioactive nuclei to the vibrations in a crystal lattice. The eigenvalues of $e^A$ are $e^{\lambda_i}$. Consequently, the trace of the matrix exponential is $\text{Tr}(e^A) = \sum_i e^{\lambda_i}$. This connects the trace, a static property of the matrix, to the collective behavior of a dynamic system evolving through time.
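Both identities can be checked the same way; the sketch below uses a short truncated Taylor series for $e^A$ so it needs only NumPy (ample accuracy for a small matrix, though not a production-grade matrix exponential):

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential via truncated Taylor series (fine for small ||M||)."""
    out = np.eye(len(M))
    term = np.eye(len(M))
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)

tr_inv = np.trace(np.linalg.inv(A))   # should equal sum(1 / lambda_i)
tr_exp = np.trace(expm_taylor(A))     # should equal sum(exp(lambda_i))
print(tr_inv, (1 / lam).sum().real)
print(tr_exp, np.exp(lam).sum().real)
```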

The Symphony of Physics and Chemistry

This connection to dynamics is where our simple rule truly begins to sing. Consider a system of coupled oscillators, perhaps masses on springs, or an electrical circuit. Its behavior over time can be described by a system of differential equations, $\mathbf{x}'(t) = A\mathbf{x}(t)$. The solutions to this equation often take the form of "modes," where the entire system oscillates or decays together at a specific rate. These rates are, in fact, the eigenvalues of the matrix $A$. If we observe the system and identify its fundamental modes of behavior, we have effectively measured its eigenvalues. By simply summing these rates, we can determine the trace of the underlying matrix $A$ that governs the entire complex interaction, giving us a crucial piece of information about the system's overall stability. The trace, in this context, relates to the divergence of the system's state-space flow—whether volumes in this abstract space are, on average, expanding or contracting.

The idea finds one of its most profound expressions in the quantum world. In quantum mechanics, physical observables like energy are represented by Hermitian matrices (or operators). The eigenvalues of the Hamiltonian matrix, $\mathbf{H}$, are the possible energy levels that the system—be it an atom or a molecule—is allowed to occupy. They are the fundamental notes in the quantum symphony. The trace of the Hamiltonian, $\text{Tr}(\mathbf{H})$, is therefore the sum of all possible energy levels. In fields like quantum chemistry, this provides an immediate check on theoretical models. For instance, in the Hückel model of a molecule, the Hamiltonian matrix is constructed from simple rules based on chemical bonds. Calculating its trace is trivial: it's just the sum of the diagonal elements, which are all equal to a parameter $\alpha$. This simple sum must equal the sum of the calculated orbital energies (the eigenvalues), providing a robust internal consistency check on the theory itself.
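For instance, the Hückel matrix of benzene (six carbons in a ring) can be sketched as below; the numerical values of $\alpha$ and $\beta$ are assumed placeholders, since only the identity $\sum_i E_i = 6\alpha$ matters here:

```python
import numpy as np

alpha, beta = -6.6, -2.7           # placeholder Hückel parameters (assumed)
n = 6                               # six carbon atoms in the benzene ring

H = alpha * np.eye(n)
for i in range(n):                  # each ring bond contributes beta
    j = (i + 1) % n
    H[i, j] = H[j, i] = beta

energies = np.linalg.eigvalsh(H)    # orbital energies
print(energies.sum(), n * alpha)    # consistency check: both equal Tr(H)
```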

Furthermore, for a quantum operator represented by a normal matrix $A$, the quantity $\text{Tr}(AA^*)$ has a direct physical meaning. The eigenvalues of the matrix $AA^*$ are the squared magnitudes of the eigenvalues of $A$, that is, $|\lambda_i|^2$. The sum of these, $\text{Tr}(AA^*) = \sum_i |\lambda_i|^2$, often represents a total probability or a total intensity, summed over all possible states or modes of the system. Once again, a simple sum over a diagonal gives a physically meaningful total quantity.
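A real skew-symmetric matrix is a convenient normal matrix to test this on, since its eigenvalues are purely imaginary and only the squared magnitudes survive (the entries below are arbitrary):

```python
import numpy as np

B = np.array([[ 0.0,  2.0, -1.0],
              [-2.0,  0.0,  3.0],
              [ 1.0, -3.0,  0.0]])   # skew-symmetric, hence normal

lam = np.linalg.eigvals(B)
lhs = np.trace(B @ B.conj().T)       # Tr(B B*) = sum of squared entry magnitudes
rhs = (np.abs(lam) ** 2).sum()
print(lhs, rhs)
```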

Weaving Networks and Building Algorithms

The reach of our eigenvalue-trace relationship extends beyond the continuous world of physics into the discrete realms of networks and computation. Imagine a network—of computers, friends, or cities. We can represent it with an adjacency matrix, where an entry $A_{ij}$ tells us if node $i$ is connected to node $j$. The trace of this matrix, $\text{Tr}(A) = \sum_i A_{ii}$, has a wonderfully simple interpretation: it is the total number of self-loops in the network, the number of nodes that are connected to themselves. And, of course, this must be equal to the sum of the eigenvalues of the adjacency matrix. This is perhaps the most direct link imaginable: a visible feature of the network (self-loops) is directly encoded as the trace, which in turn is tied to the network's entire spectral personality.
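Here is a toy graph illustrating the link (an assumed 4-node example with self-loops at nodes 0 and 2):

```python
import numpy as np

# adjacency matrix: nodes 0 and 2 carry self-loops (1s on the diagonal)
A = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 0]], dtype=float)

lam = np.linalg.eigvals(A)
print(np.trace(A), lam.sum().real)   # both count the self-loops: 2
```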

This property is not merely descriptive; it is a workhorse in the field of numerical linear algebra, where we build the algorithms that actually find those elusive eigenvalues. In a technique called "deflation," once we find one eigenvalue $\lambda_1$ and its corresponding eigenvectors, we can construct a new, "deflated" matrix that contains all the remaining eigenvalues. The construction cleverly removes $\lambda_1$ from the spectrum. How do we know it worked? We can check the trace! The trace of the new matrix must be precisely the trace of the old matrix minus the eigenvalue we just removed: $\text{Tr}(A_1) = \text{Tr}(A) - \lambda_1$. This theoretical identity becomes a practical step in an algorithm, a quick and elegant sanity check that guides the computational process.
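One common variant, Hotelling deflation for a symmetric matrix, subtracts $\lambda_1 \mathbf{v}_1\mathbf{v}_1^T$, and the trace check falls out immediately (the symmetric matrix below is an arbitrary example; other deflation schemes exist):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, V = np.linalg.eigh(A)
l1, v1 = lam[-1], V[:, -1]             # largest eigenvalue, unit eigenvector

A1 = A - l1 * np.outer(v1, v1)         # Hotelling deflation: l1 replaced by 0
print(np.trace(A1), np.trace(A) - l1)  # the sanity check from the text
```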

A Glimpse Towards the Horizon

Finally, the relationship between trace and eigenvalues serves as a foundation for some of the most powerful and advanced results in matrix theory. Consider a profoundly difficult question: if you have two systems, described by Hermitian matrices $A$ and $B$, and you know their individual spectra (their eigenvalues), what can you say about the spectrum of the combined system, $A+B$? The eigenvalues of $A+B$ are not simply the sums of the eigenvalues of $A$ and $B$. The interaction is far more complex.

However, deep theorems related to a concept called "majorization" provide a stunning answer. They tell us that while we may not know the exact eigenvalues of $A+B$, we can place a firm upper bound on quantities like the trace of its exponential, $\text{Tr}(e^{A+B})$. This maximum possible value is determined by combining the eigenvalues of $A$ and $B$ in a specific, ordered way. This allows us, for example, to calculate the maximum possible "response" of a combined system without ever needing to know the messy details of its final configuration. It is a predictive tool of immense power, used in fields from optimization theory to quantum information.

From a simple shortcut in matrix algebra to a stability criterion in physics, a consistency check in chemistry, a structural invariant in network theory, and a predictive bound in advanced mathematics, the equality of trace and the sum of eigenvalues is a golden thread. It ties together the seen and the unseen, the simple and the complex, revealing the underlying unity and beauty that governs the mathematical description of our world.