
Lidskii-Wielandt Theorem

Key Takeaways
  • The Lidskii-Wielandt theorem states that the eigenvalues of a sum of Hermitian matrices are majorized by the sum of their individual eigenvalues.
  • Majorization formalizes the idea that the process of matrix addition results in an eigenvalue spectrum that is less "spread out" or more uniform.
  • This theorem completely defines the entire space of possible eigenvalues for a matrix sum, known as the Horn polytope, not just its outer limits.
  • The principle provides powerful tools for predicting outcomes in quantum mechanics, perturbation theory, and numerical stability analysis.

Introduction

When we combine two systems, how do the properties of the new, combined system relate to its original components? In fields from quantum mechanics to structural engineering, systems are often described by matrices, and their fundamental properties by eigenvalues. A central problem is therefore to understand the eigenvalues of a sum of two matrices, $A+B$, given the eigenvalues of $A$ and $B$. A simple addition of the eigenvalues is rarely correct, as the complex "interference" between matrices changes the outcome. While early results like Weyl's inequalities provided crucial boundaries, they fail to capture the full picture, leaving a mysterious gap between what is possible and what is actually observed. This article closes that gap by exploring a profound and elegant master rule. In the first chapter, "Principles and Mechanisms," we will introduce the concept of majorization and the powerful Lidskii-Wielandt theorem that uses it. Following this, the "Applications and Interdisciplinary Connections" chapter will demonstrate how this abstract mathematical principle provides a powerful toolkit for solving concrete problems in physics, engineering, and beyond.

Principles and Mechanisms

The Summing-Up Problem: More Than Just Adding Numbers

Let's begin with a question that seems, on the surface, almost childishly simple. Suppose you have two objects, and you know their fundamental properties. You then combine these two objects. What can you say about the properties of the new, combined object? If you mix a bucket of blue paint and a bucket of yellow paint, you get green paint. The outcome is predictable. But what if the "objects" are more abstract, like the mathematical operators that describe physical systems?

In physics and engineering, we often represent systems with matrices. For a special, important class of matrices called Hermitian matrices, the most fundamental properties are their eigenvalues. For a quantum system, these eigenvalues might represent the possible energy levels an electron can occupy. For a vibrating structure, they might be the natural frequencies of oscillation. So, our simple question becomes: if we have two Hermitian matrices, $A$ and $B$, and we know their eigenvalues, what are the possible eigenvalues for their sum, $C = A+B$?

You might guess that if the eigenvalues of $A$ are, say, $\{10, 1\}$ and the eigenvalues of $B$ are $\{5, 2\}$, then the eigenvalues of $A+B$ might be $\{10+5, 1+2\} = \{15, 3\}$. This is what happens if the matrices $A$ and $B$ are simple number lines that you can just add point-by-point (or, more technically, if they are simultaneously diagonalizable). But matrices are more complex than that. They have "directions" associated with their eigenvalues, called eigenvectors. If the directions of $A$ don't align with the directions of $B$, adding them involves a kind of "interference," much like when two sets of water waves cross each other. The resulting pattern of peaks and troughs is not just the sum of the individual wave heights at each point.

The great mathematician Hermann Weyl was one of the first to provide a rigorous answer. He established famous inequalities that act like a fence, setting absolute limits on the eigenvalues of the sum. For instance, the largest eigenvalue of $C = A+B$, denoted $\lambda_1(C)$, cannot be larger than the sum of the largest eigenvalues of $A$ and $B$: $\lambda_1(C) \le \lambda_1(A) + \lambda_1(B)$. Similarly, his rules give bounds for all the other eigenvalues. For example, in a $4 \times 4$ case, the smallest eigenvalue, $\lambda_4(C)$, is bounded from above by combinations like $\lambda_4(A) + \lambda_1(B)$. This particular bound represents a sort of "worst-case" scenario, pairing the smallest eigenvalue from one matrix with the largest from the other.
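Weyl's fence can be rattled numerically. The following sketch (assuming NumPy is available; the helper name random_hermitian is our own) draws random Hermitian matrices and checks the first inequality:

```python
import numpy as np

def random_hermitian(n, rng):
    """A random n-by-n Hermitian matrix (helper name is ours, for this demo)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

def top(X):
    """Largest eigenvalue; eigvalsh returns eigenvalues in ascending order."""
    return np.linalg.eigvalsh(X)[-1]

rng = np.random.default_rng(0)
for _ in range(100):
    A = random_hermitian(4, rng)
    B = random_hermitian(4, rng)
    # Weyl: lambda_1(A+B) <= lambda_1(A) + lambda_1(B)
    assert top(A + B) <= top(A) + top(B) + 1e-10
print("Weyl's inequality held in all 100 trials")
```

No matter how the eigenvectors of the two random matrices are oriented, the largest eigenvalue of the sum never escapes the fence.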

Beyond the Fence: A More Subtle Law

Weyl's inequalities are beautiful and correct. They guarantee that the eigenvalues of the sum can't wander off to infinity; they are contained. But are they the full story? Let's imagine a hypothetical scenario to test this. Consider two $4 \times 4$ matrices, $A$ and $B$, whose eigenvalues are specifically chosen descending sequences. Let the eigenvalues of $A$ be $\{a, a-d, a-4d, a-6d\}$ and those of $B$ be $\{b, b-d, b-3d, b-5d\}$ for some positive number $d$.

Using Weyl's general formula, we can calculate the absolute upper bound for the smallest eigenvalue of the sum, $\lambda_4(A+B)$. It turns out to be $a+b-6d$. This bound is "sharp" in the sense that one can always cook up some matrices $A$ and $B$ with these exact eigenvalues that hit this limit.

But now, let's consider a specific pair of matrices. Imagine they are very simple: diagonal matrices, which means their eigenvalues are just the numbers on their main diagonals. However, let's arrange them so their internal "directions" are misaligned in a particular way. Let $A = \mathrm{diag}(a, a-d, a-4d, a-6d)$ and $B = \mathrm{diag}(b-3d, b, b-5d, b-d)$. When we add them, we just add the corresponding diagonal entries. The smallest resulting eigenvalue is easily found to be $a+b-9d$.

Look at that! The actual result, $a+b-9d$, is a full $3d$ less than the general bound of $a+b-6d$ that Weyl's inequality promised. There is a gap. This tells us something crucial. While Weyl's fence is the absolute outer boundary, the actual possibilities often live in a much smaller, more refined territory inside that fence. This puzzle hints that a deeper, more precise principle must be at work, a master rule that can account for this gap.
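The gap is easy to see with concrete numbers. Taking, say, $a = 10$, $b = 10$, $d = 1$ in the construction above (our own illustrative values), a few lines of pure Python confirm it:

```python
a, b, d = 10, 10, 1   # illustrative values; any positive d shows the same gap

A_diag = [a, a - d, a - 4*d, a - 6*d]   # eigenvalues of A, sorted descending
B_diag = [b - 3*d, b, b - 5*d, b - d]   # same multiset as B's eigenvalues, but permuted

# Adding diagonal matrices just adds corresponding diagonal entries
sum_diag = [x + y for x, y in zip(A_diag, B_diag)]

weyl_bound = a + b - 6*d     # Weyl's upper bound for the smallest eigenvalue
actual_min = min(sum_diag)   # the smallest eigenvalue actually obtained

print(actual_min, weyl_bound)   # 11 vs 14: a gap of 3d
assert actual_min == a + b - 9*d
assert weyl_bound - actual_min == 3*d
```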

The Language of Dominance: An Introduction to Majorization

To understand this master rule, we need a new language, a wonderfully intuitive concept called ​​majorization​​. In essence, majorization is a precise way of saying that one list of numbers is "more spread out" or "less uniform" than another, while both lists add up to the same total.

Imagine you have a fixed amount of money, say \$100, to distribute among four people. One possible distribution, let's call it vector $x$, is $\{97, 1, 1, 1\}$. This is a very unequal distribution. Another distribution, vector $y$, could be $\{40, 30, 20, 10\}$. A third, vector $z$, could be $\{25, 25, 25, 25\}$, perfect equality. We would intuitively say that $x$ is more "major" or "dominant" in its inequality than $y$, and $y$ is more dominant than $z$.

Majorization formalizes this. Let's take two vectors of numbers, $x$ and $y$, both of the same size $n$. First, we sort the numbers in each vector from largest to smallest, which we denote by $x^\downarrow$ and $y^\downarrow$. We say that $x$ majorizes $y$, written $x \succ y$, if the following two conditions hold:

  1. The sum of the $k$ largest elements of $x$ is greater than or equal to the sum of the $k$ largest elements of $y$, for every $k$ from $1$ up to $n-1$: $\sum_{i=1}^k x_i^\downarrow \ge \sum_{i=1}^k y_i^\downarrow$.

  2. The sum of all elements is equal: $\sum_{i=1}^n x_i^\downarrow = \sum_{i=1}^n y_i^\downarrow$.

The first rule captures the "more spread out" idea: the top earners in the more unequal distribution always have at least as much as the top earners in the more equal one. The second rule is a conservation law: the total amount of "stuff" (energy, money, or the sum of eigenvalues) is the same.
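These two rules translate directly into code. A minimal pure-Python checker (the function name majorizes is our own), applied to the money-distribution example:

```python
def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y (written x > y in the succ sense)."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if len(xs) != len(ys):
        return False
    partial_x = partial_y = 0.0
    for k in range(len(xs) - 1):
        partial_x += xs[k]
        partial_y += ys[k]
        if partial_x < partial_y - tol:   # a partial-sum condition fails
            return False
    return abs(sum(xs) - sum(ys)) <= tol  # the conservation law

# The money-distribution example: x is more unequal than y, and y more than z
x, y, z = [97, 1, 1, 1], [40, 30, 20, 10], [25, 25, 25, 25]
assert majorizes(x, y) and majorizes(y, z) and majorizes(x, z)
assert not majorizes(z, y)
```

Note that majorization is only a partial order: for many pairs of vectors, neither majorizes the other.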

The Lidskii-Wielandt Theorem: The Master Rule

Armed with the language of majorization, we can now state the master rule, a profound result known as the Lidskii-Wielandt theorem (part of a collection of results by Alfred Horn, Viktor Lidskii, Helmut Wielandt, and others). For any two $n \times n$ Hermitian matrices $A$ and $B$, it states:

$\lambda(A) + \lambda(B) \succ \lambda(A+B)$

In words: the list of eigenvalues of the sum, $\lambda(A+B)$, is majorized by the list formed by simply adding the corresponding sorted eigenvalues of $A$ and $B$.

This is a statement of remarkable power and beauty. It tells us that the process of adding matrices, with all its complicated "interference" of eigenvectors, has a universal statistical effect: it tends to average things out. The resulting spectrum of eigenvalues is always less spread out (or at most equally spread out) than the spectrum you'd get by naively adding the eigenvalues. The act of matrix addition pulls the extreme eigenvalues inward, smoothing out the distribution. All of Weyl's inequalities are now seen for what they are: simple consequences derived from the $k=1$ and other partial-sum conditions of this single, elegant majorization relation.
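The theorem invites a numerical sanity check. The sketch below (assuming NumPy is available; the helper names are ours) verifies the majorization relation for a few hundred random symmetric matrices:

```python
import numpy as np

def majorizes(x, y, tol=1e-9):
    """True if the sorted partial sums of x dominate those of y and totals match."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    partials_ok = np.all(np.cumsum(xs)[:-1] >= np.cumsum(ys)[:-1] - tol)
    return bool(partials_ok) and abs(xs.sum() - ys.sum()) <= tol

def eig_desc(X):
    """Eigenvalues of a symmetric matrix, sorted descending."""
    return np.sort(np.linalg.eigvalsh(X))[::-1]

rng = np.random.default_rng(42)
for _ in range(200):
    M = rng.standard_normal((5, 5)); A = (M + M.T) / 2
    M = rng.standard_normal((5, 5)); B = (M + M.T) / 2
    # Lidskii-Wielandt: lambda(A) + lambda(B) majorizes lambda(A + B)
    assert majorizes(eig_desc(A) + eig_desc(B), eig_desc(A + B))
print("majorization held in all 200 trials")
```

The conservation law holds automatically here, since the trace of $A+B$ is the trace of $A$ plus the trace of $B$.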

Exploring the "Space of Possibilities"

The true power of this theorem isn't just in setting a new, tighter bound. It defines the entire space of possibilities. To see this, let's consider a charming problem. Suppose we have two $3 \times 3$ Hermitian matrices, $A$ and $B$, which happen to have the exact same set of eigenvalues: $\{\alpha, \beta, \gamma\}$, with $\alpha > \beta > \gamma$. What are the possible values for the middle eigenvalue, $\lambda_2(A+B)$?

A direct application of the majorization inequalities reveals that $\lambda_2(A+B)$ is not fixed to a single value. Instead, it can be any value within a specific range: $[\beta+\gamma, \alpha+\beta]$. The size of this range is $(\alpha+\beta) - (\beta+\gamma) = \alpha - \gamma$, the spread of the original eigenvalues. The specific outcome depends entirely on the relative geometric orientation of the eigenvectors of $A$ and $B$, a factor that is completely invisible if you only know their eigenvalues.
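This range is easy to probe experimentally. A sketch (assuming NumPy; the spectrum values are our own illustrative choices) rotates $B$'s eigenvectors at random and watches where $\lambda_2(A+B)$ lands:

```python
import numpy as np

alpha, beta, gamma = 5.0, 2.0, -1.0   # illustrative spectrum with alpha > beta > gamma
D = np.diag([alpha, beta, gamma])
lo, hi = beta + gamma, alpha + beta   # predicted range for lambda_2(A + B)

rng = np.random.default_rng(7)
for _ in range(500):
    # a random orthogonal matrix (via QR factorization) rotates B's eigenvectors
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    A, B = D, Q @ D @ Q.T             # same eigenvalues, different "directions"
    lam2 = np.sort(np.linalg.eigvalsh(A + B))[::-1][1]
    assert lo - 1e-9 <= lam2 <= hi + 1e-9
print("lambda_2 stayed inside [beta+gamma, alpha+beta] in every trial")
```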

What is even more astonishing is that this works both ways. The Horn-Lidskii theorem confirms that any vector $\lambda$ that is majorized by the eigenvalue sums $\lambda(A) + \lambda(B)$ is an achievable spectrum. In other words, you can always find some matrix $B$ (by rotating its eigenvectors relative to $A$'s) that will produce that exact spectrum $\lambda$ for the sum $A+B$. This means majorization doesn't just provide a boundary; it provides a complete and total characterization of the solution space.

This allows us to transform bewildering matrix problems into solvable optimization problems. For instance, if we have a $4 \times 4$ Hermitian matrix built from blocks, with the diagonal blocks having known spectra, what is the maximum possible product of its two smallest eigenvalues, $\lambda_3 \lambda_4$? Instead of fiddling with matrices, we can simply find the vector $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ that maximizes the product $\lambda_3 \lambda_4$ subject to the majorization constraints. This converts the problem into a calculus exercise, leading to the perhaps surprising answer that the maximum occurs when the eigenvalues "pinch" together, with $\lambda_1 = \lambda_2$ and $\lambda_3 = \lambda_4$, yielding a maximum product of 36 in a specific case.

A Web of Connections: Perturbations and Interactions

This might seem like a beautiful mathematical curiosity, but these ideas form the bedrock of how we analyze complex systems.

One of the most vital applications is in perturbation theory. In quantum mechanics, we often start with a simple system we can solve exactly (like a hydrogen atom), represented by a matrix $A_0$. Then we introduce a small, complex interaction (like an external magnetic field), represented by a matrix $E$. The new, real-world system is described by $A_0 + E$. We desperately want to know how the energy levels change. The Lidskii-Wielandt theorem gives us a direct answer. The maximum possible increase in the sum of the first $k$ energy levels is precisely the sum of the $k$ largest eigenvalues of the perturbation matrix $E$ itself. This tells us exactly the "worst-case" impact of the perturbation.
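This worst-case statement is a Ky Fan-type inequality, and it can be checked directly. A sketch assuming NumPy, with all names and magnitudes chosen for illustration:

```python
import numpy as np

def top_k_sum(X, k):
    """Sum of the k largest eigenvalues of a symmetric matrix X."""
    return np.sort(np.linalg.eigvalsh(X))[::-1][:k].sum()

rng = np.random.default_rng(3)
n = 6
M = rng.standard_normal((n, n)); A0 = (M + M.T) / 2        # the exactly solvable system
M = rng.standard_normal((n, n)); E = 0.1 * (M + M.T) / 2   # a small perturbation

for k in range(1, n + 1):
    shift = top_k_sum(A0 + E, k) - top_k_sum(A0, k)
    # the rise of the top-k level sum never exceeds the top-k eigenvalue sum of E
    assert shift <= top_k_sum(E, k) + 1e-10
print("the perturbation bound held for every k")
```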

The theorem also illuminates the behavior of coupled systems. Imagine a system composed of two parts, $A$ and $C$, that initially don't interact. Its matrix representation is block-diagonal, $H_0 = \begin{pmatrix} A & 0 \\ 0 & C \end{pmatrix}$, and its eigenvalues are just the eigenvalues of $A$ and $C$ pooled together. Now, we introduce an interaction between them, described by an off-diagonal block $B$, so the full system is $H = \begin{pmatrix} A & B \\ B^* & C \end{pmatrix}$. How do the eigenvalues of $H$ relate to those of $A$ and $C$? The theorem again provides the answer, showing how the "interaction strength," captured by the singular values of the coupling matrix $B$, constrains the final eigenvalues. The relationships are so tight that knowing just one of the final system's eigenvalues can allow you to place sharp bounds on the others, revealing a delicate, interwoven web of constraints.

This unifying principle even extends beyond Hermitian matrices and their eigenvalues. A similar majorization relationship governs the singular values of general rectangular matrices, which measure a matrix's "magnifying power" in different directions. The familiar triangle inequality for matrix norms, $\|A+B\| \le \|A\| + \|B\|$, is nothing more than the simplest ($k=1$) case of a weak majorization inequality for singular values.
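Both claims, the weak majorization of singular values and its $k=1$ triangle-inequality special case, can be verified on random rectangular matrices (a sketch assuming NumPy):

```python
import numpy as np

def singvals(X):
    """Singular values of X, returned by NumPy in descending order."""
    return np.linalg.svd(X, compute_uv=False)

rng = np.random.default_rng(11)
A = rng.standard_normal((4, 6))   # rectangular, not Hermitian
B = rng.standard_normal((4, 6))

sA, sB, sAB = singvals(A), singvals(B), singvals(A + B)

# Weak majorization: each partial sum of sigma(A+B) is bounded by that of sigma(A)+sigma(B)
assert np.all(np.cumsum(sAB) <= np.cumsum(sA + sB) + 1e-10)
# The k = 1 case is exactly the triangle inequality for the operator norm
assert sAB[0] <= sA[0] + sB[0] + 1e-10
```

Unlike the Hermitian eigenvalue case, the majorization here is "weak": the totals need not match, only the partial-sum inequalities hold.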

From a simple question about adding matrices, we have journeyed to a profound concept that brings unity to a vast landscape of linear algebra. The principle of majorization reveals the statistical tendency of nature to average things out when systems are combined, providing not just bounds but a complete map of the possible outcomes. It is a stunning example of the hidden regularity and inherent beauty that mathematics uncovers in the world.

Applications and Interdisciplinary Connections

Beyond the theoretical framework of majorization and inequalities, the Lidskii-Wielandt theorem and its consequences provide powerful tools with practical applications across numerous scientific and engineering disciplines. These principles are not merely abstract mathematical constructs; they are instrumental in solving concrete problems, from predicting the energy levels of a quantum system to ensuring the stability of a numerical algorithm. This section will explore these interdisciplinary connections, demonstrating how the theorem is applied to derive practical bounds, characterize solution spaces, and analyze complex systems.

The Art of Bounding: Predicting the Extremes

Imagine you're an engineer or a physicist. You have two systems, $A$ and $B$, that you understand perfectly. You know their characteristic frequencies, or their energy levels (in our language, their eigenvalues). Now, you're going to combine them. What can you say about the new system, $A+B$? Will it be stable? What is the highest energy it can possibly have? What is the lowest?

You might naively think the largest eigenvalue of the sum is just the sum of the largest eigenvalues. Sometimes it is, but that's just the best-case-scenario ceiling. The consequences of the Lidskii-Wielandt theorem, like the Weyl inequalities, give us much more subtle and powerful information. They provide sharp bounds, meaning they are the tightest possible without more information. They tell us the absolute limits of possibility.

For instance, the theorem gives us a beautiful, if slightly counter-intuitive, formula for the minimum possible value of the largest eigenvalue of the sum. It's not the sum of the smallest eigenvalues, but a specific mix-and-match pairing of the largest of one system with the smallest of the other. Similarly, it allows us to calculate the minimum and maximum possible values for sums of any subset of eigenvalues—for example, the minimum sum of the two largest, the maximum sum of the three smallest, or even some in the middle. This isn't just an exercise; it's about predicting the range of behaviors for a combined system and identifying potential failure points or performance peaks before you even build it.

Beyond the Extremes: Charting the Entire Space of Possibilities

But these bounds are just the signposts at the edge of the territory. The full power of the Lidskii-Wielandt theorem is that it describes the entire territory. The vector of eigenvalues for the sum, $\gamma = (\gamma_1, \gamma_2, \dots, \gamma_n)$, can't just be any random set of numbers that fits within the bounds. It must live inside a specific, beautifully structured geometric object: a convex polytope, sometimes called the Horn polytope. The inequalities of the theorem are the flat faces that carve out this shape in $n$-dimensional space.

The vertices of this "polytope of possibilities" are formed by simply adding the eigenvalues of $A$ to all possible permutations of the eigenvalues of $B$. This gives us a complete picture. With this knowledge, we can ask much more refined questions. For example, in quantum mechanics or condensed matter physics, the "spectral gap" (the difference between two adjacent energy levels, like $\gamma_2 - \gamma_3$) is critically important. It can determine whether a material is a conductor or an insulator, or how stable a quantum state is to perturbations. By exploring the vertices of the eigenvalue polytope, we can find the absolute minimum possible value for this gap, even if it's zero, which signals a degeneracy or crossing of energy levels.
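Under this description of the vertices, hunting for the minimum possible gap reduces to a small search over permutations. A pure-Python sketch with our own illustrative spectra, chosen so the gap can close all the way to zero:

```python
from itertools import permutations

alpha = [6, 3, 1, 0]   # an illustrative spectrum for A, sorted descending
beta  = [5, 4, 2, 0]   # an illustrative spectrum for B

min_gap = float("inf")
for perm in permutations(beta):
    vertex = sorted((a + b for a, b in zip(alpha, perm)), reverse=True)
    min_gap = min(min_gap, vertex[1] - vertex[2])   # the gap gamma_2 - gamma_3 at this vertex

print("smallest gamma_2 - gamma_3 over the vertices:", min_gap)   # 0: the levels can cross
```

Here the permutation $(0, 4, 5, 2)$ of $\beta$ produces the sums $(6, 7, 6, 2)$, so $\gamma_2 = \gamma_3 = 6$ and the gap closes: a degeneracy.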

From Eigenvalues to a Matrix's "Personality": Probing with Non-linear Questions

So far, we've talked only about the eigenvalues themselves. But often, what we truly care about are more complex properties of a system that depend on the eigenvalues. Think about the total energy of a vibrating system, which might be proportional to the sum of the squares of its frequencies, $\sum_i \gamma_i^2$. Or think of the partition function in statistical mechanics, which is a sum of exponentials, $\sum_i \exp(-\beta \gamma_i)$. These are non-linear functions of the eigenvalues.

This is where the idea of majorization comes to the forefront. The statement $\gamma \prec \alpha + \beta$ means that the eigenvalues of the sum are "less spread out" than the simple sum of the component eigenvalues. For any convex function $f$, a wonderful theorem by Schur tells us that if $x \prec y$, then $\sum_i f(x_i) \le \sum_i f(y_i)$. Since $f(x) = x^2$ is convex, we can immediately find the maximum possible value for the trace of $(A+B)^2$, a quantity related to the system's energy. It is attained when the eigenvalues are as spread out as possible, i.e., when $\gamma$ equals $\alpha+\beta$ sorted in descending order.
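Schur's result gives an immediately checkable bound. The sketch below (assuming NumPy) compares the "energy" $\sum_i \gamma_i^2 = \mathrm{Tr}((A+B)^2)$ against the bound computed from the naive eigenvalue sums:

```python
import numpy as np

def eig_desc(X):
    """Eigenvalues of a symmetric matrix, sorted descending."""
    return np.sort(np.linalg.eigvalsh(X))[::-1]

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)); A = (M + M.T) / 2
M = rng.standard_normal((4, 4)); B = (M + M.T) / 2

gamma = eig_desc(A + B)            # eigenvalues of the sum
naive = eig_desc(A) + eig_desc(B)  # the majorizing vector alpha + beta

# gamma is majorized by naive, so for the convex f(x) = x^2 the energy is bounded:
assert np.sum(gamma**2) <= np.sum(naive**2) + 1e-9   # Tr((A+B)^2) <= sum (alpha_i+beta_i)^2
# Both vectors share the same total (the trace), as majorization requires
assert abs(gamma.sum() - naive.sum()) <= 1e-9
```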

This idea is incredibly powerful. The function $f(x) = \exp(x)$ is also convex. This means we can find the maximum value for the trace of the matrix exponential, $\mathrm{Tr}(\exp(A+B))$. This quantity is not just a mathematical curiosity; it is directly related to the partition function in quantum statistical mechanics, which is the cornerstone for calculating all thermodynamic properties of a system. The theorem allows us to find the upper bound on this fundamental quantity, given only the energy spectra of the constituent parts.

Other functions, like the determinant (the product of eigenvalues), are not convex, but the principle remains. By knowing the exact polytope of allowed eigenvalues, we can perform an optimization to find the maximum possible determinant, a value related to how the matrix scales volume. The theorem provides the playground; we just have to find the highest point within it.

Interdisciplinary Bridges: From Abstract Math to Concrete Science

You've already seen how these ideas connect to quantum mechanics. But the bridges don't stop there.

Stability and Perturbation Theory: Imagine you have a matrix $A$ representing a stable system. You then apply a small 'perturbation', so the system becomes a nearby matrix $D$. A critical question for any engineer or numerical analyst is: how much can the eigenvalues of $A$ change? A cousin of our main theorem, in the family of the Hoffman-Wielandt inequality, gives a beautiful and profound answer. The 'distance' between the two matrices (measured by the trace norm, $\|A-D\|_1$) is always greater than or equal to the 'distance' between their spectra (the sum of the absolute differences of their corresponding sorted eigenvalues, $\sum_i |\lambda_i(A) - \lambda_i(D)|$). This gives us a solid guarantee: if the perturbation is small in norm, the eigenvalues cannot have shifted by a large amount. This provides a rigorous foundation for a huge swath of perturbation theory and ensures that our numerical simulations are not leading us astray.
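The trace-norm guarantee can be checked on a randomly perturbed matrix. A sketch assuming NumPy, with the perturbation size chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 5
M = rng.standard_normal((n, n)); A = (M + M.T) / 2
M = rng.standard_normal((n, n)); D = A + 0.05 * (M + M.T) / 2   # a slightly perturbed copy of A

def lam(X):
    """Eigenvalues sorted ascending, so the spectra are matched in order."""
    return np.sort(np.linalg.eigvalsh(X))

spectral_dist = np.abs(lam(A) - lam(D)).sum()              # distance between the spectra
trace_norm = np.linalg.svd(A - D, compute_uv=False).sum()  # ||A - D||_1, sum of singular values

# A small perturbation in norm forces the eigenvalue shifts to be small too
assert spectral_dist <= trace_norm + 1e-10
```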

Symmetry and Simplification: The real world is often structured by symmetries. In physics, symmetries lead to conserved quantities and shared eigenvectors. What happens to our eigenvalue problem if we know that matrices $A$ and $B$ share a common eigenvector? This extra piece of information acts like a key, allowing us to 'unlock' and separate that part of the problem. The matrices become block-diagonal in a shared basis, and our $n$-dimensional problem breaks down into a 1-dimensional problem and an $(n-1)$-dimensional one. We can then apply the same bounding principles to the smaller, simpler problem, yielding much tighter and more specific predictions than the general theory would allow. It's a perfect example of how physical insight and mathematical structure work together.

Pushing the Boundaries

And why stop at two matrices? The same fundamental thinking can be extended. What if we sum three systems, $A+B+C$? While a full-blown Lidskii-type theorem for three matrices is much more complex, we can still use the underlying principles, like Weyl's inequalities, to set bounds on the outcome. For instance, we can find the minimum sum of the smallest eigenvalues of $A+B+C$ by cleverly thinking about the trace and the maximum possible value of the largest eigenvalue.

A Unifying Vision

So, the Lidskii-Wielandt theorem is far more than a formula. It's a story about addition. Not the simple addition of numbers we learned as children, but the rich, complex, and structured way that systems combine. It shows us that while we may not know the exact outcome when we add two matrices, we are not lost in a sea of infinite possibilities. The outcome is constrained to lie in a beautiful geometric shape, a space whose boundaries we can map and whose properties we can explore. This theorem gives us a powerful lens to peer into the heart of complex systems, revealing a hidden order that connects quantum mechanics, engineering, and pure mathematics in a profound and unified way.