
Polynomial Eigenvalue Problem

Key Takeaways
  • Polynomial eigenvalue problems (PEPs) generalize the standard eigenvalue problem to model systems where physical properties depend on the eigenvalue itself, such as in damped vibrations.
  • The most common method for solving PEPs is linearization, a technique that transforms the polynomial problem into a larger, but computationally tractable, generalized eigenvalue problem.
  • The numerical stability of the solution is not guaranteed; the choice of linearization and the use of scaling are critical to avoid ill-conditioning and ensure accurate results.
  • Physical phenomena like damping and rotation create specific mathematical structures in PEPs (e.g., gyroscopic, palindromic), which can be exploited by specialized algorithms for better efficiency and accuracy.

Introduction

In the study of physical systems, the standard eigenvalue problem, $Ax = \lambda x$, provides a powerful tool for understanding intrinsic properties like natural frequencies or stable states. However, this model falls short when the system's characteristics—such as its mass, damping, or stiffness—are not constant but depend on the very frequency we seek to determine. This limitation gives rise to a more general and powerful formulation: the polynomial eigenvalue problem (PEP), where the matrix acting on a vector is itself a polynomial in the eigenvalue $\lambda$. The PEP is the native language for a vast range of phenomena in science and engineering, from structural dynamics to electromagnetism.

This article bridges the gap between the familiar linear problem and this more complex, nonlinear world. It tackles the fundamental question of how to solve an equation where the eigenvalue is intricately embedded within a matrix polynomial. Over the following chapters, you will delve into the core concepts and practical considerations of PEPs. The "Principles and Mechanisms" section will demystify the elegant mathematical trick of linearization, its computational costs, and the crucial subtleties of numerical stability and conditioning. Following that, "Applications and Interdisciplinary Connections" will showcase how these abstract principles manifest in real-world scenarios, revealing the deep link between physical properties and the mathematical structure of the problem.

Principles and Mechanisms

Beyond the Familiar: What is a Polynomial Eigenvalue Problem?

In your first encounter with linear algebra, you meet a problem of profound elegance and utility: the standard eigenvalue problem, $Ax = \lambda x$. You learn that for any given square matrix $A$, there are special vectors $x$, called eigenvectors, that, when acted upon by $A$, are simply scaled by a corresponding number $\lambda$, the eigenvalue. These are the intrinsic "directions" and "scaling factors" of the linear transformation that $A$ represents. If $A$ describes a physical system, its eigenvalues might correspond to the natural frequencies of vibration, the principal axes of rotation, or the stable states of a quantum system.

But what if the system's properties themselves depend on the parameter we are trying to find? Imagine modeling the vibrations of a bridge. The forces acting on it—stiffness, inertia (mass), and damping—might not be constant. They might change with the frequency of vibration, $\lambda$. The inertial forces typically scale with $\lambda^2$, damping forces with $\lambda$, and stiffness forces are constant. The equation of motion might look something like this:

$$(\lambda^2 M + \lambda C + K)x = 0$$

This is no longer the simple $Ax = \lambda x$. The matrix that acts on the vector $x$ is itself a polynomial in the parameter $\lambda$. This is the gateway to a richer world: the **polynomial eigenvalue problem (PEP)**.

In its general form, a PEP is an equation written as $P(\lambda)x = 0$, where $P(\lambda)$ is a **matrix polynomial**:

$$P(\lambda) = A_d \lambda^d + A_{d-1} \lambda^{d-1} + \dots + A_1 \lambda + A_0$$

Here, the coefficients $A_i$ are constant $n \times n$ matrices, $x$ is a nonzero vector of size $n$, and $\lambda$ is the scalar eigenvalue we seek. We are looking for the special values of $\lambda$ for which the matrix $P(\lambda)$ becomes singular, allowing a nonzero vector $x$ (the eigenvector) to exist in its null space.
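To make this concrete, here is a minimal numeric sketch in NumPy (the toy coefficients are assumptions for illustration, not taken from the text): at an eigenvalue, $P(\lambda)$ is singular and the eigenvector lies in its null space.

```python
import numpy as np

# Assumed toy quadratic PEP: P(lambda) = lambda^2 A2 + lambda A1 + A0 with
# A2 = I, A1 = 0, A0 = -diag(1, 4), so det P(lambda) = (lambda^2 - 1)(lambda^2 - 4)
# and the eigenvalues are +-1 and +-2.
A2 = np.eye(2)
A1 = np.zeros((2, 2))
A0 = -np.diag([1.0, 4.0])

def P(lam):
    """Evaluate the matrix polynomial at a scalar lambda."""
    return lam**2 * A2 + lam * A1 + A0

# At an eigenvalue, P(lambda) becomes singular ...
print(np.linalg.det(P(1.0)))   # determinant is zero (up to rounding)
# ... and the eigenvector spans its null space.
x = np.array([1.0, 0.0])
print(P(1.0) @ x)              # the zero vector
```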

These problems are not mere mathematical curiosities; they are everywhere in science and engineering. They describe the resonant frequencies in acoustic design, the stability of structures, the behavior of fluid-mechanical systems, and the dynamics of control systems. The polynomial form is just one member of a larger family of **nonlinear eigenvalue problems**, $T(\lambda)x = 0$, where $T(\lambda)$ can be an even more exotic function of $\lambda$—perhaps involving rational functions, exponentials from time delays, or even trigonometric functions, each corresponding to a different kind of physical dependency.

But how on Earth do we solve an equation where the eigenvalue is buried inside a polynomial of matrices?

The Magician's Trick: Linearization

The standard eigenvalue problem is something we have tamed. Over decades, mathematicians and computer scientists have developed incredibly robust and efficient algorithms, like the QR algorithm, for solving it. The generalized eigenvalue problem (GEP), $Ax = \lambda Bx$, is similarly well-understood, with powerful tools like the **QZ algorithm** at our disposal. The polynomial nature of the PEP, however, puts it outside the direct reach of these methods.

This is where a truly beautiful piece of mathematical magic comes into play: **linearization**. The strategy is to transform the PEP of degree $d$ and size $n \times n$ into a GEP of size $dn \times dn$ that has exactly the same eigenvalues. We trade a complicated algebraic structure for a larger, but simpler, linear one.

Let's see the trick in action for the quadratic case, $P(\lambda) = \lambda^2 A_2 + \lambda A_1 + A_0$. We can construct a new, larger matrix pencil $L(\lambda) = \lambda \mathcal{B} - \mathcal{A}$ where:

$$\mathcal{A} = \begin{pmatrix} 0 & I \\ -A_0 & -A_1 \end{pmatrix}, \quad \mathcal{B} = \begin{pmatrix} I & 0 \\ 0 & A_2 \end{pmatrix}$$

So the new, linearized problem is to find $\lambda$ and a new, larger vector $z$ such that $(\lambda \mathcal{B} - \mathcal{A})z = 0$. How can we be sure this monster of a matrix has the same eigenvalues as our original, tidy polynomial? The secret lies in the determinant. The eigenvalues of $P(\lambda)$ are the roots of the characteristic equation $\det(P(\lambda)) = 0$. Let's compute the determinant of our new pencil, which for the special case of a monic polynomial ($A_2 = I$) is $L(\lambda) = \begin{pmatrix} \lambda I & -I \\ A_0 & \lambda I + A_1 \end{pmatrix}$. Using the formula for the determinant of a block matrix, we find a stunning result:

$$\det(L(\lambda)) = \det(\lambda I)\,\det\!\big((\lambda I + A_1) - A_0 (\lambda I)^{-1} (-I)\big) = \lambda^n \det\!\Big(\lambda I + A_1 + \frac{1}{\lambda}A_0\Big)$$

Factoring out $1/\lambda$ from the second determinant (which introduces a factor of $(1/\lambda)^n$ for an $n \times n$ matrix), we get:

$$\det(L(\lambda)) = \lambda^n \left(\frac{1}{\lambda}\right)^n \det(\lambda^2 I + \lambda A_1 + A_0) = \det(P(\lambda))$$

The characteristic polynomials are identical! The set of eigenvalues must therefore be the same. We have successfully converted our quadratic problem into a linear one. This technique, using what are known as **companion matrices**, is the workhorse for solving PEPs. It feels like we've gotten something for nothing, but of course, there's a trade-off. We now have to work with matrices that are $d$ times larger, demanding significantly more memory and computational effort—typically on the order of $\mathcal{O}((dn)^3)$ operations.
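The whole recipe fits in a few lines of NumPy/SciPy. A sketch under assumed toy coefficients (chosen so the true eigenvalues are $\pm 1$ and $\pm 2$): build the companion pencil and hand it to SciPy's QZ-based generalized solver.

```python
import numpy as np
from scipy.linalg import eig

# Quadratic PEP (lambda^2 A2 + lambda A1 + A0) x = 0 with assumed toy data;
# here det P(lambda) = (lambda^2 - 1)(lambda^2 - 4), eigenvalues +-1, +-2.
n = 2
A2 = np.eye(n)
A1 = np.zeros((n, n))
A0 = -np.diag([1.0, 4.0])

# First companion linearization L(lambda) = lambda*B - A:
#   A = [[0, I], [-A0, -A1]],   B = [[I, 0], [0, A2]]
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [-A0,              -A1      ]])
B = np.block([[np.eye(n),        np.zeros((n, n))],
              [np.zeros((n, n)), A2              ]])

# The 2n eigenvalues of the pencil are exactly those of the quadratic.
lams = eig(A, B, right=False)
print(np.sort_complex(lams))   # [-2, -1, 1, 2] up to rounding
```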

The Fine Print: Degrees, Grades, and Infinity

Now that we've seen the magician's trick, it's time to examine the fine print. A standard $n \times n$ eigenvalue problem has $n$ eigenvalues. Our original PEP was size $n \times n$, but of degree $d$. It turns out that it has $dn$ eigenvalues. Where did the extra $n(d-1)$ eigenvalues come from?

Some of them might be at infinity. An infinite eigenvalue sounds strange, but it's a perfectly well-defined concept that often corresponds to an instantaneous or impulsive response in a physical system. To "see" these eigenvalues, we perform a change of variables, $\lambda = 1/\mu$, and look at what happens as $\mu \to 0$. This leads to the **reversal polynomial**:

$$\operatorname{rev}_d P(\mu) = \mu^d P(1/\mu) = \mu^d \sum_{i=0}^{d} A_i (1/\mu)^i = \sum_{i=0}^{d} A_i \mu^{d-i} = A_d + A_{d-1}\mu + \dots + A_0 \mu^d$$

Notice that the order of the coefficient matrices has been reversed! An infinite eigenvalue for $P(\lambda)$ corresponds to a zero eigenvalue for $\operatorname{rev}_d P(\mu)$, which occurs if and only if its constant term, $A_d$, is singular. The number of infinite eigenvalues is related to the nullity of the leading coefficient matrix, $A_d$.
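A small sketch of this bookkeeping, with assumed toy matrices: the leading coefficient is singular, so one eigenvalue escapes to infinity, and it reappears as $\mu = 0$ in the linearized reversal polynomial.

```python
import numpy as np
from scipy.linalg import eig

n = 2
A2 = np.diag([1.0, 0.0])   # singular leading coefficient: nullity 1
A1 = np.eye(n)
A0 = np.diag([2.0, 3.0])
# Here det P(lambda) = (lambda^2 + lambda + 2)(lambda + 3) has degree 3 < 2n,
# so one of the four eigenvalues lives at infinity.
print(np.linalg.det(A2))   # zero: the telltale sign of infinite eigenvalues

# Reversal polynomial rev P(mu) = A2 + A1 mu + A0 mu^2 swaps lambda = 1/mu.
# Its leading coefficient A0 is nonsingular, so every mu-eigenvalue is finite,
# and the infinite lambda shows up as mu = 0.
Arev = np.block([[np.zeros((n, n)), np.eye(n)], [-A2, -A1]])
Brev = np.block([[np.eye(n), np.zeros((n, n))], [np.zeros((n, n)), A0]])
mus = eig(Arev, Brev, right=False)
print(np.min(np.abs(mus)))   # essentially 0: one infinite lambda
```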

This is also where a subtle distinction becomes important: the difference between the **degree** and the **grade** of a polynomial. The degree is the highest power of $\lambda$ with a non-zero matrix coefficient, an intrinsic property. The grade is a formal number we choose for our representation, which can be larger than the degree. If we choose a grade $d$ larger than the degree $k$, we are effectively padding the polynomial with leading zero matrices ($A_{k+1}, \dots, A_d$ are all zero). This doesn't change the finite eigenvalues at all. However, it does change the reversal polynomial and, consequently, the accounting of infinite eigenvalues. Each block of padding adds $n$ new eigenvalues at infinity. This isn't a flaw; it's a necessary bookkeeping device that ensures our $dn \times dn$ linearization correctly accounts for all $dn$ eigenvalues.

A Question of Stability: Not All Linearizations Are Created Equal

So, we have a method: take a PEP, linearize it into a GEP, and throw it at a computer. Problem solved? Not so fast. The world of numerical computation is a world of finite precision, where tiny rounding errors are unavoidable. The crucial question is: are our results sensitive to these small errors? This is the question of **conditioning**. An eigenvalue is **ill-conditioned** if a tiny perturbation to the coefficient matrices can cause a large change in the eigenvalue.

The sensitivity of a simple eigenvalue $\lambda$ is captured by its **condition number**, which, for a PEP, is related to the quantity $|y^* P'(\lambda) x|^{-1}$, where $x$ and $y$ are the right and left eigenvectors and $P'(\lambda)$ is the derivative of the matrix polynomial. If this derivative term is small at the eigenvalue—meaning the function $P(\lambda)$ is "flat" near its root—the eigenvalue will be very sensitive to perturbations.
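A quick numeric illustration of this sensitivity factor, with an assumed diagonal toy problem where the eigenvectors are known in closed form:

```python
import numpy as np

# Assumed diagonal toy problem: P(lambda) = lambda^2 I - diag(1, 4),
# with eigenvalue lambda = 1 and right/left eigenvector e1.
A2, A1, A0 = np.eye(2), np.zeros((2, 2)), -np.diag([1.0, 4.0])
lam = 1.0
x = y = np.array([1.0, 0.0])

# Derivative of the matrix polynomial: P'(lambda) = 2*lambda*A2 + A1.
dP = 2 * lam * A2 + A1
sens = 1.0 / abs(y @ dP @ x)   # the |y* P'(lambda) x|^{-1} factor
print(sens)                    # 0.5 for this problem

# First-order sanity check: perturbing the (0,0) entry of A0 by -eps moves
# the eigenvalue from 1 to sqrt(1 + eps), a shift of about sens * eps.
eps = 1e-6
shift = np.sqrt(1.0 + eps) - 1.0
print(shift / eps)             # approximately 0.5, matching sens
```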

Here is the critical punchline: while a linearization is mathematically equivalent to the original PEP, it is not necessarily numerically equivalent. The process of linearization can itself degrade the conditioning of the problem.

Consider the simple quadratic problem $P(\lambda) = \lambda^2 I - \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}$, whose eigenvalues are $\pm 1$ and $\pm 2$. Take the eigenvalue $\lambda = 1$. A careful calculation shows that applying the standard companion linearization makes the condition number of this eigenvalue twice as large as the condition number of the original polynomial problem. The linearization has made the problem more sensitive to noise!

Why does this happen? The structure of the linearization matters immensely. There are many ways to linearize a PEP. The two most famous are the first and second **companion forms**. It turns out that one is particularly well-suited for finding eigenvalues with large magnitudes, while the other is better for small magnitudes. Using the "wrong" one can introduce factors of $|\lambda|$ or $1/|\lambda|$ into the condition number, artificially inflating it. This is especially dangerous when the norms of the coefficient matrices $A_i$ have a large dynamic range—say, $\|A_0\|$ is huge and $\|A_d\|$ is tiny. A standard linearization will create a large pencil matrix whose norm is dominated by $\|A_0\|$. A backward-stable solver like the QZ algorithm will find eigenvalues that are correct for a slightly perturbed pencil. But when that perturbation is mapped back to the original polynomial's coefficients, the effective perturbation on the tiny $A_d$ can be enormous relative to its size, potentially destroying any accuracy in the computed eigenvalues that depend on it.

Nature doesn't care about our computational convenience. The physics is encoded in the original polynomial $P(\lambda)$. Our linearization is a human construct, a computational crutch. And if we are not careful, that crutch can introduce its own instabilities.

Taming the Beast: The Art of Scaling and Choice

So, are we doomed to accept poor numerical behavior? No. This is where the art of numerical analysis shines. The key to taming the PEP is **scaling**.

The idea is to prepare the problem before we linearize it. A powerful technique is **variable scaling**, where we let $\lambda = \alpha \mu$. By choosing the scaling factor $\alpha$ judiciously (for example, $\alpha \approx \sqrt{\|A_0\|/\|A_d\|}$), we can transform the problem into a new one in the variable $\mu$ where the coefficient matrices have norms that are all roughly of the same order of magnitude. This "balancing" act tends to move the eigenvalues of interest to have magnitudes closer to 1, which is the "sweet spot" where numerical algorithms are most accurate.

After scaling, we can then choose the appropriate companion linearization for the eigenvalue regime we are interested in. This two-step process—scale, then linearize—is a robust and powerful strategy for mitigating the pitfalls of numerical instability. It is a beautiful example of how understanding the deep structure of a problem allows us to guide our computational tools to a correct and reliable answer.
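A sketch of the variable scaling, with assumed toy coefficients whose norms span twelve orders of magnitude:

```python
import numpy as np

# Assumed toy coefficients with badly unbalanced norms:
A2 = 1e-6 * np.eye(2)
A1 = np.eye(2)
A0 = 1e6 * np.eye(2)

# Variable scaling lambda = alpha * mu with alpha = sqrt(||A0|| / ||A2||):
alpha = np.sqrt(np.linalg.norm(A0) / np.linalg.norm(A2))
# P(alpha * mu) = (alpha^2 A2) mu^2 + (alpha A1) mu + A0, so the scaled
# quadratic in mu has coefficients:
B2, B1, B0 = alpha**2 * A2, alpha * A1, A0

print([f"{np.linalg.norm(M):.1e}" for M in (A2, A1, A0)])  # spread: 12 orders
print([f"{np.linalg.norm(M):.1e}" for M in (B2, B1, B0)])  # now all comparable
# A mu-eigenvalue of the scaled problem maps back via lambda = alpha * mu.
```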

A Final Thought on Invariance

Before we leave this topic, let's consider one more elegant property. What if we decide to describe our physical system in a different coordinate system? This corresponds to applying a **simultaneous similarity transformation** to all coefficient matrices: $\widetilde{A}_i = S^{-1} A_i S$ for some invertible matrix $S$. How does this affect our eigenvalues?

A simple calculation shows that the new matrix polynomial is just $\widetilde{P}(\lambda) = S^{-1} P(\lambda) S$. Since $\det(\widetilde{P}(\lambda)) = \det(S^{-1})\det(P(\lambda))\det(S) = \det(P(\lambda))$, the characteristic polynomial is unchanged. Therefore, the eigenvalues—the physical quantities like resonant frequencies—are completely independent of the coordinate system we use to describe them. This is as it should be. The physics is invariant, and the mathematics faithfully reflects that invariance. It's a reminder of the deep and beautiful unity between the abstract structures we build and the physical world they are meant to describe.
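This invariance is easy to verify numerically. A sketch with random assumed coefficients, comparing the spectra before and after a change of basis:

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(0)
n = 3
# Random assumed quadratic coefficients and a random change of basis S.
A2, A1, A0 = (rng.standard_normal((n, n)) for _ in range(3))
S = rng.standard_normal((n, n))          # generically invertible
Sinv = np.linalg.inv(S)

def pep_eigs(C2, C1, C0):
    """Eigenvalues of (lambda^2 C2 + lambda C1 + C0)x = 0 via companion pencil."""
    Z, I = np.zeros((n, n)), np.eye(n)
    A = np.block([[Z, I], [-C0, -C1]])
    B = np.block([[I, Z], [Z, C2]])
    return eig(A, B, right=False)

before = pep_eigs(A2, A1, A0)
after = pep_eigs(Sinv @ A2 @ S, Sinv @ A1 @ S, Sinv @ A0 @ S)

# Match each eigenvalue to its nearest partner; the gap is rounding-level.
err = max(np.min(np.abs(l - after)) for l in before)
print(err)   # close to 0: the spectrum is coordinate-independent
```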

Applications and Interdisciplinary Connections

In our journey so far, we have explored the principles and mechanisms of the polynomial eigenvalue problem. We've seen how it generalizes the familiar $A\mathbf{x} = \lambda \mathbf{x}$ to cases where the eigenvalue $\lambda$ appears as a polynomial. Now, we might ask, is this merely a mathematical curiosity? A clever but niche generalization? The answer, you might be pleased to discover, is a resounding no. The polynomial eigenvalue problem is not some esoteric construct confined to the blackboards of mathematicians. Instead, it is a language that nature itself seems to speak. It emerges, unexpectedly and beautifully, in a vast array of physical phenomena, from the hum of a power generator to the shimmer of light in a resonant cavity. In this chapter, we will embark on a tour of these applications, discovering how this single mathematical idea provides a unifying lens through which to view a multitude of seemingly disconnected problems in science and engineering.

The Symphony of Structures: Damping and Vibration

Let's start with something you can almost feel: the vibration of a structure. Imagine a simple guitar string. Its vibrations are pure tones, or harmonics. These correspond to the eigenvalues of a simple, elegant system. But now, picture a much larger, more complex structure—a bridge swaying in the wind, a skyscraper during an earthquake, or the chassis of a car on a bumpy road. These are not simple, frictionless oscillators. They are subject to forces that depend not just on position (stiffness, $K$) and acceleration (mass, $M$), but also on velocity. This velocity-dependent force is damping ($C$), the physical mechanism that dissipates energy, often as heat.

When we write down Newton's second law for such a system, we don't get the simple harmonic oscillator equation. Instead, we find ourselves face-to-face with a second-order differential equation:

$$M \ddot{\mathbf{u}}(t) + C \dot{\mathbf{u}}(t) + K \mathbf{u}(t) = \mathbf{0}$$

To find the natural modes of vibration, we look for solutions of the form $\mathbf{u}(t) = \mathbf{x} e^{\lambda t}$. Substituting this into the equation, the time derivatives bring down powers of $\lambda$, and we are left with a purely algebraic problem:

$$(\lambda^2 M + \lambda C + K)\mathbf{x} = \mathbf{0}$$

This is precisely a **Quadratic Eigenvalue Problem (QEP)**. The physical reality of mass, damping, and stiffness maps directly onto the mathematical structure of a QEP.

The introduction of the damping term $C$ does something profound. In a simple, undamped system, energy is conserved. The corresponding mathematical operator is "self-adjoint," which guarantees that the resonant frequencies are real and the vibration modes are orthogonal—they are independent, pure modes, like the harmonics of our guitar string. Damping, however, means energy is lost. The system is dissipative; it is no longer symmetric with respect to time reversal. This physical change breaks the elegant self-adjointness of the mathematics.

The consequences are immediate and dramatic. The eigenvalues $\lambda$ are no longer purely imaginary (representing oscillation) but become complex numbers. The real part of $\lambda$ dictates the rate of decay of the vibration, while the imaginary part gives its frequency. Furthermore, the mode shapes $\mathbf{x}$ are no longer orthogonal in the standard sense. They become "entangled" in a more complex way. This is the general case of nonproportional damping, a central problem in structural engineering. To analyze a building's response to an earthquake, engineers must solve this very QEP to understand how different modes will oscillate, decay, and interact. Only in the special, idealized case of proportional damping (where the damping matrix $C$ happens to be a simple combination of $M$ and $K$) does the system retain the beautiful simplicity of orthogonal modes.
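A sketch with assumed mass, damping, and stiffness matrices, showing how the real and imaginary parts of each eigenvalue separate into decay rate and frequency:

```python
import numpy as np
from scipy.linalg import eig

# Assumed 2-DOF oscillator: M u'' + C u' + K u = 0 with light damping.
n = 2
M = np.eye(n)
C = 0.2 * np.eye(n)        # (proportional) damping
K = np.diag([1.0, 4.0])    # stiffness: undamped frequencies 1 and 2

# Companion pencil of the QEP (lambda^2 M + lambda C + K) x = 0:
A = np.block([[np.zeros((n, n)), np.eye(n)], [-K, -C]])
B = np.block([[np.eye(n), np.zeros((n, n))], [np.zeros((n, n)), M]])
lams = eig(A, B, right=False)

# Each mode decays at rate |Re(lambda)| = 0.1 while oscillating at a
# frequency slightly below its undamped value.
print(np.sort(lams.real))          # four values, all near -0.1
print(np.sort(np.abs(lams.imag)))  # approx [0.995, 0.995, 1.997, 1.997]
```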

Waves in a Box: Electromagnetism and Resonators

Let's switch fields, from the tangible world of mechanical vibrations to the invisible dance of electromagnetic waves. Consider a microwave oven. It is, in essence, a metal box—a resonant cavity. When you turn it on, you are exciting electromagnetic standing waves inside it. In a perfectly idealized oven with perfectly conducting walls, no energy is lost. The resonant modes have real frequencies, and the mathematics leads to a standard eigenvalue problem, much like our undamped structure.

But what happens in a real cavity? The walls might not be perfect conductors, or, more interestingly, they might be intentionally designed to absorb energy. This is described by a so-called impedance boundary condition. When we take Maxwell's equations and impose such a boundary condition, a familiar story unfolds. We again search for harmonic modes by assuming an electric field of the form $\mathbf{E}(\mathbf{r}, t) = \mathbf{e}(\mathbf{r}) e^{i\omega t}$. The resulting equation, after some manipulation, takes the form:

$$(\mathbf{K} - \omega^2 \mathbf{M} + i\omega \mathbf{C})\,\mathbf{e} = \mathbf{0}$$

Once again, a QEP emerges! Here, the matrices $\mathbf{K}$ and $\mathbf{M}$ relate to the curl and divergence of the fields in the volume of the cavity, while the new term, $\mathbf{C}$, arises directly from the energy being dissipated at the lossy boundary. The eigenvalue $\omega$ is the complex resonant frequency. Its real part gives the oscillation frequency of the mode, and its imaginary part tells us how quickly the mode decays due to the energy loss at the walls. This decay rate is directly related to the cavity's "Q-factor," a critical parameter in the design of everything from particle accelerators to mobile phone filters. The physical act of energy dissipation at a surface is mirrored perfectly by the appearance of a linear term in $\omega$ and a non-Hermitian structure in the resulting QEP.

A Mathematical Menagerie: The Power of Structure

As we have seen, nature does not simply present us with random matrices. The polynomial eigenvalue problems that arise from physical modeling are often endowed with a rich and beautiful mathematical structure, a direct reflection of the underlying physics. Recognizing this structure is not merely an act of classification; it is the key to both deeper understanding and the development of powerful, efficient algorithms.

Let's browse this "mathematical menagerie":

  • **Damped-oscillator systems**, as we saw in structural dynamics, involve real, symmetric matrices $M$, $C$, $K$ that are positive definite. This structure guarantees that all eigenvalues lie in the left half of the complex plane, which is just a mathematical statement of a physical necessity: a passive, dissipative system must be stable.

  • **Gyroscopic systems** model phenomena involving rotation, such as spinning satellites or power plant turbines. Here, the equations of motion include a velocity-dependent term from the Coriolis force, which leads to a QEP of the form $(\lambda^2 M + \lambda G + K)\mathbf{x} = \mathbf{0}$. The gyroscopic matrix $G$ is skew-symmetric ($G = -G^T$). This skew-symmetry leads to a beautiful spectral symmetry: if $\lambda$ is an eigenvalue, then so is $-\bar{\lambda}$. The spectrum is symmetric with respect to the imaginary axis.

  • **Palindromic systems** have coefficient matrices that read the same forwards and backwards (e.g., $A_k = A_{d-k}^*$). This seemingly abstract structure appears in problems like the analysis of vibrations on rail tracks and in certain control theory applications. It imparts a reciprocal-conjugate symmetry to the spectrum: if $\lambda$ is an eigenvalue, so is $1/\bar{\lambda}$.

By identifying that a problem is, say, gyroscopic, we can employ specialized numerical methods that are designed to preserve the $\lambda \leftrightarrow -\bar{\lambda}$ symmetry of the spectrum. These structure-preserving algorithms are not only more accurate and efficient, but they also provide solutions that are physically meaningful. Ignoring the structure is like trying to solve a puzzle with brute force, whereas exploiting it is like finding the hidden key.
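The $\lambda \leftrightarrow -\bar{\lambda}$ symmetry can be checked directly. A sketch with randomly generated assumed matrices that have the gyroscopic structure (symmetric $M$ and $K$, skew-symmetric $G$):

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(1)
n = 3
# Assumed random data with gyroscopic structure:
W = rng.standard_normal((n, n)); M = W @ W.T + n * np.eye(n)  # sym. pos. def.
V = rng.standard_normal((n, n)); K = V @ V.T                  # symmetric
R = rng.standard_normal((n, n)); G = R - R.T                  # skew: G = -G^T

# Solve (lambda^2 M + lambda G + K) x = 0 via the companion pencil.
A = np.block([[np.zeros((n, n)), np.eye(n)], [-K, -G]])
B = np.block([[np.eye(n), np.zeros((n, n))], [np.zeros((n, n)), M]])
lams = eig(A, B, right=False)

# For every eigenvalue lambda, -conj(lambda) is also (numerically) present:
err = max(np.min(np.abs(-np.conj(l) - lams)) for l in lams)
print(err)   # close to 0: spectrum symmetric about the imaginary axis
```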

The Art of Approximation and the Perils of Linearization

So, how does one actually solve a polynomial eigenvalue problem? The most common strategy, a magnificent trick of the trade, is **linearization**. The idea is to transform the $n \times n$ PEP of degree $d$ into an equivalent, but much larger, $dn \times dn$ linear eigenvalue problem, $(A - \lambda B)\mathbf{z} = \mathbf{0}$. This turns a problem we don't know how to solve directly into one for which we have a century's worth of powerful numerical algorithms. The simplest example is one you might have learned without realizing it: finding the roots of a simple scalar polynomial, $p(\lambda) = 0$, is equivalent to finding the eigenvalues of a special "companion" matrix.

But this magic comes with a warning. We have transformed our problem into a new one. Is the solution to the big linear problem a good solution to our original polynomial problem? This brings us to the crucial concept of numerical stability. The famous **Bauer-Fike theorem** gives us a window into this question. It tells us that the sensitivity of eigenvalues to small perturbations (errors in our data or from floating-point arithmetic) is governed by the condition number of the matrix of eigenvectors.

A poorly chosen linearization can result in a new, larger matrix that is exquisitely sensitive to perturbation—it might have an enormous eigenvector condition number. This means that even tiny, unavoidable errors can be magnified into huge, meaningless errors in the final computed eigenvalues. The art of numerical analysis is to find "good" linearizations that don't introduce this artificial ill-conditioning. For example, for a gyroscopic problem, some linearizations are better at finding large eigenvalues, while others are better at finding small ones.

This challenge is surprisingly analogous to problems in a completely different field: optimization. When solving large-scale optimization problems, one often encounters so-called KKT systems. The stability of these systems is highly sensitive to the relative scaling of their different parts. A standard technique is to "equilibrate" the system by scaling the variables. The same idea applies with astonishing success to PEPs. By judiciously scaling the eigenvalue parameter $\lambda$ before we even build the linearization, we can often dramatically improve the conditioning and stability of the entire solution process. It is a beautiful example of a single, powerful idea—equilibration—resonating across different scientific disciplines.

Beyond Eigenvalues: Pseudospectra and Transient Dynamics

For a long time, it was thought that the eigenvalues told the whole story. They determine the long-term behavior of a system: will it oscillate forever, decay to zero, or grow uncontrollably? But in many systems, the most dramatic events happen in the short term. An airplane might experience a sudden, violent shudder in response to a gust of wind before settling back down. This is called **transient growth**, and eigenvalues alone cannot predict it.

This is where the story gets even more interesting. The failure of eigenvalues to capture transient behavior is linked to a property called non-normality. In some systems, the eigenvectors, instead of being nicely orthogonal, are nearly parallel to one another. In such cases, the eigenvalues can be incredibly sensitive to the smallest perturbation.

To understand these systems, we need a more powerful tool: the **pseudospectrum**. The $\varepsilon$-pseudospectrum is the set of complex numbers that are "almost" eigenvalues—numbers $\lambda$ for which there is a vector $\mathbf{x}$ that is almost an eigenvector, i.e., $\|P(\lambda)\mathbf{x}\|$ is small. For a non-normal system, the pseudospectrum can be a vast region in the complex plane, even if the eigenvalues themselves are just a few isolated points.
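A common way to probe the pseudospectrum numerically is to evaluate the smallest singular value of $P(z)$ over the complex plane: it vanishes exactly at eigenvalues and is small wherever $z$ is "almost" one. A sketch with an assumed toy quadratic:

```python
import numpy as np

# Assumed toy quadratic: P(z) = z^2 I - diag(1, 4), eigenvalues +-1, +-2.
A2, A1, A0 = np.eye(2), np.zeros((2, 2)), -np.diag([1.0, 4.0])

def sigma_min(z):
    """Smallest singular value of P(z): a pseudospectrum indicator."""
    Pz = z**2 * A2 + z * A1 + A0
    return np.linalg.svd(Pz, compute_uv=False)[-1]

# The indicator vanishes at eigenvalues, stays small nearby (inside the
# epsilon-pseudospectrum), and grows far from the spectrum:
print(sigma_min(1.0))          # zero at an exact eigenvalue
print(sigma_min(1.0 + 0.1j))   # small just off the eigenvalue
print(sigma_min(10.0))         # large far from the spectrum
```

Sampling `sigma_min` on a grid and contouring the result is the standard way to draw pseudospectral plots and spot the "bulges" that warn of transient growth.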

This concept is not just an abstraction; it has profound physical meaning. The shape and size of the pseudospectrum are directly linked to the potential for transient growth in a dynamical system. A large bulge in the pseudospectrum is a warning sign that the system might exhibit a large, temporary amplification of its response. Furthermore, our choice of linearization matters here, too. A "bad," unbalanced linearization can be highly non-normal and have a huge, misleading pseudospectrum that doesn't reflect the behavior of the original polynomial system. Choosing a good, balanced linearization is therefore critical not only for finding the eigenvalues accurately but also for correctly predicting the system's dynamic response.

From the vibration of a bridge to the transient flutter of an airplane wing, the polynomial eigenvalue problem serves as a deep and unifying mathematical framework. Its study reveals a beautiful interplay between physical phenomena like damping and rotation, their corresponding mathematical structures, and the practical, subtle art of numerical computation. It reminds us that the quest to understand and engineer our world is inextricably linked to the quest to find these elegant and powerful mathematical descriptions.