
How do we distinguish what is real from what is merely a matter of perspective? In physics and engineering, we often describe the dynamics of a system using matrices. However, the specific numbers in these matrices depend entirely on the coordinate system we choose—our "viewpoint." Change the coordinates, and the matrix changes, even though the underlying physical system remains the same. This raises a critical question: What are the fundamental, intrinsic properties of the system that are independent of our arbitrary choice of description? These unchanging truths are known as similarity invariants.
This article embarks on a quest to uncover these fundamental properties. It addresses the challenge of separating the essential characteristics of a linear system from the artifacts of its mathematical representation.
First, in "Principles and Mechanisms," we will explore the mathematical foundations of similarity, starting with basic invariants like the trace and determinant and progressing to the profound role of eigenvalues and the complete description provided by the Jordan Canonical Form. We will also confront the practical challenges of computation. Following this, the "Applications and Interdisciplinary Connections" chapter will demonstrate how these abstract concepts are the bedrock of modern science and engineering, revealing the true behavior of systems in control theory, mechanics, and even quantum physics. Let us begin by defining the principles that govern what truly stays the same.
Imagine you are looking at a magnificent sculpture. You can walk around it, view it from different angles, maybe even look at it through a distorted lens. Each view gives you a different perspective, a different two-dimensional projection on your retina. The matrix of pixels you perceive changes. But is the sculpture itself changing? Of course not. Its essential properties—its mass, its volume, the material it's made of—are constant. They are invariant to your choice of viewpoint.
In physics and engineering, the state of a system—be it the position and velocity of a planet, the voltages in a circuit, or the populations in an ecosystem—is often represented by a vector of numbers, a state vector $x$. The laws governing how this system evolves in time are often captured by a matrix $A$, in an equation like $\dot{x} = Ax$. The choice of numbers in our vector depends on the coordinate system we choose. If we decide to measure positions in inches instead of meters, or orient our axes differently, the numbers in our state vector will change, and consequently, the numbers in our matrix will also change. But the underlying physical reality, the "sculpture" of our system's dynamics, remains the same.
The mathematical tool for describing this change of viewpoint, this change of coordinates, is called a similarity transformation. If we define a new set of coordinates $z$ related to the old ones by an invertible matrix $P$ (so that $x = Pz$), the new dynamics matrix becomes $B = P^{-1}AP$. The matrices $A$ and $B$ are said to be similar. They represent the exact same linear operator, the same physical law, just described in two different languages, or viewed from two different angles. Our central question then becomes: What are the true, intrinsic properties of the system? What is the "mass" and "volume" of our matrix that remains unchanged, no matter which coordinate system we use? These are the similarity invariants.
Let's begin our quest for these fundamental truths. If two matrices, $A$ and $B$, are just different perspectives on the same underlying operator, some of their most basic properties must be identical.
A simple property is the rank of the matrix, which you can think of as the dimensionality of the space the operator maps onto. If your operator takes a 3D space and squishes it into a 2D plane, it will do so no matter how you've drawn your coordinate axes. Changing your viewpoint can't magically make that 2D plane become a 1D line. Therefore, similar matrices must have the same rank. If one matrix has a rank of 2 and another has a rank of 1, they cannot possibly be describing the same operation, and thus cannot be similar.
Two other familiar numbers are also invariant: the determinant and the trace. The determinant, $\det(A)$, measures how the operator scales volumes. If an operator doubles the volume of any shape, it should do so regardless of the units or axes used to measure that volume. A quick calculation confirms this: $\det(B) = \det(P^{-1}AP) = \det(P^{-1})\det(A)\det(P) = \det(A)$, since $\det(P^{-1})$ and $\det(P)$ cancel out. The trace, $\operatorname{tr}(A)$, the sum of the diagonal elements, is also preserved, a fact that follows from the cyclic property of the trace: $\operatorname{tr}(P^{-1}AP) = \operatorname{tr}(APP^{-1}) = \operatorname{tr}(A)$.
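These invariances are easy to verify numerically. A minimal sketch in Python with NumPy (the matrices and the random seed are arbitrary; any invertible $P$ works):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # the operator in the old coordinates
P = rng.standard_normal((3, 3))   # a generic, hence invertible, change of basis
B = np.linalg.inv(P) @ A @ P      # the same operator in the new coordinates

# Rank, determinant, and trace all survive the change of viewpoint.
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)
assert np.isclose(np.linalg.det(A), np.linalg.det(B))
assert np.isclose(np.trace(A), np.trace(B))
```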
It's important to distinguish this from other types of transformations. For example, a congruence transformation, $B = P^{T}AP$, appears when changing variables in quadratic forms (like energy functions $x^{T}Ax$). Under congruence, eigenvalues, trace, and even the determinant are generally not preserved. However, for real symmetric matrices, something remarkable is preserved: the number of positive, negative, and zero eigenvalues, a result known as Sylvester's Law of Inertia. This tells us that similarity and congruence capture invariants relevant to very different mathematical contexts. For the dynamics of systems, it is similarity that holds the key.
The rank, determinant, and trace are like shadows of the sculpture—they give us some information, but they don't capture the whole picture. To see the true form, we need to look for the most fundamental invariants of a linear operator: its eigenvalues and eigenvectors.
For any given operator (matrix), there are almost always special directions in space. When the operator acts on a vector pointing in one of these special directions, it doesn't rotate it or change its direction at all; it simply scales it, stretching or shrinking it by a certain factor. The special direction is an eigenvector, $v$, and the scaling factor is its corresponding eigenvalue, $\lambda$. This relationship is captured by the iconic equation $Av = \lambda v$.
These eigenvalues are the very soul of the matrix. They represent the intrinsic scaling factors of the operator. If our system's dynamics involves stretching space by a factor of 2 in a certain direction, it does so irrespective of our chosen coordinates. Let's see this mathematically. Suppose $v$ is an eigenvector of $A$ with eigenvalue $\lambda$. Now consider the similar matrix $B = P^{-1}AP$. What happens if we apply $B$ to the vector $P^{-1}v$, which is just the eigenvector described in the new coordinate system? $B(P^{-1}v) = P^{-1}AP(P^{-1}v) = P^{-1}Av = \lambda(P^{-1}v)$. Look at that! The eigenvalue is exactly the same. The eigenvector is different ($P^{-1}v$ instead of $v$), but that's just because we are looking at it from a new angle. The intrinsic scaling factor, the eigenvalue, is invariant.
This is a profound result. It means that the entire set of eigenvalues is a similarity invariant. A more powerful way to state this is that the characteristic polynomial, $p(s) = \det(sI - A)$, is itself an invariant. Since $\det(sI - B) = \det(P^{-1}(sI - A)P) = \det(P^{-1})\det(sI - A)\det(P) = \det(sI - A)$, the two matrices have the exact same characteristic polynomial. Because the polynomials are identical, all their coefficients must be identical. And what are these coefficients? It turns out that, up to a sign, they include $\operatorname{tr}(A)$ and $\det(A)$. The invariance of the trace and determinant are not separate facts, but consequences of the deeper invariance of the entire characteristic polynomial!
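The invariance of the whole polynomial can be checked directly; a sketch with NumPy, where `np.poly` returns the characteristic polynomial's coefficients (random matrices, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))
B = np.linalg.inv(P) @ A @ P

# Similar matrices share every coefficient of det(sI - A) ...
assert np.allclose(np.poly(A), np.poly(B))

# ... and therefore the same eigenvalues, with the same multiplicities.
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))
```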
So, here's the ultimate test: if two matrices have the same characteristic polynomial—meaning the same eigenvalues with the same multiplicities—are they necessarily similar? It seems like they should be. If they have the same intrinsic scaling factors, shouldn't they represent the same operator?
Let's consider two simple matrices:
$$A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}.$$
The characteristic polynomial for both is $(s-3)^2$. Both have only one eigenvalue, $\lambda = 3$, with a multiplicity of two. But are they similar? Matrix $A$ is simple: it scales every vector in the plane by a factor of 3. Matrix $B$ does something more complex. It scales vectors, but it also applies a "shear" because of that '1' in the corner. You can't turn a pure scaling into a scaling-plus-shear just by changing your coordinates. These two matrices are fundamentally different; they are not similar.
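We can exhibit a numerical witness that these two matrices cannot be similar: the rank of $M - 3I$ is itself a similarity invariant (it is the rank of a similarity-transformed matrix), and it differs between the two.

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 3.0]])   # pure scaling by 3
B = np.array([[3.0, 1.0],
              [0.0, 3.0]])   # scaling by 3 plus a shear

# Identical characteristic polynomials: (s - 3)^2 for both.
assert np.allclose(np.poly(A), np.poly(B))

# Yet rank(M - 3I) — a similarity invariant — tells them apart:
assert np.linalg.matrix_rank(A - 3 * np.eye(2)) == 0
assert np.linalg.matrix_rank(B - 3 * np.eye(2)) == 1
```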
This means that even the characteristic polynomial doesn't tell the full story. The final, complete set of similarity invariants is revealed by the Jordan Canonical Form. The Jordan form is the "truest" view of our sculpture. It says that any matrix can be represented, in some special coordinate system, as a matrix that is almost diagonal. It has the eigenvalues on the diagonal, as we'd expect. But it may also have some 1s on the superdiagonal, just above the main diagonal. These 1s are grouped with repeated eigenvalues into Jordan blocks.
For our example, matrix $A$ is its own Jordan form, with two Jordan blocks of size $1 \times 1$. Matrix $B$ is a single Jordan block of size $2 \times 2$. The complete story is not just the eigenvalues, but the sizes of the Jordan blocks associated with each eigenvalue. The size of the largest block for an eigenvalue is given by its multiplicity in another invariant polynomial, the minimal polynomial. A full description of all block sizes for an eigenvalue $\lambda$ can be decoded from the dimensions of the null spaces of the powers of $A - \lambda I$. Two matrices are similar if, and only if, they have the same Jordan form (up to reordering the blocks). This is the beautiful and complete answer to our quest. The Jordan structure is the ultimate fingerprint of a linear operator.
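The decoding of block sizes from null spaces can be sketched on a hand-built example (one $2 \times 2$ and one $1 \times 1$ block for $\lambda = 3$; the number of blocks of size at least $k$ is the growth of $\dim\ker(A - \lambda I)^k$ at step $k$):

```python
import numpy as np

# Jordan form with blocks of sizes 2 and 1 for the eigenvalue 3:
J = np.array([[3.0, 1.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 3.0]])
N = J - 3 * np.eye(3)

def null_dim(M):
    """Dimension of the null space of M."""
    return M.shape[1] - np.linalg.matrix_rank(M)

# dim ker N^k counts, cumulatively, the blocks of size >= 1, >= 2, ...
dims = [null_dim(np.linalg.matrix_power(N, k)) for k in (1, 2, 3)]
assert dims == [2, 3, 3]   # two blocks in total; exactly one of size >= 2
```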
This theoretical edifice is one of the triumphs of linear algebra. But when we try to use it in the real world of engineering and scientific computing, we hit a surprising and crucial snag. Theory can be fragile.
Consider the matrix family
$$A_\epsilon = \begin{pmatrix} 3 & 1 \\ 0 & 3 + \epsilon \end{pmatrix}.$$
When $\epsilon = 0$, this is the non-diagonalizable Jordan block we saw earlier. But for any tiny, non-zero $\epsilon$, this matrix has two distinct eigenvalues and is therefore perfectly diagonalizable. In theory, a simple similarity transformation converts it into a nice diagonal matrix.
The problem lies in the transformation matrix $V$ itself, whose columns are the eigenvectors. As $\epsilon$ gets closer to zero, the two eigenvalues get closer together, and the two corresponding eigenvectors point in almost the exact same direction. The matrix $V$ becomes nearly singular—it's on the verge of being unable to span a full two-dimensional space. The result is that the condition number of $V$, a measure of how much the transformation can amplify errors, explodes, growing like $1/\epsilon$.
What does this mean in practice? It means that if your system is described by a matrix that is nearly defective (has eigenvalues that are very close), trying to transform it to the "simple" diagonal form is a numerical disaster. Any tiny error in your measurements or calculations—even just the rounding errors inside your computer—will be blown up by an enormous factor when you multiply by $V^{-1}$. Your theoretically beautiful solution becomes practically useless.
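A numerical sketch of the blow-up, using the nearly defective matrix $\bigl(\begin{smallmatrix}3 & 1\\ 0 & 3+\epsilon\end{smallmatrix}\bigr)$ as a concrete instance of the family described above:

```python
import numpy as np

def eigvec_cond(eps):
    """Condition number of the eigenvector matrix of a nearly defective matrix."""
    A = np.array([[3.0, 1.0],
                  [0.0, 3.0 + eps]])
    _, V = np.linalg.eig(A)       # columns of V are the eigenvectors
    return np.linalg.cond(V)

# Shrinking eps by 10^4 inflates the condition number by roughly the same factor.
assert eigvec_cond(1e-2) < 1e4
assert eigvec_cond(1e-6) > 1e5
```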
Is there a way out? Yes. The problem arises from the extreme "squishing" performed by the ill-conditioned matrix $V$. What if we restrict ourselves to nicer transformations? An orthogonal transformation, represented by a matrix $Q$ where $Q^{T}Q = I$, corresponds to a rigid rotation or reflection. It doesn't stretch or squash space, so it never amplifies errors—its condition number is always a perfect 1. While an orthogonal transformation cannot always diagonalize a matrix, the Schur decomposition guarantees that it can always transform any matrix $A$ into an upper-triangular form, $T = Q^{T}AQ$. This triangular matrix still has the eigenvalues of $A$ sitting right on its diagonal, clear as day.
Here we have a fascinating trade-off. The Jordan form, achieved via general similarity, gives us the deepest theoretical insight into the operator's structure. But the Schur form, achieved via numerically stable orthogonal similarity, is often the engineer's most trusted tool for computation. It reveals the most important invariants—the eigenvalues—without the risk of numerical catastrophe. The quest for what stays the same leads us not only to a beautiful mathematical theory but also to a profound practical lesson about the delicate dance between abstract truth and computational reality.
After a journey through the mathematical machinery of matrices, eigenvalues, and transformations, it is natural to pause and ask a simple, yet profound question: What is this all for? Is it merely an elegant game of symbols, or does it tell us something deep about the world? The answer, as is so often the case in physics and engineering, is that this elegant mathematics is the very language we use to distinguish what is real and fundamental from what is arbitrary and of our own making.
Imagine you are trying to describe a statue. You could describe it from the front, from the side, or from above. You could use inches or centimeters. You could set up your coordinate system with the x-axis pointing east or north. Each description would yield a different set of numbers, a different list of coordinates for the statue's nose, but the statue itself—its height, its volume, its shape—would remain unchanged. The true properties of the statue are invariant under your choice of description.
A similarity transformation is precisely this: a change of descriptive language, a change of coordinate basis. The similarity invariants are the properties of the "statue"—the system we are studying—that are real and objective. They are the bedrock on which physical law is built. Let us now see how this powerful idea illuminates a remarkable range of disciplines.
Nowhere is the concept of invariance more central than in the theory of systems and control. We build complex systems—from aircraft to chemical plants to the internet—and we represent their dynamics using state-space models, a set of first-order differential equations summarized by matrices ($A$, $B$, $C$, $D$). These matrices describe the internal workings, the "state" of the system.
But what is the state? It is a set of internal variables we choose to describe the system's memory of the past. For an electrical circuit, it might be the voltages across capacitors and currents through inductors. For a mechanical system, it might be positions and velocities. What if we chose a different set of internal variables? For instance, linear combinations of the old ones? This change of variables, represented by an invertible matrix $T$, subjects our system matrices to a similarity transformation. The matrix $A$ becomes $T^{-1}AT$, $B$ becomes $T^{-1}B$, and $C$ becomes $CT$.
Our description of the internal dynamics has changed completely! The new matrices look nothing like the old ones. So, what is real? The answer lies in what the system does. What we ultimately care about is the relationship between the input we provide ($u$) and the output we observe ($y$). This relationship, in the frequency domain, is captured by the transfer function, $G(s) = C(sI - A)^{-1}B + D$. And here is the magic: the transfer function is a similarity invariant. When you compute it from the transformed matrices ($T^{-1}AT$, $T^{-1}B$, $CT$), the $T$ and $T^{-1}$ matrices that we introduced to change our perspective miraculously cancel each other out, leaving the transfer function completely unchanged.
This is a profound result. It tells us that the input-output behavior is an intrinsic property of the system, independent of our choice of internal coordinates. The transfer function represents the "true character" of our black box, whether we are analyzing a continuous-time process or its discrete-time computer-controlled counterpart.
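The cancellation is easy to witness numerically. A sketch with a random single-input, single-output realization ($D = 0$ for simplicity; the seed and test frequencies are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 1))
C = rng.standard_normal((1, 3))
T = rng.standard_normal((3, 3))     # an arbitrary change of state coordinates

At, Bt, Ct = np.linalg.inv(T) @ A @ T, np.linalg.inv(T) @ B, C @ T

def G(A, B, C, s):
    """Transfer function C (sI - A)^(-1) B evaluated at the complex frequency s."""
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

# The T's cancel: the input-output map is identical at every frequency.
for s in (1j, 0.5 + 2j, -1.0 + 1j):
    assert np.allclose(G(A, B, C, s), G(At, Bt, Ct, s))
```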
The list of these essential, invariant truths runs deep and tells a complete story about the system's capabilities:
Eigenvalues: The eigenvalues of the matrix $A$ are the most famous invariants. They tell us about the system's natural modes of behavior—its tendencies to oscillate, grow, or decay. These are the system's fundamental rhythms, and they are real, regardless of our description.
Controllability and Observability: Can we steer the system to any desired configuration? (Controllability). Can we deduce the internal state by watching the output? (Observability). These are fundamental yes-or-no questions about our ability to interact with the system. Their answers cannot possibly depend on our mathematical formalism. Indeed, the rank of the controllability and observability matrices—the test for these properties—is preserved under similarity transformations.
Stabilizability and Detectability: Sometimes, full control or observation is too much to ask. We settle for a more practical question: Can we at least stabilize the unstable modes? Can we detect any unstable behavior? These properties, crucial for designing controllers that prevent systems from blowing up, are also invariants. If a system is stabilizable in one set of coordinates, it is stabilizable in all of them.
Transmission Zeros: These are special frequencies that the system completely blocks from passing from input to output. They represent fundamental "blind spots" and are, as you might guess, invariant under a change of state coordinates.
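The invariance of the controllability test, for instance, can be sketched directly: the controllability matrix $[B,\ AB,\ A^2B, \dots]$ transforms by a left-multiplication with $T^{-1}$, which cannot change its rank.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 1))
T = rng.standard_normal((3, 3))
At, Bt = np.linalg.inv(T) @ A @ T, np.linalg.inv(T) @ B

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

# Same rank in both coordinate systems: controllability is intrinsic.
assert np.linalg.matrix_rank(ctrb(A, B)) == np.linalg.matrix_rank(ctrb(At, Bt))
```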
This idea of separating the essential from the arbitrary finds a stunning practical application in model order reduction. Modern engineering models can have millions of variables. To simulate or design a controller, we need a simpler version. How do we discard states without losing the soul of the system? The method of balanced truncation provides an answer by seeking a special coordinate system. In this "balanced" basis, the states are ordered by their Hankel singular values, which measure the combined energy of being both controlled by the input and observed at the output. These singular values are themselves similarity invariants! By keeping the states with large Hankel singular values and discarding the rest, we obtain a reduced model that is not only simpler but is guaranteed to be stable and provides an excellent approximation to the true input-output behavior.
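The invariance of the Hankel singular values can be sketched by computing them from the controllability and observability Gramians of a stable random system (the eigenvalue shift below is just a quick way to force stability; the whole setup is illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))
A -= (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(3)  # force Re(eigs) < 0
B = rng.standard_normal((3, 1))
C = rng.standard_normal((1, 3))
T = rng.standard_normal((3, 3))
At, Bt, Ct = np.linalg.inv(T) @ A @ T, np.linalg.inv(T) @ B, C @ T

def hankel_svals(A, B, C):
    Wc = solve_continuous_lyapunov(A, -B @ B.T)     # controllability Gramian
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)   # observability Gramian
    # Hankel singular values: square roots of the eigenvalues of Wc Wo.
    lam = np.maximum(np.linalg.eigvals(Wc @ Wo).real, 0.0)
    return np.sort(np.sqrt(lam))[::-1]

# The same Hankel singular values in any state coordinates.
assert np.allclose(hankel_svals(A, B, C), hankel_svals(At, Bt, Ct))
```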
The principle that physical law must be independent of the observer's coordinate system is a cornerstone of physics. Similarity invariants are the mathematical embodiment of this principle.
Consider a piece of steel under load. The state of stress at any point is described by the Cauchy stress tensor, a matrix $\sigma$. If we rotate our experimental setup, we are applying an orthogonal transformation $Q$ to our basis vectors. The components of the stress matrix change to $\sigma' = Q\sigma Q^{T}$. Yet, the physical state of the material has not changed. The material itself does not know or care about our coordinate system.
Physical laws must be formulated in terms of quantities that are independent of this choice. These are the tensor invariants. The three principal invariants of the stress tensor—$I_1 = \operatorname{tr}(\sigma)$, related to hydrostatic pressure, and two others, $I_2$ and $I_3$, related to the magnitude and type of shear distortion—are the same no matter how you orient your axes. Yield criteria, which predict when a material will begin to permanently deform, are equations written in terms of these invariants. The von Mises yield criterion, for instance, states that yielding occurs when the deviatoric invariant $J_2$ reaches a critical value. The Drucker-Prager criterion uses a combination of $I_1$ and $J_2$. The physics is captured by the invariants.
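A numerical sketch: rotate an arbitrary stress state and check that the hydrostatic invariant $I_1$ and the deviatoric invariant $J_2$ come out unchanged. The stress values are made up purely for illustration:

```python
import numpy as np

# An arbitrary symmetric stress state (values in MPa, purely illustrative):
S = np.array([[ 50.0,  30.0,  0.0],
              [ 30.0, -20.0, 10.0],
              [  0.0,  10.0, 10.0]])

# A random rotation/reflection of the lab axes (QR of a random matrix is orthogonal):
Q, _ = np.linalg.qr(np.random.default_rng(5).standard_normal((3, 3)))
S_rot = Q @ S @ Q.T            # the same physical stress, new axes

def stress_invariants(S):
    I1 = np.trace(S)                        # hydrostatic part
    dev = S - (I1 / 3.0) * np.eye(3)        # deviatoric part
    J2 = 0.5 * np.trace(dev @ dev)          # shear-distortion measure
    return I1, J2

assert np.allclose(stress_invariants(S), stress_invariants(S_rot))
```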
This principle extends all the way down to the quantum realm. In quantum mechanics, the state of a system is a vector in an abstract Hilbert space, and physical observables are operators. The Hamiltonian operator, $H$, governs the system's evolution. When we choose a basis to write down these vectors and operators as columns of numbers and matrices, we are making a choice of representation. A different choice of basis corresponds to a unitary transformation, a special kind of similarity transformation.
A fundamental quantity in statistical mechanics is the canonical partition function, $Z$, from which all thermodynamic properties like free energy, entropy, and specific heat can be derived. It is given by the formula $Z = \operatorname{Tr}\left(e^{-\beta H}\right)$, where $\beta$ is related to temperature and $\operatorname{Tr}$ is the trace of the operator. The beauty of this formula lies in the trace. The trace is invariant under any unitary change of basis! This mathematical fact guarantees that the partition function, and therefore all of thermodynamics, is physically real and not an artifact of our chosen quantum basis. The cyclic property of the trace ensures that the physics remains constant, no matter how we look at it.
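A finite-dimensional sketch: build a toy Hermitian "Hamiltonian" (an arbitrary symmetric matrix, arbitrary units), change basis with an orthogonal $Q$, and verify that the partition function does not move:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
M = rng.standard_normal((4, 4))
H = (M + M.T) / 2.0                                # toy Hermitian Hamiltonian
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthogonal change of basis

beta = 0.7                                         # inverse temperature (arbitrary)
Z_original = np.trace(expm(-beta * H))
Z_rotated  = np.trace(expm(-beta * (Q.T @ H @ Q))) # same operator, new basis

assert np.isclose(Z_original, Z_rotated)   # thermodynamics is basis-independent
```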
One might think that these ideas are confined to the tidy world of linear systems. But their reach is far greater. Most of the world is nonlinear, described by complex, curving relationships. How do we analyze the stability of a robot arm, a planetary orbit, or a chemical reaction near an equilibrium point? We linearize. We find the best linear approximation to the dynamics in a small neighborhood of that point.
Now, what happens if we first change our coordinate system for the nonlinear world? This is no longer a simple matrix multiplication; it could be a complicated, curving, and stretching transformation of space (a diffeomorphism). It seems certain that everything will be distorted beyond recognition.
And yet, something magical happens. When you perform this nonlinear coordinate change and then linearize the new system at the new equilibrium point, the resulting linear model is related to the original linearization by a simple similarity transformation! The transformation matrix is none other than the Jacobian (the matrix of partial derivatives) of the nonlinear coordinate change evaluated at the equilibrium point.
This means that all the intrinsic properties of the linearized system—its eigenvalues, stability type, controllability, observability, transmission zeros—are preserved. The local, linearized picture of reality is robust, even under arbitrary smooth changes of our descriptive framework. This beautiful result unifies the linear and nonlinear perspectives, showing how the core truths revealed by similarity invariants form the universal language for describing local dynamics everywhere.
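This, too, can be checked numerically on a toy system. Below, a damped pendulum is rewritten in an assumed nonlinear coordinate system $z = \varphi(x)$ (chosen so that the inverse is explicit), both versions are linearized at the origin by finite differences, and the two linearizations turn out to be similar via the Jacobian of $\varphi$:

```python
import numpy as np

# Damped pendulum: x = (angle, angular velocity); equilibrium at the origin.
def f(x):
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])

# An assumed nonlinear change of coordinates z = phi(x), with explicit inverse psi:
def phi(x):
    return np.array([x[0], x[0] + x[1] + x[0] ** 2])

def psi(z):
    return np.array([z[0], z[1] - z[0] - z[0] ** 2])

def Dphi(x):
    """Jacobian of phi at x."""
    return np.array([[1.0, 0.0],
                     [1.0 + 2.0 * x[0], 1.0]])

# The same dynamics expressed in z-coordinates: z' = Dphi(psi(z)) f(psi(z)).
def g(z):
    x = psi(z)
    return Dphi(x) @ f(x)

def linearize(h, n, eps=1e-6):
    """Central finite-difference Jacobian of h at the origin."""
    return np.column_stack([(h(eps * e) - h(-eps * e)) / (2 * eps)
                            for e in np.eye(n)])

A  = linearize(f, 2)           # linearization in the original coordinates
Az = linearize(g, 2)           # linearization in the new coordinates
J  = Dphi(np.zeros(2))         # Jacobian of the coordinate change at the equilibrium

# The two local models are similar via J — eigenvalues and all invariants agree.
assert np.allclose(Az, J @ A @ np.linalg.inv(J), atol=1e-5)
```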
From the engineer's control panel to the physicist's equations for the cosmos, similarity invariants are the answer to the question, "What is real?". They are the mathematical tools that allow us to scrape away the arbitrary paint of our coordinate systems to reveal the unchanging, objective sculpture of physical law beneath. They are, in a very real sense, the signature of reality itself.