
In the vast landscape of linear algebra, few concepts appear as deceptively simple as the trace of a matrix—the straightforward sum of its diagonal elements. This value seems arbitrary, a hostage to the chosen coordinate system. Yet, this simple sum holds a profound secret, a key to understanding properties that are intrinsic to the linear transformations matrices represent. The secret lies in a remarkable identity known as the cyclic property of the trace: for matrices A and B, the trace of AB is always equal to the trace of BA. This article delves into the far-reaching consequences of this single, elegant rule. First, in the "Principles and Mechanisms" chapter, we will unravel how this property proves the trace is an invariant, independent of our observational viewpoint, and reveal its true identity as the sum of the operator's eigenvalues. Following this, the "Applications and Interdisciplinary Connections" chapter will showcase how this theoretical cornerstone becomes a powerful tool in diverse fields, from guaranteeing the objectivity of physical laws and simplifying complex calculations in quantum physics to forming the basis of character theory in the study of symmetry.
Imagine you're looking at a sculpture. From the front, it has a certain shape; from the side, a different one. The raw data hitting your eyes—the two-dimensional projection—changes with your every step. Yet, you have an unshakeable sense of the sculpture's true, three-dimensional form. You understand that some properties, like its total volume or mass, don't change no matter your viewing angle. These are the invariants of the object, the properties that tell you something fundamental about its nature.
In the world of linear algebra, a matrix is like a particular view of a "sculpture" called a linear operator. A linear operator is an action, like a rotation, a stretch, or a shear. When we write it down as a matrix, we are forced to choose a coordinate system, a particular "point of view." Change your coordinate system, and the numbers in your matrix change completely. So, how can we find the "volume" or "mass" of the operator itself, properties that are independent of our chosen perspective?
Let's start with one of the simplest things you can do to a square matrix: add up the numbers on its main diagonal. This sum has a special name: the trace, denoted as $\operatorname{tr}(A)$.

For an $n \times n$ matrix $A$ with entries $a_{ij}$, the trace is simply $\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}$.
At first glance, this seems almost too simple to be useful. The diagonal elements are utterly dependent on our choice of coordinates. If we rotate our axes, the whole matrix changes, and we'd expect the trace to change wildly as well. It feels like measuring the height of our sculpture's shadow instead of the sculpture itself. Why should this particular sum tell us anything deep or meaningful? It seems like an arbitrary, coordinate-bound property. But nature, and mathematics, is full of wonderful surprises.
The secret to the trace's power lies in a remarkable, almost magical, algebraic identity. For any two square matrices $A$ and $B$ (of the same size), it is always true that:

$$\operatorname{tr}(AB) = \operatorname{tr}(BA).$$

This is the famous cyclic property of the trace. You can multiply the matrices in one order, take the trace, and you will get the exact same number as if you had multiplied them in the reverse order. Keep in mind that matrix multiplication is generally not commutative; $AB$ is usually very different from $BA$. Yet, their traces are identical!
Let's quickly see why this isn't magic. For two simple $2 \times 2$ matrices, $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $B = \begin{pmatrix} e & f \\ g & h \end{pmatrix}$:

$$AB = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}, \quad \text{so} \quad \operatorname{tr}(AB) = ae + bg + cf + dh.$$

$$BA = \begin{pmatrix} ea + fc & eb + fd \\ ga + hc & gb + hd \end{pmatrix}, \quad \text{so} \quad \operatorname{tr}(BA) = ea + fc + gb + hd.$$
Look closely. The sums are identical, just with the terms rearranged. This isn't a deep mystery; it's a consequence of the fact that ordinary multiplication of numbers is commutative ($bg = gb$, $fc = cf$, etc.). This simple shuffling of terms, however, is the key that unlocks everything. Its power is far from trivial. For instance, if you are given two enormously complicated matrices $A$ and $B$ and told the value of $\operatorname{tr}(AB)$, you can instantly deduce that $\operatorname{tr}(BA)$ is the same number without performing a single calculation involving the matrix elements.
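The identity is easy to check numerically. Here is a minimal sketch using numpy, with two arbitrarily chosen matrices (the specific entries are illustrative, not from the text):

```python
import numpy as np

# Two arbitrary (non-commuting) 2x2 matrices.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [5.0, -2.0]])

# AB and BA are genuinely different matrices...
print(np.allclose(A @ B, B @ A))          # False: multiplication does not commute

# ...yet their traces coincide exactly.
print(np.trace(A @ B))                    # 5.0
print(np.trace(B @ A))                    # 5.0
```

The same check passes for matrices of any size, which is exactly what the term-shuffling argument above guarantees.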
Now, let's return to our sculpture. Changing our point of view is mathematically represented by a similarity transformation. If $A$ is the matrix of an operator in one coordinate system, the matrix in a new system, $A'$, is given by $A' = P^{-1} A P$, where the invertible matrix $P$ represents the "dictionary" for translating between the two coordinate systems.
What happens to the trace when we change our viewpoint? Let's calculate it for $A'$:

$$\operatorname{tr}(A') = \operatorname{tr}(P^{-1} A P).$$

Here is where the cyclic property works its magic. Think of $P^{-1} A P$ as the product of two matrices: $X = P^{-1} A$ and $Y = P$. The cyclic property tells us $\operatorname{tr}(XY) = \operatorname{tr}(YX)$. So, we can "cycle" the $P$ from the end of the expression to the front:

$$\operatorname{tr}(P^{-1} A P) = \operatorname{tr}(P P^{-1} A) = \operatorname{tr}(A).$$
This is a spectacular result! The trace of the matrix is unchanged. It is an invariant under a change of coordinates. The number that seemed so arbitrary, so dependent on our perspective, is in fact a deep property of the underlying operator, the "sculpture" itself. It doesn't matter how you look at it; the trace remains the same. This is why a problem can ask you to calculate something like $\operatorname{tr}(P^{-1} A P)$ without ever telling you what the complicated transformation matrix $P$ is—it simply doesn't matter.
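A quick numerical sketch of this invariance, with a random operator and a random (almost surely invertible) change of basis:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # the operator in the original basis
P = rng.standard_normal((4, 4))   # a random change-of-basis matrix

A_prime = np.linalg.inv(P) @ A @ P   # the same operator, new coordinates

# Individual entries change completely, but the trace survives.
print(np.trace(A))
print(np.trace(A_prime))   # same value, up to floating-point rounding
```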
So, we have discovered an invariant. But what is this number? What fundamental property of the operator does it represent? To answer this, we need to find the "best" point of view from which to look at our operator. For many operators, there exists a special set of directions, called eigenvectors, along which the operator's action is incredibly simple: it just stretches or shrinks the vector by a certain factor, the eigenvalue.
If we align our coordinate axes along these special eigenvector directions, the matrix representation of our operator becomes wonderfully simple—it becomes a diagonal matrix, $D$, with the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ sitting on the diagonal. In this privileged basis, the original matrix $A$ is related to its diagonal form by a similarity transformation: $A = P D P^{-1}$.
Now, we can finally reveal the trace's true identity. Using its invariance:

$$\operatorname{tr}(A) = \operatorname{tr}(P D P^{-1}) = \operatorname{tr}(D).$$

And what is the trace of the diagonal matrix $D$? It's simply the sum of its diagonal elements!

$$\operatorname{tr}(A) = \operatorname{tr}(D) = \lambda_1 + \lambda_2 + \cdots + \lambda_n.$$
There it is. The trace of an operator is the sum of its eigenvalues. This is the profound, coordinate-independent meaning we were searching for. It connects the seemingly mundane act of summing diagonal elements to the deep structural properties of the operator. This link is so strong that it allows us to compute traces of complex matrix functions, like $\operatorname{tr}(A^k)$, by simply performing the same operations on the eigenvalues.
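Both halves of this claim, the trace as the eigenvalue sum and the extension to matrix powers, can be sketched numerically (the random matrix here is just an illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))

eigvals = np.linalg.eigvals(A)   # generally complex for a real matrix

# trace(A) equals the sum of the eigenvalues
# (the imaginary parts cancel in conjugate pairs).
print(np.trace(A), np.sum(eigvals).real)

# The same principle extends to powers: tr(A^3) equals the sum of lambda_i^3.
print(np.trace(A @ A @ A), np.sum(eigvals**3).real)
```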
"But wait," a skeptic might ask, "what if a matrix doesn't have enough eigenvectors to form a basis? What if it can't be made perfectly diagonal?" This is a fair and important question. Such matrices exist, and they are called non-diagonalizable.
Here, the beauty of mathematics provides an even more general tool: the Jordan Canonical Form. It turns out that any square matrix can be transformed via a similarity transformation into a "nearly diagonal" matrix called its Jordan form, $J$. This matrix has the eigenvalues on its main diagonal, just like before. The only difference is that it might have some pesky $1$s on the superdiagonal (the line directly above the main diagonal) for the non-diagonalizable parts.
But what happens when we take the trace? The trace only cares about the main diagonal! The $1$s on the superdiagonal are completely ignored. So, even for the most general, non-diagonalizable matrices, the relationship holds firm:

$$\operatorname{tr}(A) = \operatorname{tr}(J) = \lambda_1 + \lambda_2 + \cdots + \lambda_n.$$
The trace is the sum of the eigenvalues, counted with their proper multiplicities. This rule is incredibly robust; it holds for any square matrix over the complex numbers, without exception.
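A tiny sketch with the classic non-diagonalizable example, a single Jordan block:

```python
import numpy as np

# A 2x2 Jordan block: eigenvalue 3 with algebraic multiplicity 2,
# but only a 1-dimensional eigenspace, so it cannot be diagonalized.
lam = 3.0
J = np.array([[lam, 1.0],
              [0.0, lam]])

# The pesky 1 on the superdiagonal never touches the trace:
print(np.trace(J))                           # 6.0, the eigenvalue sum 3 + 3
print(np.sum(np.linalg.eigvals(J)).real)     # 6.0 as well
```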
The trace is not just a function with this cyclic property; in a very real sense, it is the function. It has been proven that any linear mapping from the space of matrices to the numbers that satisfies the cyclic property must be a scalar multiple of the trace.
Think about what this means. If you are looking for a linear, coordinate-invariant, single-number descriptor for a linear operator, you have essentially no choice but the trace. Its cyclic property singles it out as the unique tool for the job. This elevates the trace from a clever computational trick to a cornerstone of the entire theory of linear operators.
The consequences of this simple cyclic rule echo into the most advanced areas of physics and mathematics. Consider this puzzle: in quantum mechanics, observable quantities (like energy or momentum) are represented by operators whose eigenvalues are real numbers (Hermitian operators). Other processes, related to time evolution, are described by operators with purely imaginary eigenvalues (skew-Hermitian operators).
Could we construct two operators, $A$ and $B$, such that their product in one order, $AB$, represents an observable (real eigenvalues), while their product in the reverse order, $BA$, represents something else entirely (imaginary eigenvalues)?
The cyclic property gives a resounding "no." We know that $AB$ and $BA$ must have the exact same set of non-zero eigenvalues (since $\operatorname{tr}((AB)^k) = \operatorname{tr}((BA)^k)$ for every power $k$, the two products share all their power sums, and hence their non-zero spectra). If the eigenvalues of $AB$ must all be real, and the eigenvalues of $BA$ must all be purely imaginary, then their common non-zero eigenvalues would have to be both real and imaginary. The only number that satisfies this is zero. This forces any such construction to be trivial, meaning the operators would have no non-zero eigenvalues at all. This simple algebraic property imposes a profound structural constraint, forbidding certain kinds of physical theories from even existing.
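The shared-spectrum fact is easy to check numerically. For two random square matrices of the same size, the full spectra of $AB$ and $BA$ coincide (a sketch; the seed and sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Sort both spectra the same way so they can be compared elementwise.
ev_AB = np.sort_complex(np.linalg.eigvals(A @ B))
ev_BA = np.sort_complex(np.linalg.eigvals(B @ A))

# AB and BA share the same eigenvalues
# (here both factors are square, so the entire spectra agree).
print(np.allclose(ev_AB, ev_BA))   # True
```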
From a humble sum of diagonal numbers to a deep statement about the fundamental structure of our physical world—that is the journey of the trace. It is a perfect example of how in mathematics, the most unassuming ideas often hide the most beautiful and powerful truths.
We have seen that the trace of a matrix possesses a seemingly modest property: it remains unchanged under a cyclic permutation of the matrices in a product. That is, for any matrices $A$, $B$, and $C$ of appropriate dimensions, $\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)$. At first glance, this might look like a minor algebraic curiosity, a neat little trick for rearranging symbols. But to a physicist or a mathematician, this is no mere parlor trick. It is a key that unlocks a surprisingly deep understanding of the world, revealing profound connections between computation, physical laws, and the very nature of symmetry. This single, simple property is a thread that weaves together disparate fields, showing them to be different facets of a single, unified structure. Let's follow this thread on a journey of discovery.
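Before setting off, one cautionary sketch: only cyclic permutations are allowed. Swapping two adjacent factors (a non-cyclic reordering) generally changes the trace (random matrices here, chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

# All three cyclic permutations agree...
print(np.trace(A @ B @ C), np.trace(B @ C @ A), np.trace(C @ A @ B))

# ...but a non-cyclic reordering generally does not:
print(np.isclose(np.trace(A @ B @ C), np.trace(A @ C @ B)))  # almost surely False
```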
Imagine you are describing a physical object, say, the stress within a steel beam or the rotational inertia of a spinning satellite. You set up a coordinate system—an $x$, $y$, and $z$ axis—to measure the components of the tensor that describes this physical property. But your colleague in another lab sets up a different coordinate system, rotated with respect to yours. You will both write down different matrices of numbers for the same physical object. This poses a philosophical problem: if physical reality is objective, there must be some quantities you both agree on, numbers that are intrinsic to the object and independent of your arbitrary choice of viewpoint.
The trace provides one of these anchors to reality. A change of coordinate system, like a rotation, is represented by an orthogonal matrix $R$. If your tensor is $T$, your colleague's is $T' = R T R^{-1}$ (or $R T R^{\mathsf{T}}$, which is the same thing for a rotation matrix). What is the trace of your colleague's tensor? Using the cyclic property, we find a remarkable result:

$$\operatorname{tr}(T') = \operatorname{tr}(R T R^{-1}) = \operatorname{tr}(R^{-1} R T) = \operatorname{tr}(T).$$
The trace is the same! It is a "scalar invariant"—a single number that all observers, no matter how their coordinate systems are rotated, will calculate to be the same. This is not just a mathematical convenience; it is a statement about the objectivity of physics. The trace represents a fundamental property of the system itself. For the inertia tensor of a rigid body, the trace is related to the body's total moment of inertia about its center of mass. For the stress tensor, it is related to the pressure. These are real, physical things.
This idea of invariance under a change of basis is the cornerstone of many powerful numerical methods. Consider the problem of finding eigenvalues—the intrinsic "scaling factors" of a linear transformation. Eigenvalues are notoriously difficult to compute directly. The celebrated QR algorithm is an iterative process that tackles this problem. It starts with a matrix $A_0 = A$ and generates a sequence of new matrices $A_1, A_2, \dots$, where each step involves a clever change of basis: factor $A_k = Q_k R_k$, then set $A_{k+1} = R_k Q_k = Q_k^{-1} A_k Q_k$. The magic of the algorithm is that this sequence converges to a simple matrix (often triangular) from which the eigenvalues can be read off easily. But how do we know this process isn't just scrambling the numbers and losing the information we seek? The cyclic property of the trace gives us the guarantee. At every single step, the transformation is a so-called "similarity transformation," exactly of the form we just analyzed. Therefore,

$$\operatorname{tr}(A_{k+1}) = \operatorname{tr}(A_k) = \cdots = \operatorname{tr}(A_0) = \operatorname{tr}(A).$$
The trace is a conserved quantity throughout the entire, complex iterative process. It's like a lighthouse in a storm, a constant value that assures us that even as the individual elements of the matrix shift and change, the fundamental properties—the sum of the eigenvalues—are perfectly preserved.
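A minimal sketch of the (unshifted) QR iteration, confirming that the trace never drifts. The symmetric test matrix is an assumption made here so the iterates tend toward a diagonal matrix; production eigensolvers use shifts and deflation on top of this basic loop:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
A = A + A.T          # symmetric, so the iteration tends toward a diagonal matrix

Ak = A.copy()
traces = [np.trace(Ak)]
for _ in range(100):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                   # similarity transformation: Q^T Ak Q
    traces.append(np.trace(Ak))

# The trace is conserved at every single iteration:
print(np.allclose(traces, traces[0]))   # True

# The diagonal of the near-diagonal iterate approximates the eigenvalues:
print(np.sort(np.diag(Ak)))
print(np.sort(np.linalg.eigvalsh(A)))
```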
The power of the trace as an invariant-detector goes even deeper. It's not limited to the trace of the matrix itself. Consider again the inertia tensor $T$ of a rigid body. We know its trace, which is the sum of its eigenvalues (the principal moments of inertia, $I_1, I_2, I_3$), is an invariant. But what about the sum of the squares of the eigenvalues, $I_1^2 + I_2^2 + I_3^2$? This quantity is also a fundamental property of the body's mass distribution. We can find it by considering the trace of the square of the tensor, $\operatorname{tr}(T^2)$. In the basis where $T$ is diagonal, this is clearly $I_1^2 + I_2^2 + I_3^2$. In any other basis, the matrix changes to $R T R^{-1}$, and its square becomes $R T R^{-1} R T R^{-1} = R T^2 R^{-1}$. Applying the cyclic property:

$$\operatorname{tr}(R T^2 R^{-1}) = \operatorname{tr}(T^2).$$
Once again, the quantity is invariant! This allows us to calculate a fundamental physical property, $I_1^2 + I_2^2 + I_3^2$, from the components of the inertia tensor measured in any convenient, arbitrary lab frame. The cyclic property gives us a systematic way to construct a whole family of these physical invariants: $\operatorname{tr}(T)$, $\operatorname{tr}(T^2)$, $\operatorname{tr}(T^3)$, and so on.
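The whole family of invariants can be checked at once. Here is a sketch with a random symmetric tensor standing in for an inertia tensor, and a random rotation generated by orthogonalizing a random matrix (both choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# A symmetric "inertia tensor" in some arbitrary lab frame.
T = rng.standard_normal((3, 3))
T = T @ T.T                      # symmetric positive semi-definite

# A random rotation matrix, via QR decomposition of a random matrix.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

T_rot = Q @ T @ Q.T              # the same tensor, seen from a rotated frame

for k in (1, 2, 3):
    lhs = np.trace(np.linalg.matrix_power(T, k))
    rhs = np.trace(np.linalg.matrix_power(T_rot, k))
    print(k, np.isclose(lhs, rhs))   # True for every power k
```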
This trick of using the trace to make things vanish or simplify becomes a weapon of immense power at the frontiers of physics. In Quantum Electrodynamics (QED), when calculating the probability of particle interactions, one must evaluate monstrous expressions involving products of special matrices called Dirac gamma matrices ($\gamma^\mu$). These calculations, which correspond to Feynman diagrams, are the heart of particle physics. A subfield known as "trace technology" provides a set of rules for taming these beasts. One of the most elegant results comes from combining the cyclic property with another property of a special matrix called $\gamma^5$, which anticommutes with the other gamma matrices. When calculating the trace of a product involving $\gamma^5$ and an odd number of other gamma-matrix-based terms (let's call the product $M$), we can do the following:

$$\operatorname{tr}(\gamma^5 M) = \operatorname{tr}(M \gamma^5).$$
But because of the anticommutation rules, we also know that $M \gamma^5 = -\gamma^5 M$. Substituting this in, we get:

$$\operatorname{tr}(\gamma^5 M) = \operatorname{tr}(M \gamma^5) = -\operatorname{tr}(\gamma^5 M).$$
The only number that is equal to its own negative is zero. So, $\operatorname{tr}(\gamma^5 M) = 0$. This single result, born from the cyclic property, allows physicists to prove that countless pages of potential calculations are identically zero without ever computing them. It's the ultimate shortcut, turning impossible problems into trivial ones.
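The vanishing can even be verified concretely. The sketch below builds the gamma matrices in one particular choice, the standard Dirac representation (an assumption; the trace identities are representation-independent), and checks the result:

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

Z2 = np.zeros((2, 2))
g0 = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
gammas = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

# gamma^5 = i * g0 g1 g2 g3; it anticommutes with every gamma^mu.
g5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

# Trace of gamma^5 times an odd product of gamma matrices vanishes:
M = gammas[0] @ gammas[2] @ gammas[3]           # an odd (3-fold) product
print(np.isclose(np.trace(g5 @ M), 0))          # True
print(np.isclose(np.trace(g5 @ gammas[1]), 0))  # True
```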
Perhaps the most profound application of the cyclic property is in the mathematical theory of symmetry, known as group theory. Symmetries—whether of a crystal, a molecule, or the fundamental laws of nature—are described by abstract groups. To understand their physical consequences, we "represent" the abstract symmetry operations (like rotations or reflections) as matrices. The trace of such a matrix is called its character.
Why are characters so important? Consider two symmetry operations, $g$ and $h$, that are "conjugate" to each other, meaning one can be turned into the other by applying some other symmetry operation $x$ and its inverse: $h = x g x^{-1}$. Physically, this means $g$ and $h$ are the same type of operation, just viewed in a different orientation. For example, in a square, a $90°$ rotation about the center and a $270°$ rotation (which is its inverse) are conjugate. It stands to reason that any fundamental physical observable shouldn't be able to distinguish between them. The character proves this intuition correct. If the representation assigns the matrix $\rho(g)$ to $g$, then the matrix for $h$ is $\rho(h) = \rho(x)\,\rho(g)\,\rho(x)^{-1}$. Its character is:

$$\chi(h) = \operatorname{tr}\!\left(\rho(x)\,\rho(g)\,\rho(x)^{-1}\right) = \operatorname{tr}\!\left(\rho(x)^{-1}\rho(x)\,\rho(g)\right) = \operatorname{tr}(\rho(g)) = \chi(g).$$
The characters of conjugate elements are identical! This is a direct consequence of the cyclic property. This allows us to sort all the symmetry operations of a group into a few "conjugacy classes" of physically indistinguishable operations. The table of characters for these classes is a kind of fingerprint for the symmetry group, and from it, chemists and physicists can predict which spectral lines are visible, which chemical reactions are allowed, and which quantum states are possible.
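A concrete sketch in the symmetry group of the square: conjugating the $90°$ rotation by a reflection yields the $270°$ rotation, and the two characters agree exactly as the derivation predicts:

```python
import numpy as np

def rot(theta):
    """2-D representation of a rotation by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

g = rot(np.pi / 2)                  # the 90-degree rotation
x = np.array([[1.0, 0.0],
              [0.0, -1.0]])         # reflection across the x-axis
h = x @ g @ np.linalg.inv(x)        # the conjugate element

# Conjugation by the reflection reverses the rotation: h is the 270° rotation.
print(np.allclose(h, rot(3 * np.pi / 2)))   # True

# The characters (traces) of the conjugate elements are identical.
print(np.trace(g), np.trace(h))
```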
The trace doesn't just classify—it decomposes. A vector space of matrices acted upon by a group can often be broken down into smaller, "irreducible" subspaces that don't mix with each other. The trace is the surgical tool for this decomposition. For instance, the 4-dimensional space of all $2 \times 2$ matrices, under the action of conjugation, splits beautifully into two invariant subspaces: the 1-dimensional space of scalar multiples of the identity matrix (which are left unchanged by conjugation), and the 3-dimensional space of matrices with trace zero. This is not an accident; it is a fundamental decomposition in physics, separating out the "trivial" part of a transformation from its more interesting, symmetry-breaking part.
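The decomposition itself is a one-line computation: subtract off the right multiple of the identity and what remains is traceless. A small sketch with an arbitrary example matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, -2.0]])

# Split A into a multiple of the identity plus a traceless remainder:
# A = (tr(A)/n) * I + (A - (tr(A)/n) * I), with n = 2 here.
scalar_part = (np.trace(A) / 2) * np.eye(2)
traceless_part = A - scalar_part

print(np.isclose(np.trace(traceless_part), 0))        # True
print(np.allclose(scalar_part + traceless_part, A))   # True
```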
Finally, the trace allows us to build a whole new perspective on the space of matrices itself. We can think of the set of all $n \times n$ matrices as a vector space. Can we define geometric concepts like "length" and "angle" in this space? Yes, by defining an inner product. A very natural choice is the Frobenius inner product:

$$\langle A, B \rangle = \operatorname{tr}(A^{\mathsf{T}} B).$$
Why the transpose $A^{\mathsf{T}}$? Because the "length squared" of a matrix $A$ would then be $\langle A, A \rangle = \operatorname{tr}(A^{\mathsf{T}} A)$. One can show this is equal to the sum of the squares of all of its elements, $\sum_{i,j} a_{ij}^2$, which is guaranteed to be non-negative and is zero only for the zero matrix. This well-behaved definition turns the abstract space of matrices into a familiar Euclidean-like space.
Once we have an inner product, we have geometry. We can apply powerful geometric theorems, like the Cauchy-Schwarz inequality, which states $|\langle u, v \rangle|^2 \le \langle u, u \rangle \langle v, v \rangle$. Translated into the language of matrices using our trace-based inner product, this becomes a non-obvious and powerful inequality:

$$\left( \operatorname{tr}(A^{\mathsf{T}} B) \right)^2 \le \operatorname{tr}(A^{\mathsf{T}} A) \, \operatorname{tr}(B^{\mathsf{T}} B).$$
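Both the "sum of squares" identity and the trace form of Cauchy-Schwarz can be sketched numerically (random matrices for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

inner = np.trace(A.T @ B)      # Frobenius inner product <A, B>
norm_A2 = np.trace(A.T @ A)    # ||A||^2 under this inner product
norm_B2 = np.trace(B.T @ B)

# <A, A> really is the sum of the squares of all entries of A:
print(np.isclose(norm_A2, np.sum(A**2)))   # True

# The Cauchy-Schwarz inequality in its trace form:
print(inner**2 <= norm_A2 * norm_B2)       # True
```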
This is a deep structural constraint on matrices, derived by viewing them as vectors in a space whose geometry is defined by the trace. The concept can be taken even one level higher. In modern quantum mechanics, one studies "superoperators"—linear maps that act on matrices themselves. Even for these exotic objects, one can define a trace, and its value can be found by cleverly applying the trace properties of the matrices they act upon.
From ensuring the stability of algorithms to revealing the objective nature of physical laws, from simplifying particle physics calculations to providing the foundation for the theory of symmetry, the cyclic property of the trace is a golden thread. It is a prime example of the physicist's way of thinking: find a simple, fundamental truth and follow its consequences wherever they may lead. You may be surprised to find that a simple shuffle of matrices holds the key to understanding the deep structure of the universe.