Popular Science

Matrix of a Linear Transformation

SciencePedia
  • The matrix of a linear transformation is fundamentally constructed by using the transformed basis vectors as its columns.
  • Matrix multiplication directly corresponds to the composition of linear transformations, where the order of multiplication reflects the order of application.
  • Key properties like the determinant and trace are invariants that reveal the intrinsic nature of a transformation, regardless of the chosen basis.
  • The concept of representing transformations with matrices extends from geometric vectors to abstract vector spaces, including polynomials, functions, and quantum states.

Introduction

How can we describe a complex action, like a rotation in three-dimensional space or a projection onto a plane, in a simple, computable way? This is a central question in fields ranging from computer graphics to physics. The answer lies in one of the most powerful ideas in linear algebra: representing these actions, known as linear transformations, with a grid of numbers called a matrix. This matrix acts as the complete genetic code for the transformation, capturing its entire behavior in a compact form.

This article demystifies the connection between the abstract concept of a transformation and its concrete matrix representation. It addresses the fundamental problem of how to translate a geometric or algebraic action into a matrix, and how to use that matrix to understand the action's deepest properties.

Across the following sections, you will discover the core principles behind this powerful tool. The "Principles and Mechanisms" section will explain how to build the matrix of any linear transformation by simply observing what it does to a set of fundamental building blocks called basis vectors. You will also learn how the algebra of matrices elegantly mirrors the composition of transformations. Following that, the "Applications and Interdisciplinary Connections" section will take you on a journey to see this concept in action, revealing how matrices form the language of geometry, describe physical laws, and even provide a framework for understanding the strange world of quantum mechanics.

Principles and Mechanisms

Imagine you have a magical machine that can stretch, squeeze, rotate, or otherwise alter any object you put inside it. How would you describe what this machine does? You could try to list its effect on every possible object, but that would be an infinite task. A far cleverer approach would be to test it on a few fundamental building blocks. If you know how the machine transforms these basic components, you can predict its effect on any object, because every object is just a combination of these components.

This is the central idea behind the matrix of a linear transformation. A linear transformation is a special, well-behaved kind of "machine" that operates on vectors. And its entire, elaborate behavior can be captured in a simple grid of numbers: a matrix. The secret lies in understanding what the transformation does to a special set of vectors called a basis.

The Genetic Code of a Transformation

In the familiar world of two- or three-dimensional space, the most convenient building blocks are the standard basis vectors. These are the vectors of length one that point along the coordinate axes. In 3D space, $\mathbb{R}^3$, they are $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, 1, 0)$, and $\mathbf{e}_3 = (0, 0, 1)$. Any vector $\mathbf{x} = (x_1, x_2, x_3)$ can be written as a combination of these: $\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3$.

Because a linear transformation $T$ is, well, linear, its action on any vector $\mathbf{x}$ is determined by its action on the basis vectors:

$$T(\mathbf{x}) = T(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + x_3 T(\mathbf{e}_3)$$

This equation holds a beautiful secret. It tells us that if we just know the three vectors $T(\mathbf{e}_1)$, $T(\mathbf{e}_2)$, and $T(\mathbf{e}_3)$, we know everything about the transformation. This is where the matrix comes in. The standard matrix of a linear transformation is nothing more than a neat package containing this exact information. Its columns are precisely the transformed basis vectors.

Imagine a computer graphics program that projects a 3D scene onto your 2D screen. This is a linear transformation $T: \mathbb{R}^3 \to \mathbb{R}^2$. Suppose we find that the basis vectors are mapped as follows: $T(\mathbf{e}_1) = (1, 1)$, $T(\mathbf{e}_2) = (-1, 1)$, and $T(\mathbf{e}_3) = (2, 0)$. To build the matrix $A$ for this transformation, we simply arrange these output vectors as its columns:

$$A = \begin{pmatrix} 1 & -1 & 2 \\ 1 & 1 & 0 \end{pmatrix}$$

This simple matrix now holds the complete "genetic code" for the projection. To find where any 3D point $\mathbf{x} = (x_1, x_2, x_3)$ lands on the screen, we just multiply it by this matrix: $A\mathbf{x}$.
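To make the column-building recipe concrete, here is a minimal sketch in plain Python that assembles this projection matrix from the transformed basis vectors and applies it to a sample point; the point $(2, 3, 1)$ is made up for illustration.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Columns of A are the transformed basis vectors:
# T(e1) = (1, 1), T(e2) = (-1, 1), T(e3) = (2, 0).
A = [[1, -1, 2],
     [1,  1, 0]]

# Where does the (arbitrary) 3D point (2, 3, 1) land on the 2D screen?
print(matvec(A, [2, 3, 1]))  # [1, 5]
```

Note that feeding in a basis vector simply reads a column back out: `matvec(A, [1, 0, 0])` returns `[1, 1]`.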

This principle is wonderfully general. Consider a transformation that squishes 3D space flat by projecting every vector onto the $xy$-plane. A vector $(x, y, z)$ becomes $(x, y, 0)$. What does this do to our basis vectors?

  • $T(\mathbf{e}_1) = T(1,0,0) = (1,0,0)$
  • $T(\mathbf{e}_2) = T(0,1,0) = (0,1,0)$
  • $T(\mathbf{e}_3) = T(0,0,1) = (0,0,0)$

The resulting matrix is a testament to the simplicity of the idea:

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

Similarly, a shear transformation in 2D that fixes the x-axis but pushes the y-axis sideways might map $\mathbf{e}_1$ to itself, $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, and $\mathbf{e}_2$ to $\begin{pmatrix} 3 \\ 1 \end{pmatrix}$. The matrix is instantly revealed:

$$A = \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}$$

In every case, the rule is the same: to find the matrix, just ask what the transformation does to the standard basis vectors, and write down the answers.
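A quick sketch to check both matrices above: the projection kills the $z$-component, and the shear's columns are exactly the images of $\mathbf{e}_1$ and $\mathbf{e}_2$ (plain Python; the sample vectors are made up).

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Projection onto the xy-plane.
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
print(matvec(P, [4, 7, 9]))   # [4, 7, 0] -- the z-component is annihilated

# Shear fixing the x-axis.
S = [[1, 3],
     [0, 1]]
print(matvec(S, [1, 0]))      # [1, 0] -- e1 is fixed
print(matvec(S, [0, 1]))      # [3, 1] -- e2 is pushed sideways
```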

A Universal Language

So far, we have been thinking of vectors as arrows in space. But the power of linear algebra is that the concept of a "vector" is far broader. A vector can be a polynomial, a function, a sound wave—anything that belongs to a set where you can meaningfully add elements and multiply them by scalars. This set is called a vector space.

The amazing thing is that our method for building matrices works in these abstract spaces, too. Consider the space of simple polynomials of degree at most 1, like $p(t) = a_0 + a_1 t$. The polynomials $1$ and $t$ can act as our basis vectors, much like $\mathbf{e}_1$ and $\mathbf{e}_2$. Let's define a transformation $L$ that takes a polynomial and shifts it, then subtracts the original: $L(p(t)) = p(t+c) - p(t)$, for some constant $c$. This is a kind of discrete derivative.

To find the matrix for $L$, we apply it to our basis "vectors," $\{1, t\}$:

  1. For the basis vector $1$ (i.e., $p(t) = 1$): $L(1) = 1 - 1 = 0$. In our basis, this is $0 \cdot (1) + 0 \cdot (t)$, so its coordinate vector is $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$.
  2. For the basis vector $t$ (i.e., $p(t) = t$): $L(t) = (t+c) - t = c$. In our basis, this is $c \cdot (1) + 0 \cdot (t)$, so its coordinate vector is $\begin{pmatrix} c \\ 0 \end{pmatrix}$.

Arranging these coordinate vectors as columns gives the matrix for this "shift-and-subtract" operator:

$$[L]_{\mathcal{B}} = \begin{pmatrix} 0 & c \\ 0 & 0 \end{pmatrix}$$

This matrix now does for polynomials what our other matrices did for geometric vectors. The same principle applies even when mapping between different types of spaces, for instance, from a space of polynomials to the geometric space $\mathbb{R}^3$, as one might do in a signal processing application. The method is universal: find a basis, see where the transformation sends the basis elements, and write down the coordinates.
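The same arithmetic works for the polynomial operator once we swap polynomials for their coordinate vectors in the basis $\{1, t\}$. A minimal sketch; the choice $c = 3$ and the sample polynomial $2 + 5t$ are made up for illustration.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

c = 3  # arbitrary shift constant

# Matrix of L(p(t)) = p(t + c) - p(t) in the basis {1, t}.
L = [[0, c],
     [0, 0]]

# p(t) = 2 + 5t has coordinates [2, 5].
# Directly: L(p) = (2 + 5(t + 3)) - (2 + 5t) = 15, i.e. coordinates [15, 0].
print(matvec(L, [2, 5]))  # [15, 0]
```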

The Algebra of Action

If transformations can be written as matrices, what does it mean to multiply two matrices? The answer is one of the most elegant ideas in mathematics: matrix multiplication corresponds to composition of transformations.

Suppose you apply one transformation, $T_1$, and then you apply another, $T_2$, to the result. This combined operation, "do $T_1$, then do $T_2$," is itself a new linear transformation, $T = T_2 \circ T_1$. If $T_1$ is represented by matrix $A_1$ and $T_2$ by $A_2$, then the new transformation $T$ is represented by the matrix product $A = A_2 A_1$. (Note the order: it reflects the order of application.)

Let's see this in action. Consider a two-step process in $\mathbb{R}^3$:

  1. First, project a vector onto the $xy$-plane. We already know the matrix for this; let's call it $P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.
  2. Second, rotate the resulting vector counter-clockwise around the $z$-axis by an angle $\theta$. The matrix for this rotation is $R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$.

The matrix for the entire composite transformation is simply the product $R_z(\theta) P$:

$$[T] = R_z(\theta) P = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

This matrix elegantly captures the two-step process. The top-left $2 \times 2$ block performs the rotation, and the row and column of zeros ensure that the $z$-component is annihilated, just as we'd expect from the initial projection. The abstract algebra of multiplying matrices perfectly mirrors the concrete geometry of combining actions.
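We can verify numerically that multiplying the matrices and composing the actions agree: project first, then rotate, and compare with applying the single product matrix once. A plain-Python sketch; the angle and test vector are arbitrary.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

theta = math.pi / 3  # arbitrary angle
Rz = [[math.cos(theta), -math.sin(theta), 0],
      [math.sin(theta),  math.cos(theta), 0],
      [0,                0,               1]]
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]

T = matmul(Rz, P)            # one matrix for the whole pipeline
x = [1.0, 2.0, 5.0]          # arbitrary test vector

two_steps = matvec(Rz, matvec(P, x))   # project, then rotate
one_step  = matvec(T, x)
print(all(abs(a - b) < 1e-12 for a, b in zip(one_step, two_steps)))  # True
```

The bottom row of `T` comes out as all zeros, matching the product computed by hand above.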

The Matrix as an Oracle

A matrix is more than just a recipe for a transformation; it's an oracle. Once you have it, you can ask it profound questions about the nature of the transformation.

One such question is: "By how much do you scale space?" A transformation might stretch or shrink space, changing the area of a square or the volume of a cube. This scaling factor is captured by a single number called the determinant. For a 2D transformation that maps the basis vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ to the vectors $\begin{pmatrix} a \\ c \end{pmatrix}$ and $\begin{pmatrix} b \\ d \end{pmatrix}$, the matrix is $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The unit square defined by $\mathbf{e}_1$ and $\mathbf{e}_2$ is transformed into a parallelogram. The area of this new parallelogram is given by the absolute value of the determinant, $\det(A) = ad - bc$. A positive determinant means the orientation is preserved, while a negative one means it has been flipped (like looking in a mirror).
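A short check of the area interpretation: the shear from earlier has determinant 1 (it slides areas around without changing them), a pure scaling multiplies area by the product of its stretch factors, and a mirror flip gives a negative determinant. The example matrices are made up.

```python
def det2(A):
    """Determinant ad - bc of a 2x2 matrix."""
    (a, b), (c, d) = A
    return a * d - b * c

print(det2([[1, 3], [0, 1]]))    # 1  -- shear: area preserved
print(det2([[2, 0], [0, 3]]))    # 6  -- area stretched by 2 * 3
print(det2([[1, 0], [0, -1]]))   # -1 -- reflection: orientation flipped
```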

Another crucial question is: "What is the dimension of your output?" A transformation might take a high-dimensional space and squash it into a lower-dimensional one. For example, projecting a 3D world onto a 2D screen. The dimension of the resulting image or range of the transformation is called its rank. This rank is encoded directly in the transformation's matrix.

Consider a projection of all of 3D space onto the plane defined by $x + y + z = 0$. The input space is 3-dimensional, but what is the output? It's the plane itself! A plane is a 2-dimensional object. Therefore, the rank of this transformation must be 2. We don't even need to calculate the matrix to know this; the geometry tells us the answer. The rank tells us how many dimensions "survive" the transformation. The dimensions that get squashed to zero form the "null space," and the Rank-Nullity Theorem gives us the beautiful relation:

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = \text{dimension of the input space}$$
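The rank can also be read off computationally by row-reducing the matrix and counting pivots. Below is a minimal Gaussian-elimination sketch, checked against the $xy$-plane projection from earlier, whose rank is 2 and whose nullity is therefore $3 - 2 = 1$.

```python
def rank(A, tol=1e-9):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    A = [row[:] for row in A]          # work on a copy
    n_rows, n_cols = len(A), len(A[0])
    r = 0                              # pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if abs(A[i][c]) > tol), None)
        if pivot is None:
            continue                   # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(n_rows):
            if i != r:
                f = A[i][c] / A[r][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
    return r

P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # projection onto the xy-plane
print(rank(P))                          # 2 -- so nullity = 3 - 2 = 1
```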

The Freedom of Perspective

There is a subtle but crucial distinction to be made: the transformation itself is a pure, abstract concept (e.g., "rotate by 30 degrees"), while its matrix is a description of that concept. And like any description, it depends on your point of view—that is, your choice of basis. If you describe the same rotation using a different set of basis vectors, you will get a different matrix.

So what stays the same? What are the intrinsic properties of the transformation, independent of our description? These are the invariants. The determinant is one such invariant. No matter which basis you use to write the matrix for a rotation, the determinant will always be 1, because a pure rotation doesn't change area or volume.

Another important invariant is the trace of a matrix: the sum of the elements on its main diagonal. Suppose you have a transformation $T$ represented by matrix $A$ in the standard basis. If you switch to a new basis $\mathcal{B}$, the new matrix will be $A' = P^{-1}AP$, where $P$ is the change-of-basis matrix. While $A$ and $A'$ can look very different, their traces will be identical: $\operatorname{Tr}(A') = \operatorname{Tr}(A)$. This tells us that the trace is a property of the transformation $T$ itself, not of the particular matrix we use to write it down.
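A quick numerical check of this invariance, with a made-up matrix $A$ and a made-up invertible change-of-basis matrix $P$: the similar matrix $P^{-1}AP$ looks different but has the same trace.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(P):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[2, 1], [0, 3]]   # arbitrary matrix of some transformation
P = [[1, 1], [0, 1]]   # arbitrary change of basis

A_prime = matmul(inv2(P), matmul(A, P))
print(trace(A_prime) == trace(A))  # True -- the trace survives the change of basis
```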

This freedom of perspective is also a source of power. Sometimes we don't know what a transformation does to the standard basis, but we do know its effect on some other, more convenient basis vectors. For example, a materials scientist might observe that a deformation $L$ maps a vector $\vec{v}_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ to $\begin{pmatrix} 1 \\ 5 \end{pmatrix}$ and $\vec{v}_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ to $\begin{pmatrix} 4 \\ -1 \end{pmatrix}$. The set $\{\vec{v}_1, \vec{v}_2\}$ forms a basis. By understanding the principles of how matrices change with the basis, we can work backward to find the standard matrix $A$ that describes this deformation.
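Working backward is just matrix algebra: if $B$ has the known basis vectors as columns and $M$ has their images as columns, then $AB = M$, so $A = MB^{-1}$. A sketch using the deformation numbers from the text:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(P):
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[2, -1],   # columns: v1 = (2, 1) and v2 = (-1, 1)
     [1,  1]]
M = [[1,  4],   # columns: L(v1) = (1, 5) and L(v2) = (4, -1)
     [5, -1]]

A = matmul(M, inv2(B))   # A satisfies A B = M
print([[round(x, 10) for x in row] for row in A])  # [[-1.0, 3.0], [2.0, 1.0]]
```

As a sanity check, this $A$ really does send $(2, 1)$ to $(1, 5)$ and $(-1, 1)$ to $(4, -1)$.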

The matrix of a linear transformation is therefore not just a computational tool. It is a bridge between the algebra of numbers and the geometry of space. It encodes the fundamental actions of a transformation, allows us to compose them, and serves as an oracle for revealing their deepest properties—properties that remain constant no matter how we choose to look at them.

Applications and Interdisciplinary Connections

So, we have discovered this wonderful trick for capturing the essence of a linear transformation—this twisting, stretching, or rotating of space—and bottling it up into a neat, rectangular array of numbers called a matrix. You might be tempted to think this is just a convenient piece of bookkeeping, a compact notation to make our calculations tidy. But that would be like saying a musical score is just a tidy way of storing notes. The real magic happens when you play the music! The matrix of a linear transformation is not just a description; it is a tool, a master key that unlocks profound connections between seemingly disparate worlds, from the pure geometry of space to the deepest mysteries of quantum physics and abstract mathematics. Let's take a journey and see where this key takes us.

The Symphony of Space and Geometry

The most intuitive place to start is with the space we live in. We can think of matrices as the grammar of geometry. Simple actions like a reflection or a rotation each have their own matrix. But what if we want to do one thing after another? Suppose we want to first reflect a vector across the x-axis, and then project it onto the line $y = x$. Each step is a transformation with its own matrix. The astonishingly simple and powerful truth is that the matrix of the combined transformation is just the product of the individual matrices. The order matters, of course, just as putting on your socks and then your shoes is quite different from the reverse! This idea that composing actions is equivalent to multiplying matrices gives us an incredibly powerful language for describing complex geometric sequences.

This language isn't confined to the flatland of a two-dimensional plane. It soars into three dimensions with equal, if not greater, elegance. Imagine wanting to perform a reflection across an arbitrarily tilted plane in space. You could try to track what happens to every point with complicated trigonometry, but it would be a nightmare. Instead, linear algebra gives us a breathtakingly beautiful formula. If the plane through the origin is defined by a normal vector $\mathbf{n}$, the matrix for the reflection is simply $I - 2\frac{\mathbf{n}\mathbf{n}^T}{\mathbf{n}^T\mathbf{n}}$. Look at that! The entire, complex spatial operation is captured in a compact expression built from the identity matrix $I$ and the vector $\mathbf{n}$ itself. In a similar vein, the act of projecting a vector onto a line or a plane—a fundamental operation in computer graphics, data analysis, and engineering—also has a beautifully simple matrix form: for a line, $A = \frac{\mathbf{v}\mathbf{v}^T}{\mathbf{v}^T\mathbf{v}}$, where $\mathbf{v}$ is the vector defining the line. These are not just formulas; they are poems written in the language of mathematics.
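Here is a small sketch of the reflection formula, using the made-up normal $\mathbf{n} = (1, 1, 1)$: the resulting matrix sends $\mathbf{n}$ to $-\mathbf{n}$ and leaves vectors in the plane $x + y + z = 0$ untouched.

```python
def reflection_matrix(n):
    """Reflection across the plane through the origin with normal n:
    I - 2 n n^T / (n^T n)."""
    d = len(n)
    nn = sum(c * c for c in n)  # n^T n
    return [[(1 if i == j else 0) - 2 * n[i] * n[j] / nn
             for j in range(d)] for i in range(d)]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

R = reflection_matrix([1, 1, 1])
print([round(v, 10) for v in matvec(R, [1, 1, 1])])   # [-1.0, -1.0, -1.0]
print([round(v, 10) for v in matvec(R, [1, -1, 0])])  # [1.0, -1.0, 0.0]
```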

Sometimes, we define a transformation not by what it does geometrically, but by its fundamental properties—what it preserves and what it annihilates. Imagine a transformation that crushes an entire plane of vectors down to the origin (this is its kernel) while projecting everything onto a single line (this is its image). By specifying these two abstract properties, the transformation is uniquely defined. And when we construct its matrix, we might find a familiar friend. For instance, the transformation whose kernel is the plane $x + y + z = 0$ and whose image is the line spanned by the vector $(1, 1, 1)$ turns out to be nothing more than the orthogonal projection onto that very line. This reveals a deep connection between the algebraic structure (kernel and image) and the geometric action (projection) of a transformation.
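We can check this identification numerically: the projection matrix onto the line spanned by $\mathbf{v} = (1, 1, 1)$ crushes the plane $x + y + z = 0$ (its kernel) and sends everything onto that line (its image). A minimal sketch; the test vectors are made up.

```python
def projection_matrix(v):
    """Orthogonal projection onto the line spanned by v: v v^T / (v^T v)."""
    vv = sum(c * c for c in v)
    return [[vi * vj / vv for vj in v] for vi in v]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

P = projection_matrix([1, 1, 1])

# Kernel: anything in the plane x + y + z = 0 is crushed to the origin.
print(matvec(P, [1, -1, 0]))   # [0.0, 0.0, 0.0]
# Image: every output lies on the line spanned by (1, 1, 1).
print(matvec(P, [6, 0, 0]))    # [2.0, 2.0, 2.0]
```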

The power of matrices also shines when we wish to change our point of view. The world doesn't always come arranged on a neat Cartesian grid. Suppose you have a skewed grid, a parallelogram, and you want to transform it into a perfect unit square. This is equivalent to finding a transformation that maps the vectors defining the parallelogram to the standard basis vectors. This "un-skewing" operation, crucial in fields like computer graphics for texture mapping, can be found by constructing a matrix from the skewed vectors and simply inverting it.
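A sketch of the "un-skewing" idea: put the parallelogram's edge vectors in the columns of a matrix $B$, and $B^{-1}$ is exactly the map sending them back to $\mathbf{e}_1$ and $\mathbf{e}_2$. The edge vectors here are made up.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inv2(B):
    (a, b), (c, d) = B
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[2, 1],   # columns: the parallelogram's edges (2, 0) and (1, 1)
     [0, 1]]
U = inv2(B)    # the "un-skewing" transformation

print(matvec(U, [2, 0]))  # [1.0, 0.0] -- first edge becomes e1
print(matvec(U, [1, 1]))  # [0.0, 1.0] -- second edge becomes e2
```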

Physics, Forces, and Fundamental Directions

The universe, as far as we can tell, plays by mathematical rules, and many of them are beautifully linear. Consider the cross product, an operation central to describing rotation, torque, and the magnetic force. The operation "take the cross product with a fixed vector $\mathbf{a}$" is itself a linear transformation. This means there must be a matrix that performs the cross product—a machine that, when you feed it a vector $\mathbf{v}$, spits out $\mathbf{a} \times \mathbf{v}$. This matrix, known as a skew-symmetric matrix, provides a bridge between vector algebra and matrix algebra, allowing physical laws involving rotations and angular momentum to be expressed in the powerful framework of linear transformations.
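The skew-symmetric matrix for "cross product with $\mathbf{a}$" can be written down directly from the components of $\mathbf{a}$; the sketch below checks it against a hand-computed cross product (the sample vectors are made up).

```python
def cross_matrix(a):
    """Skew-symmetric matrix [a]_x such that [a]_x v = a x v."""
    a1, a2, a3 = a
    return [[  0, -a3,  a2],
            [ a3,   0, -a1],
            [-a2,  a1,   0]]

def matvec(A, x):
    return [sum(ai * xi for ai, xi in zip(row, x)) for row in A]

def cross(a, v):
    return [a[1] * v[2] - a[2] * v[1],
            a[2] * v[0] - a[0] * v[2],
            a[0] * v[1] - a[1] * v[0]]

a, v = [1, 2, 3], [4, 5, 6]
print(matvec(cross_matrix(a), v))  # [-3, 6, -3]
print(cross(a, v))                 # [-3, 6, -3] -- same answer
```

Note also that `matvec(cross_matrix(a), a)` gives the zero vector, as it must, since a vector crossed with itself vanishes.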

Perhaps one of the most profound applications in physics (and beyond) comes from asking a simple question: for a given transformation, are there any special directions? Are there vectors that, when transformed, don't change their direction but are simply scaled—stretched or shrunk? These special vectors are called eigenvectors, and the scaling factors are their eigenvalues. Imagine a transformation that stretches everything along one line by a factor of 3 and compresses everything along a perpendicular line by a factor of $\frac{1}{3}$. Any other vector will be twisted and moved in a complicated way. But for vectors on these two special lines, the action is simple scaling. These lines represent the "axes" of the transformation. Knowing them allows us to understand the transformation at its deepest level. This concept is absolutely central to physics and engineering. It describes the principal axes of stress in a material, the normal modes of a vibrating system (like a guitar string or a bridge), and the energy levels of an atom.
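For the stretch-by-3, compress-by-$\frac{1}{3}$ picture, take the made-up symmetric matrix below: its special directions are $(1, 1)$ and $(1, -1)$, and along each one the action is pure scaling, $A\mathbf{v} = \lambda\mathbf{v}$.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# A made-up transformation that stretches by 3 along the line y = x
# and compresses by 1/3 along the perpendicular line y = -x.
A = [[5/3, 4/3],
     [4/3, 5/3]]

print(matvec(A, [1, 1]))    # [3.0, 3.0] -- eigenvector with eigenvalue 3
print(matvec(A, [1, -1]))   # scaled by 1/3 -- eigenvector with eigenvalue 1/3
print(matvec(A, [1, 0]))    # a generic vector is twisted off its line
```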

Journeys into Abstract Realms

The true power of a great idea is its generality. The concept of a vector is not limited to arrows in space, and the matrix of a linear transformation is not just for geometry. A vector space can be a collection of anything that you can add together and scale by numbers—like polynomials, functions, or even other matrices!

Consider the space of all $2 \times 2$ matrices. The matrices themselves are now our "vectors." An operation like taking the transpose of a matrix, where you flip it across its main diagonal, is a linear transformation on this space. So, it must have a matrix representation! We can construct a $4 \times 4$ matrix that, when it acts on the coordinates of a $2 \times 2$ matrix, gives the coordinates of its transpose. It's a bit mind-bending—a matrix that represents an operation on other matrices—but it shows how the framework of linear algebra can be applied to its own objects, a beautiful self-reference.
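A sketch of the transpose-as-a-matrix idea: flatten a $2 \times 2$ matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ into the coordinate vector $(a, b, c, d)$; transposing just swaps the $b$ and $c$ slots, which is a $4 \times 4$ permutation matrix.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Transpose as a linear map on coordinates (a, b, c, d) of [[a, b], [c, d]]:
# it swaps b and c, so its matrix is a 4x4 permutation.
T4 = [[1, 0, 0, 0],
      [0, 0, 1, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 1]]

coords = [1, 2, 3, 4]       # the matrix [[1, 2], [3, 4]], flattened row by row
print(matvec(T4, coords))   # [1, 3, 2, 4] -- the coordinates of its transpose
```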

This journey into abstraction takes us to one of the crown jewels of modern science: quantum mechanics. In the quantum world, the state of a particle, like the spin of an electron, is not described by numbers like position and velocity, but by a vector in a complex vector space—a space where the scalars are complex numbers. Physical operations—like measuring a particle's spin along a certain axis or letting it evolve in time—are linear transformations. The transformation $T(z_1, z_2) = (iz_2, -iz_1)$ is a simple example. Its matrix, $\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$, is directly related to a Pauli matrix, which represents the measurement of spin for a quantum bit, or "qubit"—the fundamental building block of a quantum computer. The weirdness of quantum mechanics is, in many ways, the linear algebra of complex vector spaces made real.
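Python's built-in complex numbers let us check this directly: the matrix of $T$ has columns $T(1, 0) = (0, -i)$ and $T(0, 1) = (i, 0)$, and applying it twice returns any state to itself, as expected of an operator that squares to the identity. (The sample state is made up and not normalized.)

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Matrix of T(z1, z2) = (i z2, -i z1); 1j is Python's imaginary unit.
M = [[0,   1j],
     [-1j, 0]]

state = [1 + 2j, 3 - 1j]          # an arbitrary (unnormalized) qubit state
once  = matvec(M, state)
twice = matvec(M, once)
print(once)                        # [(1+3j), (2-1j)]
print(twice == state)              # True -- applying T twice is the identity
```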

Finally, we arrive at the highest level of abstraction, in the realm of pure mathematics itself. Consider a field extension like $\mathbb{Q}(\sqrt{7})$, which consists of all numbers of the form $a + b\sqrt{7}$ where $a$ and $b$ are rational. This set is not just a field; it can be viewed as a two-dimensional vector space over the rational numbers, with basis vectors $1$ and $\sqrt{7}$. What happens when you multiply any number in this space by, say, $3 - 2\sqrt{7}$? This act of multiplication is a linear transformation! And, like any other, it can be represented by a matrix—in this case, a simple $2 \times 2$ matrix with rational entries. Isn't that remarkable? An operation from pure number theory can be perfectly modeled by matrix multiplication. This is the birth of representation theory, a vast and beautiful subject that uses the concrete tools of linear algebra to study abstract algebraic structures, revealing their hidden symmetries and structures.
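A sketch of the number-theory example: in the basis $\{1, \sqrt{7}\}$, the number $x + y\sqrt{7}$ has coordinates $(x, y)$, and since $(3 - 2\sqrt{7})(x + y\sqrt{7}) = (3x - 14y) + (-2x + 3y)\sqrt{7}$, multiplying by $3 - 2\sqrt{7}$ becomes multiplication by a fixed $2 \times 2$ rational matrix. The sample element $5 + 2\sqrt{7}$ is made up.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Multiplication by 3 - 2*sqrt(7), acting on coordinates (x, y) of x + y*sqrt(7).
M = [[ 3, -14],
     [-2,   3]]

x, y = 5, 2                       # the element 5 + 2*sqrt(7)
u, v = matvec(M, [x, y])          # coordinates of the product

s7 = math.sqrt(7)
direct = (3 - 2 * s7) * (x + y * s7)      # the same product, computed numerically
print(u, v)                                # -13 -4
print(abs((u + v * s7) - direct) < 1e-9)   # True -- the matrix got it right
```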

From the familiar geometry of our world to the bizarre rules of the quantum realm and the intricate patterns of pure mathematics, the matrix of a linear transformation is a unifying thread. It is a testament to the fact that in science, finding the right language, the right representation, is often the key to unlocking a deeper understanding of the universe.