
Underdetermined System

SciencePedia
Key Takeaways
  • An underdetermined system has more variables than independent equations, leading to an infinite set of valid solutions that form a line, plane, or higher-dimensional affine subspace.
  • The minimum norm (L2) principle resolves this ambiguity by selecting the unique solution with the smallest Euclidean length, corresponding to the "lowest energy" state.
  • The sparsity (L1) principle finds the "simplest" solution with the maximum number of zero components, a concept that powers modern technologies like compressed sensing in MRI.
  • The choice between these principles is driven by assumptions about the underlying nature of the problem, whether the desired solution is smooth and distributed (L2) or simple and sparse (L1).

Introduction

What happens when a problem presents us with more unknowns than independent pieces of information? We enter the fascinating realm of the underdetermined system, a scenario where a single, unique answer is replaced by an entire universe of possibilities. Far from being a mathematical failure, this abundance of solutions is a common feature in science, engineering, and economics, reflecting the complexity and ambiguity inherent in the real world. The central challenge then becomes not finding a solution, but choosing the best or most meaningful one from an infinite set. This article navigates that challenge by exploring the guiding principles that allow us to turn ambiguity into insight.

The first section, ​​Principles and Mechanisms​​, will lay the foundation by delving into the mathematical structure of underdetermined systems. We will use geometric intuition and algebraic concepts like the rank-nullity theorem to understand why infinite solutions arise and how they are structured. It will introduce two powerful philosophies for selecting a unique solution: the principle of minimum norm, which seeks the most "energy-efficient" answer, and the principle of sparsity, which favors the simplest explanation. Following this, the section on ​​Applications and Interdisciplinary Connections​​ will demonstrate how these abstract principles are applied to solve concrete problems. We will see how choosing the right kind of solution enables technologies like faster MRI scans, informs financial asset pricing, and reveals the limits of scientific measurement, transforming a mathematical puzzle into a powerful framework for discovery.

Principles and Mechanisms

Imagine you are standing in a vast, flat desert. Your friend, who is in a satellite, tells you, "You are exactly 5 kilometers from the Oasis." This single piece of information isn't enough to tell you where the Oasis is. It could be anywhere on a circle with a 5-kilometer radius around you. You have one equation (your distance) but two unknowns (the north-south and east-west coordinates of the Oasis). You have an infinite number of possibilities. Now, suppose a second friend in another satellite gives you another piece of information that, unfortunately, is just a rephrasing of the first. You're still stuck with infinite possibilities. This, in a nutshell, is the delightful predicament of an ​​underdetermined system​​. It's a system that doesn't provide enough independent information to pin down a single, unique answer. Instead, it offers us a whole universe of valid solutions.

The Freedom of Infinity

Let's move from a desert to the abstract world of algebra and geometry. A linear equation with three variables, like $ax_1 + bx_2 + cx_3 = d$, can be visualized as a flat plane in three-dimensional space. A system of two such equations, then, corresponds to two planes. The solution to the system is the set of all points that lie on both planes simultaneously—their intersection.

So, what happens when you intersect two planes in 3D space? Think about it. If the planes are not parallel, they must intersect along a straight line. A line contains infinitely many points. If the two planes happen to be parallel and distinct, they never meet, and there is no solution. If they are the same plane (disguised as two different equations), their "intersection" is the entire plane itself—again, infinitely many solutions. In no case do two planes intersect at a single, unique point. To trap a single point in 3D, you need at least three planes intersecting, just as you need at least two distinct lines to define a point in a 2D plane.

This geometric picture reveals a fundamental truth. A system with more variables ($n$) than independent equations ($m$) cannot corner a unique solution. Algebraically, this is explained by the ​​rank-nullity theorem​​. For a system written as $Ax = b$, where $A$ is the $m \times n$ matrix of coefficients, the theorem states that the rank of the matrix (the number of independent equations) plus the dimension of its ​​null space​​ must equal the number of variables, $n$. The null space is the collection of all vectors $x_h$ for which $Ax_h = 0$. They are "ghost" solutions that produce an output of zero.

When we have fewer equations than variables ($m < n$), the rank of $A$ can be at most $m$. This forces the dimension of the null space to be at least $n - m$, which is greater than zero. A null space with a dimension greater than zero contains infinitely many vectors. Now, if we find just one particular solution, let's call it $x_p$, that satisfies $Ax_p = b$, we can add any vector $x_h$ from the null space to it, and the result is still a valid solution: $A(x_p + x_h) = Ax_p + Ax_h = b + 0 = b$. Thus, the existence of one solution automatically implies the existence of an entire family of solutions, all living on a line, a plane, or a higher-dimensional equivalent called an affine subspace.

Charting the Solution Space

Let's make this concrete. Consider the system:

$$\begin{cases} x_1 + x_3 = 1 \\ x_2 + x_3 = 1 \end{cases}$$

This is a system of two equations ($m = 2$) in three variables ($n = 3$). We can see that $x_1$ and $x_2$ are constrained, but we have some freedom with $x_3$. Let's call $x_3$ our "free parameter" and set it to any value we like, say $t$. Then we immediately get $x_1 = 1 - t$ and $x_2 = 1 - t$. The complete solution set can be written as a vector that depends on $t$:

$$x(t) = \begin{pmatrix} 1 - t \\ 1 - t \\ t \end{pmatrix}$$

We can rewrite this to reveal its structure:

$$x(t) = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}$$

This is the equation of a line in 3D space. The vector $x_p = (1, 1, 0)^T$ is a ​​particular solution​​ (it's what you get when $t = 0$), and the vector $x_h = (-1, -1, 1)^T$ is a basis for the one-dimensional null space. Any multiple of $x_h$ can be added to $x_p$ without changing the outcome. The set of all possible solutions is this line, stretching to infinity in both directions.
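This structure is easy to verify numerically. The sketch below (using NumPy; the arrays simply transcribe the example above) checks the rank-nullity count and confirms that every point on the line solves the system:

```python
import numpy as np

# The example system: x1 + x3 = 1, x2 + x3 = 1  (m = 2 equations, n = 3 unknowns)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])

# Rank-nullity: rank(A) + dim(null space) = n
rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank
assert rank == 2 and nullity == 1

# Particular solution (t = 0) and a basis vector for the null space
x_p = np.array([1.0, 1.0, 0.0])
x_h = np.array([-1.0, -1.0, 1.0])
assert np.allclose(A @ x_h, 0)  # "ghost" solution: A maps it to zero

# Every point x_p + t*x_h on the line solves the system
for t in (-3.0, 0.0, 0.5, 10.0):
    assert np.allclose(A @ (x_p + t * x_h), b)
```

Sliding $t$ over any value, the residual stays exactly zero: the equations cannot distinguish one point on the line from another.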

The Quest for the "Best" Answer

Having an infinity of answers is both a blessing and a curse. The system has given us a space of possibilities, but now we face a new problem: which one do we choose? In science and engineering, this is not a trivial question. The choice of a solution often reflects an underlying physical principle or a desired property. We need an additional criterion, a guiding light to navigate the infinite sea of solutions and pick out the one that is most "meaningful." Two of the most powerful and widely used guiding principles are the principle of minimum norm and the principle of sparsity.

The Shortest Path: The Principle of Minimum Norm

One very natural and elegant idea is to choose the solution that is, in a sense, the "smallest." This is often motivated by a physical "principle of minimum energy," where the size of the solution vector corresponds to some cost or energy that we wish to minimize. The most common way to measure the "size" of a vector $x = (x_1, x_2, \dots, x_n)$ is its standard Euclidean length, or ​​$\ell_2$-norm​​, defined as $\|x\|_2 = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$. Minimizing this norm is equivalent to finding the point on the solution line (or plane) that is closest to the origin.

There is a beautiful geometric insight here. The solution vector with the minimum possible norm is orthogonal to the direction of the solution line or plane. Our solution set is $x = x_p + x_h$, where $x_h$ is any vector from the null space. The null space defines the "direction" of the solution line/plane. Therefore, the minimum norm solution must be orthogonal to every vector in the null space.

This leads to a profound conclusion. Any vector in $\mathbb{R}^n$ can be uniquely split into two parts: a part that lies in the ​​row space​​ of matrix $A$ (the space spanned by its row vectors) and a part that lies in its null space. These two spaces are orthogonal complements. Our minimum norm solution, by having to be orthogonal to the null space, must be the solution that lies entirely within the row space of $A$. It has no "ghost" component from the null space.

This insight gives us a direct recipe to find this special solution. The minimum norm solution, often denoted $x^+$, is given by the formula:

$$x^+ = A^T (A A^T)^{-1} b$$

This formula, which uses what is known as the ​​Moore-Penrose pseudoinverse​​ of $A$, might look intimidating, but it's just a machine for finding the one solution that lives in the row space of $A$. This approach is incredibly powerful in applications like control theory, robotics, and fitting models to data, where we want the smoothest or most efficient solution.
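A sketch of this recipe in NumPy, reusing the two-equation example from earlier (whose null-space direction is $(-1, -1, 1)$), shows the formula, the pseudoinverse, and the geometry all agreeing:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])

# Minimum-norm solution via the explicit formula x+ = A^T (A A^T)^{-1} b
x_plus = A.T @ np.linalg.solve(A @ A.T, b)

# Same answer from the Moore-Penrose pseudoinverse
assert np.allclose(x_plus, np.linalg.pinv(A) @ b)

# It solves the system...
assert np.allclose(A @ x_plus, b)

# ...and is orthogonal to the null-space direction (-1, -1, 1)
x_h = np.array([-1.0, -1.0, 1.0])
assert np.isclose(x_plus @ x_h, 0.0)

# No other point x_plus + t*x_h on the solution line is shorter
for t in (-1.0, 0.1, 2.0):
    assert np.linalg.norm(x_plus) <= np.linalg.norm(x_plus + t * x_h)
```

For this system the result is $x^+ = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{2}{3})$: the one point on the solution line that lies entirely in the row space of $A$.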

The Simplest Story: The Principle of Sparsity

What if our idea of a "best" solution is not about being small, but about being simple? Imagine you are a detective trying to explain a crime. A theory that involves a single culprit is simpler than one involving a complex conspiracy of ten people. This is Occam's Razor: prefer the simplest explanation that fits the facts.

In many modern problems, from medical imaging to machine learning, we believe the underlying signal or model is ​​sparse​​—meaning most of its components are zero. For instance, a brain scan might show activity in only a few localized regions. The true vector of neural activity is mostly zeros. When our measurements give us an underdetermined system, we want to recover the solution that has the most zeros.

Directly counting non-zero elements is computationally hard. So, we use a clever proxy: the ​​$\ell_1$-norm​​, defined as $\|x\|_1 = |x_1| + |x_2| + \dots + |x_n|$. The procedure of minimizing the $\ell_1$-norm subject to the constraint $Ax = b$ is called ​​Basis Pursuit​​. It turns out that this method has a stunning tendency to produce solutions with many zero entries.

A Tale of Two Norms: The Face-Off

Let's see this in action with a simple one-equation, two-variable system: $2x_1 + x_2 = 4$. The solutions form a line in the $x_1$-$x_2$ plane. Which point on this line is "best"?

  1. ​​The L2 (Minimum Norm) Approach:​​ If we seek the solution that minimizes $\|x\|_2^2 = x_1^2 + x_2^2$, we are looking for the point on the line $2x_1 + x_2 = 4$ closest to the origin. The solution is found to be $(\frac{8}{5}, \frac{4}{5})$. Notice that both components are non-zero. The $\ell_2$-norm acts democratically, spreading the "burden" across all components to keep them all small. This is typical of ​​Tikhonov regularization​​, which gives smooth, dense solutions.

  2. ​​The L1 (Sparsity) Approach:​​ If we seek the solution that minimizes $\|x\|_1 = |x_1| + |x_2|$, the geometry is different. The "circles" of constant $\ell_1$-norm are diamonds centered at the origin. To find the solution, we expand this diamond until it just touches the solution line. Because the diamond has sharp corners lying on the axes, this first touch is very likely to happen at one of these corners. For our line, the minimum $\ell_1$-norm is achieved at the point $(2, 0)$, where $x_2$ is exactly zero. This is the essence of why ​​LASSO​​ and other $\ell_1$-based methods produce sparse results. They favor solutions where as many components as possible are pushed all the way to zero.
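Both answers can be computed directly. The sketch below (assuming NumPy and SciPy are available) gets the $\ell_2$ solution from the pseudoinverse and casts the $\ell_1$ problem as a small linear program using the standard split $x = u - v$ with $u, v \ge 0$, so that $|x_i| = u_i + v_i$:

```python
import numpy as np
from scipy.optimize import linprog

# One equation, two unknowns: 2*x1 + x2 = 4
A = np.array([[2.0, 1.0]])
b = np.array([4.0])

# L2: minimum Euclidean norm solution via the pseudoinverse
x_l2 = np.linalg.pinv(A) @ b          # both components non-zero

# L1: basis pursuit as a linear program. With x = u - v and u, v >= 0,
# the objective sum(u) + sum(v) equals the l1-norm of x.
n = A.shape[1]
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n))
x_l1 = res.x[:n] - res.x[n:]          # one component driven exactly to zero

assert np.allclose(x_l2, [8/5, 4/5])
assert np.allclose(x_l1, [2.0, 0.0], atol=1e-8)
```

The outputs match the geometry described above: the $\ell_2$ ball first touches the line at the dense point $(\tfrac{8}{5}, \tfrac{4}{5})$, while the $\ell_1$ diamond first touches it at the sparse corner $(2, 0)$.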

So we have two very different philosophies for navigating infinity. The minimum $\ell_2$-norm seeks a balanced, smooth, and "low-energy" solution. The minimum $\ell_1$-norm seeks a sparse, simple, "parsimonious" solution. The choice between them is not about which one is mathematically superior, but about what we believe is the nature of the solution we are looking for. Is it a smooth field, or a collection of isolated points? The beauty of linear algebra is that it provides us with the precise tools to find whichever one we desire.

Applications and Interdisciplinary Connections

After our journey through the principles of underdetermined systems, you might be left with a curious feeling. We have a beautiful mathematical structure, a whole space of infinite solutions, but what is its use? A situation with too many answers feels less like a solution and more like a puzzle. But it is precisely in this ambiguity, in this freedom of choice, that the true power and beauty of this concept lie. Nature, engineering, and even our economic systems are filled with situations where we have more unknowns than we have firm rules. An underdetermined system is not a failure of mathematics; it is an honest description of our world.

The great physicist Richard Feynman once said, "The game I play is a very interesting one. It's imagination in a tight straitjacket." The "tight straitjacket" is the set of equations that describe what we know—our measurements, our observations. The "imagination" is how we choose from the infinite possibilities that fit within that straitjacket. The art and science of applying underdetermined systems is about choosing a principle, a philosophy, to guide that choice. Let's explore some of these guiding principles and see where they take us.

The Principle of Minimum Energy: The Universe's Laziness

Perhaps the most intuitive principle we can apply is one of efficiency. If a system can satisfy our constraints in many ways, which way is the "simplest"? One beautiful definition of simplicity is to use the least amount of effort. In the language of vectors, "effort" or "energy" can be measured by the length of the solution vector, its Euclidean norm $\|\mathbf{x}\|_2$. Finding the solution with the minimum Euclidean norm is like finding the point on a line or plane that is closest to the origin. It is the most compact, the most centered, the "lowest energy" solution.

For any consistent underdetermined system $A\mathbf{x} = \mathbf{b}$, there exists a unique solution that is shorter than all the others. This "minimum-norm" solution isn't just a mathematical curiosity; it often corresponds to a physically meaningful state. Think of a flexible wire held in a certain shape; its default position will often be the one that minimizes its total potential energy. This principle gives us a definite, unique answer where there was once an infinitude. It's a beautifully direct way to resolve ambiguity.

But what if "shortest" isn't what we mean by "simplest"? What if our intuition about simplicity points in a different direction?

The Principle of Sparsity: Nature's Occam's Razor

Consider a different kind of simplicity, one championed by the philosopher William of Ockham: "Entities should not be multiplied without necessity." In modern terms, the simplest explanation is often the best. What would this mean for a vector solution? It might mean that the underlying phenomenon is caused by only a few significant factors, not a little bit of everything. The simplest solution, in this view, is the one with the most zeros. This is the principle of ​​sparsity​​.

How can we mathematically hunt for a solution with the most zeros? It turns out that minimizing the Euclidean norm ($L_2$ norm) does the opposite—it tends to spread the "energy" out, giving small non-zero values to many components. A different ruler is needed. This ruler is the ​​$L_1$ norm​​, defined as the sum of the absolute values of the components, $\|\mathbf{x}\|_1 = \sum_i |x_i|$.

Minimizing the $L_1$ norm is a kind of magic. It has an uncanny ability to produce solutions where most components are exactly zero. This idea, known as ​​basis pursuit​​, is the engine behind one of the most significant technological revolutions of the last two decades: ​​compressed sensing​​.

Imagine taking a CT scan. The goal is to reconstruct a detailed 3D image of a human body (a vector $\mathbf{x}$ with millions of pixel values) from a series of X-ray projections (a much smaller measurement vector $\mathbf{b}$). This is a massively underdetermined problem. If we ask for the minimum $L_2$ norm solution, we get a blurry, fuzzy image. The algorithm spreads the information it has across all the pixels. But if we assume that the image is mostly composed of large regions of uniform tissue (bone, muscle, air), then the difference between adjacent pixels should be mostly zero. The image is "sparse in its gradient." By asking the machine to find the solution that satisfies the measurements and has the minimum $L_1$ norm, we are telling it: "Find me the sharpest possible image that could have produced this data." The results are astonishingly clear. This very principle allows MRI machines to scan much faster and CT scanners to use lower radiation doses, since far fewer measurements suffice to construct a high-quality image.
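A toy version of this recovery can be simulated in a few lines (a sketch, not an imaging pipeline; the dimensions and random seed are arbitrary illustrations). A signal with only three non-zero entries, measured by a random underdetermined matrix, is recovered by $L_1$ minimization, while the minimum-$L_2$ solution smears the energy across every component:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Toy compressed-sensing setup: 30 random measurements of a length-60
# signal that has only 3 non-zero entries.
m, n, k = 30, 60, 3
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

# Minimum L2 norm: dense, "blurry" -- energy spread over all 60 components
x_l2 = np.linalg.pinv(A) @ b

# Minimum L1 norm (basis pursuit), as a linear program via x = u - v
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n))
x_l1 = res.x[:n] - res.x[n:]

# L1 recovers the sparse signal; L2 does not
assert np.allclose(x_l1, x_true, atol=1e-4)
assert np.count_nonzero(np.abs(x_l2) > 1e-6) > k
```

With these dimensions the measurements number only half the unknowns, yet the sparse ground truth comes back essentially exactly: this is the underdetermined system turned from a liability into a feature.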

Beyond Simple Norms: Incorporating Deeper Knowledge

The choice between $L_2$ (minimum energy) and $L_1$ (sparsity) is a choice about the fundamental nature of the signal we are looking for. But we can be even more sophisticated. In statistics and machine learning, we often have prior beliefs about our unknowns. Perhaps we expect some variables to be larger than others, or we know there are correlations between them.

We can encode this prior knowledge into a custom-made metric. Instead of minimizing the simple sum of squares $x_1^2 + x_2^2 + \dots$, we can minimize a weighted sum, like $\mathbf{x}^T S \mathbf{x}$, where the matrix $S$ contains our knowledge about the expected variances and covariances of the components of $\mathbf{x}$. This is related to the ​​Mahalanobis distance​​. Finding the solution that minimizes this quantity is equivalent to finding the "most probable" solution given our prior statistical model. This powerful technique bridges the gap between pure linear algebra and the nuanced world of statistical inference, finding applications everywhere from portfolio optimization to weather forecasting.
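For a symmetric positive-definite weight matrix $S$, this weighted problem still has a closed form: by the same Lagrange-multiplier argument as for the ordinary minimum-norm solution, minimizing $\mathbf{x}^T S \mathbf{x}$ subject to $A\mathbf{x} = \mathbf{b}$ gives $\mathbf{x}^* = S^{-1} A^T (A S^{-1} A^T)^{-1} \mathbf{b}$. A small sketch, with a hypothetical weight matrix chosen purely for illustration:

```python
import numpy as np

# Minimize x^T S x subject to Ax = b, with S symmetric positive definite.
# Closed form: x* = S^{-1} A^T (A S^{-1} A^T)^{-1} b.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])

# Hypothetical prior: x3 is "cheap" (small weight = large expected variance)
S = np.diag([1.0, 1.0, 0.1])

S_inv = np.linalg.inv(S)
x_star = S_inv @ A.T @ np.linalg.solve(A @ S_inv @ A.T, b)
assert np.allclose(A @ x_star, b)     # still solves the system

# With S = I this reduces to the ordinary minimum-norm solution
x_id = A.T @ np.linalg.solve(A @ A.T, b)
assert np.allclose(x_id, np.linalg.pinv(A) @ b)

# The weighted solution leans on the "cheap" component x3 far more
assert abs(x_star[2]) > abs(x_id[2])
```

The prior does exactly what we asked: the component we declared inexpensive absorbs most of the solution, while the costly components shrink.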

From Infinite Solutions to Price Intervals: The Economics of Ambiguity

The implications of underdetermined systems are not confined to the physical sciences. Consider the world of finance. The fundamental theorem of asset pricing states that in a market with no arbitrage (no risk-free profit), there must exist a set of "state prices" that can price all assets. If we have more possible future states of the world than we have traded assets, the market is called ​​incomplete​​.

When we set up the equations to find these state prices, what do we get? An underdetermined system! There is no single, unique set of state prices. Instead, there is an entire family of valid price systems that are consistent with the observed prices of traded assets.

What does this mean for a new, exotic financial derivative that we wish to price? It means there is no single "correct" price. Instead, there is a range of possible arbitrage-free prices, an interval corresponding to the different possible solutions for the state-price vector. The ambiguity of the underdetermined system translates directly into financial reality: the bid-ask spread on a new product reflects, in part, this fundamental uncertainty. The mathematics doesn't give a single answer because the market itself hasn't provided enough information to do so.

The Edge of Ambiguity: Ill-Conditioning

Finally, let us consider the fascinating gray area between a uniquely determined system and an underdetermined one. Imagine you are performing an experiment to determine two quantities, but your two measurements are almost identical—for example, two overlapping peaks in an X-ray diffraction pattern from a crystal. Mathematically, the columns of your system matrix $A$ are nearly linearly dependent. The matrix is invertible, so technically a unique solution exists. But the matrix is ​​ill-conditioned​​.

An ill-conditioned system behaves like a pathological cousin of an underdetermined one. Its condition number, a measure of how much errors in the input are magnified in the output, becomes enormous. A tiny amount of measurement noise—inevitable in any real experiment—can cause the calculated solution to swing wildly and become meaningless. The system is telling us something profound: although you have two equations for two unknowns, you don't have two independent pieces of information. Your experiment is poorly designed to distinguish between the two effects you're trying to measure. Here, the challenge is not to choose a solution from an infinite set, but to recognize that the unique solution you have is built on a foundation of sand.
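The effect is easy to demonstrate numerically (a sketch; the $10^{-6}$ gap between the two equations is chosen for illustration). Two nearly parallel equations admit a unique solution, yet a perturbation of one part in ten thousand in the data shifts that solution by two orders of magnitude:

```python
import numpy as np

# Two nearly identical equations: the rows are almost linearly dependent.
eps = 1e-6
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + eps]])
b = np.array([2.0, 2.0])

# Technically invertible, but the condition number is enormous
cond = np.linalg.cond(A)
assert cond > 1e6

x_clean = np.linalg.solve(A, b)          # approximately (2, 0) for clean data

# A tiny amount of measurement noise on one equation...
b_noisy = b + np.array([0.0, 1e-4])
x_noisy = np.linalg.solve(A, b_noisy)

# ...swings the "unique" solution wildly (roughly to (-98, 100))
assert np.linalg.norm(x_noisy - x_clean) > 10
```

The condition number here exceeds a million, so errors in the input can be magnified roughly a millionfold in the output: the unique answer exists, but it is built on a foundation of sand.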

A Framework for Discovery

From reconstructing images inside our bodies to pricing financial instruments and understanding the limits of scientific measurement, underdetermined systems are everywhere. They represent a fundamental truth: our data, our measurements, are often just shadows of a more complex reality. The space of solutions is not a problem to be solved, but a landscape to be explored. By choosing a guiding principle—minimum energy, maximum sparsity, statistical likelihood—we inject our own hypothesis about the world into the mathematics. In doing so, we turn ambiguity into insight and transform an infinity of possibilities into a single, powerful story.