
Nonlinear Observability

Key Takeaways
  • Nonlinear observability is the principle of determining a system's unique internal state by observing its external outputs over time, a task complicated by nonlinear dynamics.
  • System dynamics and control inputs can create "virtual sensors" through Lie derivatives, revealing information that a single static measurement cannot provide.
  • The distinction between local and global observability is critical, as methods like the Extended Kalman Filter (EKF) that rely on local linearization can fail dramatically if global ambiguities are present.
  • Observability theory extends to parameter estimation (structural identifiability), providing a crucial tool to determine if unknown model parameters can be identified from experimental data.

Introduction

How can we understand the complete internal state of a complex system when our view is limited to a few external measurements? This fundamental question is at the heart of countless challenges in science and engineering, from tracking a satellite to understanding a biological process. The formal framework for answering this question is known as observability. While straightforward for linear systems, the introduction of nonlinearities creates profound challenges, such as ambiguities, hidden symmetries, and state-dependent information content. Simply linearizing the system can be dangerously misleading, creating a knowledge gap between what our simplified models tell us and how the real world behaves.

This article provides a guide to the essential concepts of nonlinear observability, bridging theory and practice. First, in "Principles and Mechanisms," we will explore the core mathematical machinery, defining local and global observability, introducing Lie derivatives as "virtual sensors," and establishing the powerful Observability Rank Condition. We will also examine the common pitfalls of oversimplification, such as the divergence of the Extended Kalman Filter. Subsequently, in "Applications and Interdisciplinary Connections," we will see these principles in action, revealing how observability analysis is used to understand limitations in robotics, design motion strategies to reveal hidden states, and determine the identifiability of parameters in scientific models across diverse fields.

Principles and Mechanisms

Imagine you are standing outside a completely sealed room. You cannot see inside, but you can hear sounds from within. Your task is to figure out exactly what is happening inside—the positions and velocities of all the objects—just by listening. This is the essence of observability. The state of the system, $x$, is what's happening inside the room. The output, $y$, is the sound you can measure. The laws of physics governing the room are the system equations, $\dot{x} = f(x, u)$ and $y = h(x)$. The grand question of observability is: by knowing the rules of the game ($f$ and $h$) and listening to the output ($y(t)$), can we paint a complete and unambiguous picture of the initial state of the room, $x(0)$?

A Static Snapshot: The Limits of a Single Measurement

Before we consider things in motion, let's start with the simplest possible case: a single, static measurement, $y = h(x)$. Can we find $x$? This is simply a question of whether the function $h$ is invertible. Often, it is not.

Consider one of the simplest nonlinearities imaginable: $h(x) = x^2$. If we measure the output to be $y = 4$, can we determine the state $x$? Of course not. The state could be $x = 2$ or $x = -2$. We can't distinguish between a state and its negative counterpart. In the language of control theory, this system is not globally observable. There are distinct initial states that produce identical outputs, making them indistinguishable.

However, what if we have some prior information? Suppose a helpful colleague whispers, "I promise you, the state is positive." Now, if you measure $y = 4$, you know with certainty that $x = 2$. By restricting our attention to a smaller region of the state space (the positive real numbers), the function $h(x) = x^2$ becomes one-to-one. This is the idea behind local observability. A system is locally observable at a point if, within a small enough neighborhood of that point, all distinct states are distinguishable. This distinction between the local and global picture is not a mere technicality; as we will see, it is at the heart of why some estimation methods for nonlinear systems fail spectacularly.
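A quick numerical sketch of this sign ambiguity and how a prior resolves it (the function names `h` and `invert_h_positive` are ours, invented for illustration):

```python
import math

def h(x):
    """Quadratic sensor: the measurement reveals only the magnitude of x."""
    return x ** 2

# Two distinct states yield the same output: no global observability.
assert h(2.0) == h(-2.0) == 4.0

def invert_h_positive(y):
    """Invert y = x^2 under the prior knowledge that the state is positive."""
    return math.sqrt(y)

# With the sign ambiguity removed, the state is recovered exactly.
print(invert_h_positive(4.0))  # 2.0
```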

Another real-world example of this is a sensor that saturates. Imagine a sensor that measures a state $x_1$, but it has a physical limit: it can't read values greater than 1 or less than -1. Its output is $y = \operatorname{sat}_1(x_1)$. If the true state $x_1$ is, say, 2, the sensor simply reports $y = 1$. If the state is 3, the sensor still reports $y = 1$. In this "saturated" region, the output is constant. It provides no information whatsoever about the true state, as long as it's beyond the saturation limit. The system is fundamentally unobservable in these regions, creating blind spots from which no information can escape.
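The saturation blind spot is just as easy to check numerically (a minimal sketch; `sat` is our name for the clipping sensor):

```python
import numpy as np

def sat(x, limit=1.0):
    """Saturating sensor: readings are clipped to [-limit, limit]."""
    return float(np.clip(x, -limit, limit))

# Within range the sensor distinguishes states...
assert sat(0.5) != sat(0.6)
# ...but past saturation, every state maps to the same reading.
assert sat(2.0) == sat(3.0) == 1.0
```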

Dynamics to the Rescue: Information in Motion

A single snapshot in time can be ambiguous. But what if we observe the output as it evolves? The dynamics of the system might just give us the extra clues we need.

Let's return to our simple output $y = h(x)$. The output itself might not be enough, but what about its rate of change, $\dot{y}$? Using the chain rule, we can find its value at the very instant we start observing ($t = 0$):

$$\dot{y}(0) = \frac{d}{dt}h(x(t))\bigg|_{t=0} = \nabla h(x(0)) \cdot \dot{x}(0)$$

And since we know the rules of the game, $\dot{x} = f(x)$, we can write:

$$\dot{y}(0) = \nabla h(x(0)) \cdot f(x(0))$$

This beautiful expression, the rate of change of the output along the system's natural flow, is called the Lie derivative of $h$ along the vector field $f$, denoted $L_f h(x)$. It acts as a "virtual sensor." We only measure $h(x)$ directly, but by observing its evolution, we gain access to $L_f h(x)$ as well.

Consider a simple pendulum, where $x_1$ is the angle and $x_2$ is the angular velocity. The dynamics are $\dot{x}_1 = x_2$ and $\dot{x}_2 = -\sin(x_1)$. Suppose our sensor can only measure the angle: $y = h(x) = x_1$. From a single measurement, we know the angle but not the velocity. But if we watch the measurement change, we can compute its rate of change, $\dot{y} = \dot{x}_1$. The system dynamics tell us that $\dot{x}_1 = x_2$. So, by measuring $y$ and calculating its time derivative, we have found $x_2$! We have observed the unmeasured state. In the language of Lie derivatives, $L_f h(x) = \nabla h(x) \cdot f(x) = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} x_2 \\ -\sin(x_1) \end{pmatrix} = x_2$. The dynamics have made the unobservable observable.
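This calculation is easy to mechanize with a computer algebra system. A sketch using SymPy (variable and helper names are ours):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
state = sp.Matrix([x1, x2])

# Pendulum dynamics xdot = f(x) and angle-only measurement y = h(x).
f = sp.Matrix([x2, -sp.sin(x1)])
h = x1

# Lie derivative L_f h = grad(h) . f: the "virtual sensor".
grad_h = sp.Matrix([[sp.diff(h, v) for v in state]])
L_f_h = (grad_h * f)[0, 0]
print(L_f_h)  # x2
```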

A Systematic Test: The Observability Rank Condition

This idea of creating virtual sensors can be continued. We have $h(x)$, and we have $L_f h(x)$. What about the second time derivative of the output, $\ddot{y}(0)$? It's just the Lie derivative of our first virtual sensor: $L_f(L_f h(x))$, which we write as $L_f^2 h(x)$. We can generate a whole sequence of functions: $h, L_f h, L_f^2 h, \dots$. Each of these functions gives us a potential piece of information about the initial state $x(0)$.

We can stack these pieces of information into a single vector map, let's call it the observability map, $\mathcal{O}_{\mathrm{map}}(x) = (h(x), L_f h(x), \dots, L_f^{n-1} h(x))^\top$. The system is locally observable if this map is locally one-to-one. A powerful sufficient condition for this, known as the Observability Rank Condition (ORC), is that the Jacobian matrix of this map must have full rank (i.e., rank equal to the dimension of the state space, $n$). This Jacobian, whose rows are the gradients of our virtual sensor functions, is the celebrated observability matrix.

For the simple pendulum, the observability map is $(x_1, x_2)^\top$. Its Jacobian is simply the identity matrix, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, which has rank 2 everywhere. This is the best possible result, telling us the system is perfectly observable. For more complex systems, the calculation can be more involved, but the principle remains the same: we are checking if the information from the output and its time derivatives is rich enough to uniquely pin down every component of the state vector.
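The rank test itself can be automated. A SymPy sketch for the pendulum (the helper `lie` is our name for one application of the Lie derivative):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
state = sp.Matrix([x1, x2])
f = sp.Matrix([x2, -sp.sin(x1)])   # pendulum dynamics
h = sp.Matrix([x1])                # angle-only output

def lie(hvec, field):
    """One Lie derivative of each entry of hvec along the vector field."""
    return hvec.jacobian(state) * field

# Observability map (h, L_f h) and its Jacobian, the observability matrix.
obs_map = sp.Matrix.vstack(h, lie(h, f))
O = obs_map.jacobian(state)
print(O)         # Matrix([[1, 0], [0, 1]])
print(O.rank())  # 2: full rank, so the ORC holds everywhere
```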

The Power of Prodding: Control Inputs and Observability

What if we can do more than just listen? What if we can "poke" or "prod" the system with a known input, $u$? The dynamics become $\dot{x} = f(x) + g(x)u$. This input gives us another tool. We can now see how the output changes not only along the natural dynamics $f$, but also along the direction we can push, $g$. This means we can compute Lie derivatives along both vector fields, expanding our set of virtual sensors to include functions like $L_g h$, $L_f L_g h$, and so on. An otherwise unobservable system can sometimes be made observable by applying a clever sequence of inputs.

This introduces a subtle but crucial distinction between weak and strong observability. A system is locally weakly observable if for any two nearby, distinct states, we can find some special input $u(t)$ that will make their outputs differ. It's a statement of possibility. A system is locally strongly observable if any admissible input $u(t)$ will distinguish between any two nearby states. It's a much more robust property. Many systems are only weakly observable: there may be a "bad" input (like doing nothing, $u = 0$) that hides the difference between two states, but a "good" input (like giving the system a nudge) can reveal it.
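To make this concrete, consider a hypothetical two-state system of our own construction, $\dot{x} = f(x) + g(x)u$ with no drift ($f = 0$), input field $g = (x_2, 0)^\top$, and output $y = x_1$. With $u = 0$ the second state is invisible; any nonzero input exposes it. A SymPy sketch:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
state = sp.Matrix([x1, x2])

f = sp.Matrix([0, 0])      # no drift
g = sp.Matrix([x2, 0])     # input vector field
h = sp.Matrix([x1])        # output

def lie(hvec, field):
    return hvec.jacobian(state) * field

# With u = 0 only Lie derivatives along f exist: rank 1, x2 stays hidden.
O_drift = sp.Matrix.vstack(h, lie(h, f)).jacobian(state)
print(O_drift.rank())  # 1

# A nonzero input adds L_g h = x2 as a virtual sensor: rank 2, x2 revealed.
O_input = sp.Matrix.vstack(h, lie(h, g)).jacobian(state)
print(O_input.rank())  # 2
```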

Observability in a World of Noise

So far, we have lived in a perfect, deterministic world. Real measurements, however, are always corrupted by noise. Does the concept of observability survive in this messy, stochastic reality?

It does, but it evolves. The question is no longer about distinguishing two exact initial states, but about distinguishing two different initial probability distributions. Suppose we don't know the exact starting state, but we have two competing hypotheses about it, described by probability distributions $\mu_A$ and $\mu_B$. The system is observable if, by looking at the statistics of the noisy output trajectories, we can tell whether the system started from distribution $\mu_A$ or $\mu_B$. Instead of comparing two definite output paths, $y_A(t)$ and $y_B(t)$, we must compare the entire probability laws of the output processes they generate. This beautiful generalization shows the deep conceptual unity of observability, providing a bridge from the clean world of deterministic dynamics to the fuzzy reality of stochastic processes.
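The quadratic sensor from earlier gives a stark stochastic example: take $\mu_A$ concentrated near $+2$ and $\mu_B$ near $-2$. The two hypotheses induce the same output law through $y = x^2 + v$, so no statistic of the noisy outputs can separate them. A Monte Carlo sketch (parameters chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Two hypotheses for the initial state of the quadratic-sensor system.
xA = rng.normal(+2.0, 0.1, n)   # mu_A: concentrated near +2
xB = rng.normal(-2.0, 0.1, n)   # mu_B: concentrated near -2

# Noisy outputs y = x^2 + v generated under each hypothesis.
yA = xA ** 2 + rng.normal(0.0, 0.05, n)
yB = xB ** 2 + rng.normal(0.0, 0.05, n)

# The two empirical output distributions coincide (up to sampling error),
# so the hypotheses are statistically indistinguishable from the output.
print(abs(yA.mean() - yB.mean()) < 0.02)  # True
print(abs(yA.std() - yB.std()) < 0.02)    # True
```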

A Cautionary Tale: The Dangers of Linearization

In engineering, a common approach to taming a nonlinear beast is to approximate it as a linear system around a best guess of the state. This is the core idea behind the workhorse of nonlinear estimation, the ​​Extended Kalman Filter (EKF)​​. The EKF's entire worldview is based on this local linear picture. But what happens when the local picture is a lie?

Let's revisit our simple system with a quadratic sensor, $y_k = x_k^2 + v_k$, where $v_k$ is noise. As we know, this system is not globally observable due to the sign ambiguity. However, if we linearize it around any non-zero guess, say $\hat{x} = 2$, the linearized model looks perfectly observable. The EKF, trusting this linearization, believes that its measurements are highly informative. It becomes overconfident.

This overconfidence can be fatal. Imagine the true state is $x = -2.1$, but our initial guess is $\hat{x} = +2.0$. A new measurement comes in. The EKF, believing the system is locally linear and observable, computes a large "Kalman gain," meaning it trusts the new measurement a great deal. It aggressively updates its estimate, pulling it even further away from the true state, reinforcing its mistaken belief in the positive sign. The filter diverges, with its estimate confidently marching off in the wrong direction.
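A small simulation illustrates the failure mode: the filter locks confidently onto the mirror image of the truth and can never flip the sign, because its local linearization always points along the wrong branch. This is a sketch with parameters chosen for illustration, not a production filter:

```python
import numpy as np

rng = np.random.default_rng(42)

# Static scalar state observed through y = x^2 + v, with v ~ N(0, R).
x_true = -2.1            # truth lies on the negative branch
x_hat, P = 2.0, 1.0      # initial guess on the wrong branch
R = 0.01

for _ in range(50):
    y = x_true ** 2 + rng.normal(0.0, np.sqrt(R))
    H = 2.0 * x_hat                  # Jacobian of h(x) = x^2 at the estimate
    K = P * H / (H * P * H + R)      # Kalman gain
    x_hat += K * (y - x_hat ** 2)    # EKF measurement update
    P *= (1.0 - K * H)               # covariance update: collapses quickly

# The estimate settles near +2.1: confidently wrong about the sign.
print(x_hat)
```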

This is a profound lesson. The local observability provided by linearization can be dangerously misleading. It highlights the difference between a mathematical property holding at a single point and the robust behavior of a real-world filter. It shows that we must respect the true nonlinear nature of our systems, motivating the development of more advanced filters (like the Unscented Kalman Filter) that can handle these global ambiguities more gracefully. The journey of observability is not just a mathematical curiosity; it is a vital guide to navigating the complexities of the unseen world.

Applications and Interdisciplinary Connections

So, we have journeyed through the intricate mathematics of nonlinear observability, armed with Lie derivatives and rank conditions. But what is it all for? Does this abstract machinery actually connect to the world we see, build, and try to understand? The answer, perhaps surprisingly, is a resounding yes. The principles of observability are not confined to the blackboard; they are a universal lens through which we can understand the limits and possibilities of inference in a vast array of fields. It is, in essence, the science of seeing the unseen.

When Symmetries Deceive Us: From Simple Geometry to Navigating Worlds

Let's start with a very simple picture. Imagine a particle moving around on a plane, and all you have is a sensor at the origin that tells you the square of its distance, $y = x_1^2 + x_2^2$. Now, if the sensor reads '25', you know the particle is on a circle of radius 5. But where? Is it at the state $(3, 4)$, or perhaps $(-3, -4)$? From the measurement alone, you can't tell. These two points, and in fact any pair of antipodal points $\mathbf{x}$ and $-\mathbf{x}$, will produce the exact same history of distance measurements. The system has a fundamental, global ambiguity due to its symmetry with respect to the origin. It is not globally observable.
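Here is a sketch of that symmetry in action, assuming (for illustration) circular motion $\dot{p} = (-p_2, p_1)$, which is an odd vector field and therefore preserves the antipodal symmetry:

```python
import numpy as np

def range_sq(p):
    """Range-squared sensor at the origin."""
    return p[0] ** 2 + p[1] ** 2

def step(p, dt=0.01):
    """One Euler step of circular motion pdot = (-p2, p1).
    The field is odd, so step(-p) = -step(p)."""
    return np.array([p[0] - dt * p[1], p[1] + dt * p[0]])

p, q = np.array([3.0, 4.0]), np.array([-3.0, -4.0])
for _ in range(500):
    # Antipodal states produce identical readings at every instant.
    assert np.isclose(range_sq(p), range_sq(q))
    p, q = step(p), step(q)
print("indistinguishable along the whole trajectory")
```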

This simple idea blossoms into a profound challenge in the world of robotics. Consider a robot tasked with building a map of its surroundings while simultaneously keeping track of its own position—a famous problem known as Simultaneous Localization and Mapping, or SLAM. The robot’s sensors measure its motion and the relative positions of landmarks. But think about it: if you take the robot's entire calculated map, along with its own position within it, and shift the whole thing one meter to the east, do any of the relative measurements change? No. What if you rotate the entire map and the robot's heading by ten degrees? Again, nothing the robot measures about its immediate environment will change.

This means there is a "subspace" of the total state (robot position, orientation, and all landmark positions) that is fundamentally unobservable. This unobservable subspace corresponds precisely to the absolute position and orientation of the global reference frame. The robot can build a perfect, consistent local map, but it can never know its absolute coordinates in the universe just from its own measurements. This isn't a failure of the algorithm; it's a fundamental truth revealed by observability analysis.

The Power of Motion: Shaking the Box to See What's Inside

So, some things are fundamentally hidden by symmetry. But what if a state is only hidden under certain conditions? This is where things get truly interesting. Sometimes, to see what's hidden, you need to shake the system a little.

Imagine an autonomous vehicle that has a slight, constant sideways drift, a lateral slip velocity $v_s$, perhaps due to a misaligned wheel or a steady crosswind. If the robot is commanded to move in a perfectly straight line, its actual path will be a straight line, but at a slight angle to its intended path. An observer measuring the robot's position could easily mistake this effect for a simple error in the robot's initial heading. The effect of the lateral slip and the effect of a heading error are indistinguishable—the system is unobservable.

But now, command the robot to turn. As the robot's orientation $\theta(t)$ changes, the constant lateral slip velocity (which is always perpendicular to the robot's body) gets projected onto the global $x$ and $y$ axes in a continuously changing, sinusoidal way. This creates a unique signature in the position trajectory—a curve that cannot be explained by a simple heading error. The act of turning "modulates" the hidden state, making it visible to the observer. The state $v_s$ becomes observable the moment the robot's angular velocity is non-zero. A similar, though more subtle, analysis reveals that a constant bias in a vehicle's gyroscope sensor can become observable simply by moving forward with a known velocity, without even needing to turn.

This principle—that observability can depend on the system's trajectory—is crucial. Consider a simple pendulum. If you can only measure its potential energy, which depends on $\cos(\theta)$, can you figure out both its angle $\theta$ and angular velocity $\omega$? Mostly, yes. But right at the bottom ($\theta = 0$) or the very top ($\theta = \pi$) of its swing, the potential energy is momentarily insensitive to small changes in angle (since the derivative of $\cos(\theta)$, which is $-\sin(\theta)$, is zero at these points). At these precise instants, the system becomes locally unobservable. An Extended Kalman Filter trying to track the pendulum would find it has no information about the angle from the measurement at that moment, which can lead to estimation errors if not handled carefully. Motion makes things visible, but sometimes, at specific points in that motion, we can go temporarily blind.
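The first-order rank test makes this blindness visible. A SymPy sketch using the map $(h, L_f h)$ for the energy-like output $h = \cos(\theta)$ (note this checks only the first two Lie derivatives, which is exactly the information a linearization-based filter sees at that instant):

```python
import sympy as sp

theta, omega = sp.symbols('theta omega')
state = sp.Matrix([theta, omega])

f = sp.Matrix([omega, -sp.sin(theta)])  # pendulum dynamics
h = sp.Matrix([sp.cos(theta)])          # energy-like output

# First-order observability map (h, L_f h) and its Jacobian.
obs_map = sp.Matrix.vstack(h, h.jacobian(state) * f)
O = obs_map.jacobian(state)

# Generic point of the swing: full rank, locally observable.
print(O.subs({theta: sp.pi / 4, omega: 1}).rank())  # 2
# Bottom of the swing: grad cos(theta) vanishes and the rank drops.
print(O.subs({theta: 0, omega: 1}).rank())  # 1
```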

The Grand Unification: Parameter Estimation as Scientific Discovery

Perhaps the most powerful application of observability is when we turn its lens from hidden states to hidden laws of nature. In science and engineering, we often build models based on physical principles, but the parameters of those models—things like reaction rates, thermal coefficients, or binding affinities—are unknown. The task of finding these parameters from experimental data is called system identification.

Observability theory provides a stunning insight: parameter estimation is just another state estimation problem. We can take an unknown, constant parameter and "augment" our system's state by adding a new state variable whose value is that parameter and whose derivative is zero. Then we can ask: is this augmented state observable?

Let's imagine trying to find the thermal dissipation coefficient $\lambda$ for a satellite component. We can measure its temperature $x(t)$ and control a heater that supplies power $P(t)$. If we leave the heater off ($P(t) = 0$) and the component just sits at ambient temperature, its temperature is constant. Will we ever be able to figure out $\lambda$? Of course not. The parameter $\lambda$ describes how quickly the component cools down, but if it's not heated up in the first place, this dynamic is never engaged. The parameter is unobservable.

But if we turn the heater on with a constant power $P_0 > 0$, the component heats up and settles at a new, higher equilibrium temperature. This steady-state temperature explicitly depends on $\lambda$. By measuring it, we can calculate $\lambda$. The input "excited" the system in a way that made the parameter's effect visible. Formally, the augmented system becomes locally observable around this new equilibrium point.
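The augmented-state rank test can be run symbolically. A sketch for a simplified cooling model $\dot{x} = -\lambda x + P$ (ambient temperature taken as zero for convenience; the model and names are ours), with $\lambda$ appended as a constant extra state:

```python
import sympy as sp

x, lam, P = sp.symbols('x lam P')
state = sp.Matrix([x, lam])   # temperature, augmented with the unknown lambda

# Augmented dynamics: xdot = -lam*x + P, lamdot = 0; measured output y = x.
f = sp.Matrix([-lam * x + P, 0])
h = sp.Matrix([x])

# Observability matrix from the map (h, L_f h).
O = sp.Matrix.vstack(h, h.jacobian(state) * f).jacobian(state)
print(O)  # Matrix([[1, 0], [-lam, -x]])

# Heater off, sitting at ambient (x = 0): rank 1, lambda is hidden.
print(O.subs({x: 0}).rank())  # 1
# Heated away from ambient (x != 0): rank 2, lambda becomes observable.
print(O.subs({x: 5}).rank())  # 2
```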

This powerful idea extends across the sciences. In systems biology, a researcher might have a model of a gene regulatory network (GRN) with dozens of unknown parameters for protein production and degradation rates. In chemical engineering, one might model a reaction network with unknown rate constants. In ecology, one might model the transport of a contaminant through a lake ecosystem with unknown transfer rates between water, sediment, and fish.

In all these cases, the question "Can I determine the parameters from the data I can collect?" is precisely a question of observability, often called ​​structural identifiability​​ in these fields. An analysis using the tools we've discussed—be it through transfer functions, input-output differential equations, or Lie derivatives—can tell the scientist, before a single experiment is run, whether their planned experiment is even capable of identifying the parameters of interest. If a parameter is structurally unidentifiable, the theory tells us that we must change the experiment: we either need to measure a different variable or perturb the system in a new way to make that parameter's influence felt.

Furthermore, this framework helps us distinguish between fundamental limits and practical ones. ​​Structural identifiability​​ is a property of the model equations under ideal, noise-free conditions. ​​Practical identifiability​​, on the other hand, deals with the reality of finite, noisy data. A parameter might be structurally identifiable in theory, but if its effect on the output is extremely small compared to the measurement noise, it will be practically impossible to estimate with any precision. Observability analysis provides the essential foundation, telling us what is possible in principle, while statistical analysis builds upon it to tell us what is feasible in practice. If the underlying system is structurally unobservable, no amount of data or statistical wizardry can recover the unobservable state or parameter. The theory provides a hard boundary on what is knowable.

From the simple geometry of a particle on a plane to the frontiers of mapping new worlds and uncovering the parameters of life itself, nonlinear observability provides a unified and profound language for reasoning about what we can know from what we can see.