
The natural and engineered world is dominated by nonlinear dynamics, which are notoriously difficult to analyze and control. While linear models offer simplicity and a rich theoretical toolbox, they often fail to capture the complex behavior of real-world systems. This presents a fundamental challenge: can we find a way to represent a nonlinear system in a linear framework without losing its essential characteristics? Koopman operator theory provides a remarkable affirmative answer, suggesting that by shifting our perspective from the state of a system to functions of the state (observables), we can uncover a hidden, infinite-dimensional linear structure.
However, this infinite-dimensional operator is computationally intractable. Extended Dynamic Mode Decomposition (EDMD) emerges as a powerful, data-driven method to create a finite-dimensional approximation of the Koopman operator, effectively building a practical bridge from nonlinear reality to linear analysis. This article explores the theory and practice of EDMD. The first chapter, "Principles and Mechanisms," delves into the core concepts, explaining how EDMD "lifts" data using a dictionary of observables to fit a linear model and discusses the critical challenges of dictionary selection and model regularization. Following this, the chapter on "Applications and Interdisciplinary Connections" showcases the method's versatility, demonstrating its use in solving complex problems in control, engineering, network science, and even computational biology.
The world we inhabit is a symphony of ceaseless change, governed by laws that are overwhelmingly nonlinear. The path of a planet, the swirl of cream in coffee, the flutter of a flag in the wind—these are all nonlinear phenomena. For centuries, scientists and engineers have grappled with this reality, because while linear systems are beautifully simple and solvable, nonlinear systems are notoriously difficult. So, we might ask: is there a way to find a hidden linearity within the chaos? Can we put on a special pair of glasses that makes a tangled, nonlinear world look straight and orderly? The answer, remarkably, is yes. This is the magic of the Koopman operator, and Extended Dynamic Mode Decomposition (EDMD) is the practical spell book for wielding its power.
Let’s imagine a complex dynamical system, say, the weather. The state of the system at any moment—the temperature, pressure, and wind velocity at every point in the atmosphere—can be described by a state vector $x$. The laws of physics dictate how this state evolves to the next moment, $x_{k+1} = F(x_k)$, where $F$ is a fantastically complicated nonlinear function. Trying to predict the long-term evolution of $x$ by repeatedly applying $F$ is the monumental task of weather forecasting.
The Koopman operator approach suggests a radical change in perspective. Instead of tracking the state itself, what if we track a function of the state? We call such a function an observable. An observable could be anything: the average temperature in North America, the square of the wind speed at a specific location, or some other complex property. Let's call a generic observable $g$.
When the state evolves from $x_k$ to $x_{k+1}$, the value of our observable changes from $g(x_k)$ to $g(x_{k+1})$. Since $x_{k+1} = F(x_k)$, the new value is $g(F(x_k))$. The Koopman operator, denoted $\mathcal{K}$, is defined as the operator that performs this time-evolution on the function itself. That is, the new function, which gives the future value of the observable for any starting state $x$, is $(\mathcal{K}g)(x) = g(F(x))$.
Here comes the magic. Even if the underlying dynamics are fiercely nonlinear, the Koopman operator is always perfectly linear. This seems too good to be true, but a simple argument reveals the trick. Linearity means that for any two observables $g_1$ and $g_2$ and any two numbers $a$ and $b$, the operator satisfies $\mathcal{K}(a g_1 + b g_2) = a\,\mathcal{K}g_1 + b\,\mathcal{K}g_2$. Let's check: applying the definition, $[\mathcal{K}(a g_1 + b g_2)](x) = (a g_1 + b g_2)(F(x))$.
By the very definition of how we add and scale functions, this is $a\,g_1(F(x)) + b\,g_2(F(x))$.
And recognizing the definition of the Koopman operator again, this is simply $a\,[\mathcal{K}g_1](x) + b\,[\mathcal{K}g_2](x)$.
This holds for any state $x$, so we have proven that $\mathcal{K}$ is a linear operator. The nonlinearity of the system hasn't vanished; it has been encoded into the action of this linear operator on a space of functions. We have "lifted" a finite-dimensional nonlinear problem into an infinite-dimensional linear one.
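This function-level linearity is easy to verify numerically. The sketch below uses a hypothetical nonlinear map (the logistic map, chosen purely for illustration) and two arbitrary observables, and checks that the Koopman action commutes with linear combinations:

```python
import numpy as np

# A concrete nonlinear map F (an illustrative choice): the logistic map.
F = lambda x: 3.7 * x * (1.0 - x)

# Two arbitrary observables, and the Koopman action (K g)(x) = g(F(x)).
g1 = lambda x: np.sin(x)
g2 = lambda x: x**2
K = lambda g: (lambda x: g(F(x)))

# Linearity check: K(a*g1 + b*g2) agrees with a*K(g1) + b*K(g2)
# at every sample point, even though F itself is strongly nonlinear.
a, b = 2.0, -0.5
xs = np.linspace(0.0, 1.0, 101)
lhs = K(lambda x: a * g1(x) + b * g2(x))(xs)
rhs = a * K(g1)(xs) + b * K(g2)(xs)
max_gap = np.max(np.abs(lhs - rhs))
```

The gap is zero to machine precision: the linearity lives in how functions combine, not in the dynamics themselves.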
This is a profound insight, but it comes with a catch. The space of all possible observable functions is infinite-dimensional, which is computationally intractable. To make this idea useful, we must create a finite-dimensional approximation. This is where data-driven methods like Dynamic Mode Decomposition (DMD) come into play.
The simplest possible approximation is to choose the most basic set of observables: the state variables themselves. This is the idea behind standard DMD. We assume the dynamics are approximately linear in the state space, meaning $x_{k+1} \approx A x_k$ for some matrix $A$. Given a series of snapshot pairs $(x_k, x_{k+1})$, we can find the matrix $A$ that best fits this relationship in a least-squares sense. From the Koopman perspective, this is equivalent to approximating the Koopman operator on the tiny subspace of linear observables.
However, if the system is truly nonlinear, this linear approximation can be misleading. Consider a simple system where one variable evolves according to $y_{k+1} = \mu y_k + x_k^2$, driven by a second variable that decays as $x_{k+1} = \lambda x_k$. A standard DMD analysis based on the observables $(x, y)$ would likely identify an eigenvalue around $\mu$ related to the linear decay of $y$, but it would completely miss the dynamics driven by the $x_k^2$ term. We are looking at the system through a lens that is too simple to resolve its true nature.
This leads us to a natural and powerful generalization: Extended Dynamic Mode Decomposition (EDMD). Instead of being restricted to linear observables, we can choose our own, more sophisticated set of functions. This set is called a dictionary of observables. This is like crafting a custom lens to view the dynamics, one that is sensitive to the specific nonlinearities we expect or wish to capture.
The EDMD procedure is a beautiful blend of intuition and linear algebra:
1. Choose a Dictionary: We select a finite set of observable functions, $\{\psi_1, \psi_2, \ldots, \psi_N\}$. This could include polynomials, trigonometric functions, or any other functions that we believe are relevant to the dynamics.
2. Lift the Data: We take our time-series data of the state, $x_1, x_2, \ldots, x_M$, and "lift" each snapshot into the higher-dimensional space of observables by computing the vector $\Psi(x_k) = [\psi_1(x_k), \ldots, \psi_N(x_k)]^T$ for each $k$.
3. Fit a Linear Model: We now seek a linear operator—a matrix $K$ of size $N \times N$—that best describes the evolution in this lifted space. We want to find the $K$ that minimizes the error in the approximation $\Psi(x_{k+1}) \approx K\,\Psi(x_k)$ over all our data pairs.
This is a classic linear least-squares problem. If we arrange our lifted data into two matrices, $\Psi_X = [\Psi(x_1), \ldots, \Psi(x_{M-1})]$ and $\Psi_Y = [\Psi(x_2), \ldots, \Psi(x_M)]$, we are looking for the matrix $K$ that minimizes $\|\Psi_Y - K \Psi_X\|_F$, where $\|\cdot\|_F$ is the Frobenius norm (the matrix equivalent of the Euclidean vector norm). The solution is elegantly given by:

$K = \Psi_Y \Psi_X^{+},$

where $\Psi_X^{+}$ is the Moore-Penrose pseudoinverse of $\Psi_X$. This matrix $K$ is our finite-dimensional approximation of the Koopman operator.
Let's return to our example, $x_{k+1} = \lambda x_k$, $y_{k+1} = \mu y_k + x_k^2$. If we wisely choose our dictionary to be $\{x, y, x^2\}$, EDMD will not only find the eigenvalues $\lambda$ and $\mu$ but will also correctly identify another eigenvalue of $\lambda^2$ associated with the evolution of $x^2$ (since if $x_{k+1} = \lambda x_k$, then $x_{k+1}^2 = \lambda^2 x_k^2$). By choosing the right lens, we see the dynamics clearly.
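The whole three-step procedure fits in a few lines of numpy. A minimal sketch of the example above, with arbitrarily chosen values $\lambda = 0.9$ and $\mu = 0.5$, so the expected Koopman eigenvalues are $0.9$, $0.5$, and $0.9^2 = 0.81$:

```python
import numpy as np

# The example system: x' = lam*x, y' = mu*y + x^2 (lam, mu illustrative).
lam, mu = 0.9, 0.5

def step(x, y):
    return lam * x, mu * y + x**2

# Generate one trajectory of snapshots.
M = 20
traj = [(1.0, 1.0)]
for _ in range(M):
    traj.append(step(*traj[-1]))

# Lift with the dictionary {x, y, x^2}; snapshots become columns.
lift = lambda x, y: np.array([x, y, x**2])
Psi_X = np.column_stack([lift(*s) for s in traj[:-1]])
Psi_Y = np.column_stack([lift(*s) for s in traj[1:]])

# EDMD: K = Psi_Y @ pinv(Psi_X), the least-squares fit of Psi_Y ≈ K Psi_X.
K = Psi_Y @ np.linalg.pinv(Psi_X)

# The dictionary spans a Koopman-invariant subspace, so the recovered
# eigenvalues are exact: {lam, mu, lam^2}.
eigs = np.sort(np.abs(np.linalg.eigvals(K)))
```

Because $\{x, y, x^2\}$ spans an invariant subspace here, the fit is exact up to round-off; with a poorly chosen dictionary the same code would return only an approximation.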
The power of EDMD lies entirely in the choice of the dictionary. But how do we choose a good one? This is where scientific insight and practical wisdom come together.
The theoretical ideal is to find a set of observables that span a Koopman-invariant subspace. This means that when the Koopman operator acts on any function in our dictionary, the result is another function that can be written as a linear combination of our dictionary functions. If we can find such a magical dictionary, the approximation becomes exact, and the matrix $K$ represents the true dynamics within that subspace.
In practice, finding a perfect invariant subspace is often impossible. Instead, we seek a dictionary that is approximately invariant and captures the most important aspects of the dynamics. This involves several strategies: building in physical insight (known symmetries, invariants, or energy-like quantities), drawing on flexible generic bases such as polynomials, Fourier modes, or radial basis functions, and, increasingly, learning the dictionary itself from data.
The quest for the perfect dictionary is a balancing act, a classic case of the bias-variance trade-off: a dictionary that is too small cannot express the true dynamics (high bias), while one that is too large will happily fit noise in the data and generalize poorly (high variance).
To navigate this trade-off, we borrow a powerful tool from machine learning: regularization. The idea is to add a penalty term to our least-squares objective that discourages overly complex solutions.
The strength of this regularization is a hyperparameter that must be tuned. We can't use standard cross-validation because our data is a time series with strong correlations. Doing so would be like letting a student peek at the answers before an exam. Instead, we use methods like blocked cross-validation, which respects the arrow of time by always training on the past and testing on the future.
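A hedged sketch of how these two ideas combine in practice: Tikhonov (ridge) regularization of the EDMD least-squares problem, with the penalty strength chosen on a held-out future block rather than by shuffled cross-validation. The system, noise level, and candidate penalties are all illustrative choices, not prescriptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy snapshots of the running example x' = 0.9x, y' = 0.5y + x^2.
lam, mu, M = 0.9, 0.5, 60
traj = [(1.0, 1.0)]
for _ in range(M):
    x, y = traj[-1]
    traj.append((lam * x, mu * y + x**2))

lift = lambda x, y: np.array([x, y, x**2])
Psi = np.column_stack([lift(*s) for s in traj])
Psi = Psi + 1e-4 * rng.standard_normal(Psi.shape)   # measurement noise

# Blocked split: train strictly on the past, validate on the future.
split = 40
X_tr, Y_tr = Psi[:, :split], Psi[:, 1:split + 1]
X_va, Y_va = Psi[:, split:-1], Psi[:, split + 1:]

def ridge_edmd(X, Y, alpha):
    # K = Y X^T (X X^T + alpha*I)^(-1): Tikhonov-regularized least squares.
    N = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + alpha * np.eye(N))

# Tune the penalty strength on the future block only.
alphas = [1e-8, 1e-6, 1e-4, 1e-2]
errs = [np.linalg.norm(Y_va - ridge_edmd(X_tr, Y_tr, a) @ X_va)
        for a in alphas]
best_alpha = alphas[int(np.argmin(errs))]
K_best = ridge_edmd(X_tr, Y_tr, best_alpha)
```

The key discipline is in the split: the validation block always lies strictly in the future of the training block, respecting the arrow of time.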
The idea of choosing basis functions can be taken one step further with the "kernel trick," leading to Kernel EDMD. By using a kernel function, we can implicitly work in an infinite-dimensional dictionary space without ever constructing it explicitly. This connects EDMD to the powerful world of kernel methods in machine learning, allowing for even greater flexibility in capturing complex dynamics.
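A small sketch of the kernel variant, following the standard formulation in which Gram matrices $G_{ij} = k(x_i, x_j)$ and $A_{ij} = k(y_i, x_j)$ replace the explicit lifted data, and the nonzero eigenvalues of $G^{+}A$ approximate the Koopman eigenvalues. Here an illustrative polynomial kernel $(1 + ab)^2$ on a linear toy system makes the implicit dictionary $\{1, x, x^2\}$, so the expected eigenvalues are $\{1, \lambda, \lambda^2\}$:

```python
import numpy as np

# Linear toy system x' = lam*x, sampled at 20 states.
lam = 0.9
X = np.linspace(-1.0, 1.0, 20)     # sampled states x_i
Y = lam * X                        # their successors F(x_i)

# Polynomial kernel: implicit feature map (1, sqrt(2)*x, x^2).
kernel = lambda a, b: (1.0 + np.outer(a, b))**2
G = kernel(X, X)                   # G_ij = k(x_i, x_j)
A = kernel(Y, X)                   # A_ij = k(y_i, x_j)

# Kernel EDMD: nonzero eigenvalues of pinv(G) @ A approximate the Koopman
# eigenvalues.  G has rank 3 here, so truncate its numerical null space.
K_hat = np.linalg.pinv(G, rcond=1e-10) @ A
eigs = np.sort(np.abs(np.linalg.eigvals(K_hat)))[::-1]
top3 = eigs[:3]
```

The remaining eigenvalues are numerically zero: they correspond to directions in the data that the rank-3 implicit dictionary cannot see.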
Ultimately, the goal of this entire procedure is to analyze the system. The eigenvalues of our matrix $K$ are approximations of the true Koopman eigenvalues. Their magnitudes tell a story about the stability of the system: an eigenvalue with $|\lambda| < 1$ indicates a decaying mode (stability), $|\lambda| > 1$ indicates a growing mode (instability), and $|\lambda| = 1$ points to an oscillating or neutrally stable mode.
However, we must remain humble. Consider a seemingly simple system, $x_{k+1} = x_k + c\,x_k^3$, which can model phenomena where linear analysis fails. At the fixed point $x^* = 0$, the linearized dynamics have an eigenvalue of 1, telling us nothing about stability. If we apply EDMD with a dictionary of polynomials $\{x, x^2, \ldots, x^N\}$, we find that our approximate Koopman matrix will also only have eigenvalues of 1, regardless of the true stability determined by the sign of $c$. Our polynomial lens, while powerful, is blind to the subtle dynamics at play here. The stability information is hidden, requiring a more clever choice of observables or an understanding of the operator's continuous spectrum.
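This blindness can be demonstrated directly. In the sketch below (sampling range and dictionary are illustrative choices), EDMD with the dictionary $\{x, x^3\}$ reports eigenvalues clustered at 1 whether the cubic term stabilizes ($c < 0$) or destabilizes ($c > 0$) the fixed point:

```python
import numpy as np

rng = np.random.default_rng(1)

def edmd_eigs(c, n_samples=500, r=0.1):
    # One step of x' = x + c*x^3 from many initial conditions near 0.
    x = rng.uniform(-r, r, n_samples)
    x_next = x + c * x**3
    lift = lambda s: np.vstack([s, s**3])      # dictionary {x, x^3}
    K = lift(x_next) @ np.linalg.pinv(lift(x))
    return np.linalg.eigvals(K)

# Stabilizing (c < 0) or destabilizing (c > 0), the truncated polynomial
# dictionary reports eigenvalues clustered at 1 either way: this lens
# cannot distinguish the two stability types near the fixed point.
eigs_stable = edmd_eigs(-1.0)
eigs_unstable = edmd_eigs(+1.0)
```

Both spectra hug the unit circle; nothing in them reveals which system spirals in and which blows up.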
This is the nature of scientific discovery. EDMD provides a powerful framework for imposing linearity on a nonlinear world, turning difficult problems into manageable linear algebra. But it is not an automatic machine. It is a tool that, when guided by physical intuition, mathematical rigor, and a healthy respect for the subtleties of nature, allows us to see the beautiful, simple patterns hidden within the complexity.
In our previous discussion, we uncovered the beautiful core of Extended Dynamic Mode Decomposition (EDMD). We saw how, by viewing a system through the right set of "goggles"—our dictionary of observables—we can take the tangled, unpredictable motions of a nonlinear world and see them as clean, orderly, linear transformations. This is the magic of the Koopman operator viewpoint. It's a profound mathematical trick, but is it just a trick? Or can we use this "magic lens" to do real work, to understand and shape the world around us?
The answer, it turns out, is a resounding yes. The true power of this framework reveals itself not in abstract theorems, but in its remarkable ability to solve concrete problems across a breathtaking range of disciplines. From steering a spacecraft to mapping the fate of a living cell, EDMD provides a unifying language. Let us now embark on a journey through some of these applications, to see how this one elegant idea blossoms into a thousand practical tools.
Perhaps the most immediate application of turning a nonlinear system into a linear one is control. The entire edifice of modern control theory is built on the bedrock of linear systems. Designing a controller for a system described by $x_{k+1} = A x_k + B u_k$ is a well-understood, powerful art. But what if your system—a robot arm, a chemical reactor, a power grid—doesn't obey such a simple law?
This is where EDMD shines. Imagine you have a complex, nonlinear plant whose inner workings are a mystery. By observing its behavior—collecting data of its state $x_k$ and the inputs $u_k$ we apply, and seeing the resulting state $x_{k+1}$—we can use EDMD to build an approximate linear model, not in the original state space, but in our lifted space of observables $z = \Psi(x)$. The result is a simple, effective surrogate model of the form $z_{k+1} = A z_k + B u_k$.
This is the heart of a Digital Twin, a virtual replica of a physical system. We can use this linear model to run simulations thousands of times faster than the real system. For instance, in Model Predictive Control (MPC), we can ask our linear model: "Given our current state, what is the sequence of inputs over the next few seconds that will get us closest to our goal?" Because the model is linear, finding this optimal action is often a fast and reliable computation. We can then apply the first step of this optimal plan to the real system, observe the new state, and repeat the process, continually steering the complex reality using our simple linear map.
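The planning step at the heart of this loop can be sketched in a few lines. The lifted matrices $A$ and $B$ below are hypothetical placeholders (in practice they come from an EDMD fit with inputs), and only the open-loop solve is shown, not the receding-horizon repetition:

```python
import numpy as np

# Hypothetical lifted model z' = A z + B u (values chosen for illustration,
# not produced by an actual EDMD fit).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
H = 5                                 # planning horizon
z0 = np.array([1.0, 0.5])
z_target = np.zeros(2)

# z_H = A^H z0 + [A^(H-1) B, ..., A B, B] @ [u_0, ..., u_(H-1)].
# Because the surrogate is linear, the best input sequence is a single
# least-squares solve rather than a nonlinear program.
G = np.column_stack([np.linalg.matrix_power(A, H - 1 - i) @ B
                     for i in range(H)])
free = np.linalg.matrix_power(A, H) @ z0
u_opt, *_ = np.linalg.lstsq(G, z_target - free, rcond=None)

# Roll the plan forward through the surrogate model.
z = z0.copy()
for i in range(H):
    z = A @ z + B @ u_opt[i:i + 1]
```

In a real MPC loop only `u_opt[0]` would be applied before re-measuring the plant and re-solving.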
But a subtle and beautiful question arises. If we are controlling the system in this abstract, lifted space of $z$, are we truly controlling the physical state $x$? This brings us to the concept of lifted controllability. It's not enough for our matrix pair $(A, B)$ to be mathematically controllable. We must also ensure that controlling $z$ gives us authority over $x$. This hinges on our choice of dictionary. If our observables include the original state variables themselves (e.g., $\psi_i(x) = x_i$), then controlling the vector $z$ directly implies control over $x$. The power to steer the shadow implies the power to steer the object, but only if the shadow faithfully represents the object in the first place.
Furthermore, the influence of control isn't always a simple additive push. Sometimes, the input changes the system's internal dynamics in a more complex, multiplicative way. EDMD's flexibility allows us to capture this too. We can enrich our dictionary not just with functions of the state $x$, but also with functions of the input $u$, and even with cross-terms that model the interaction between state and input, such as those formed by a Kronecker product. By building a regressor that includes terms like $\Psi(x)$, $u$, and $\Psi(x) \otimes u$, we can create models that capture rich, bilinear dynamics, giving us a far more nuanced and powerful handle on the system we wish to control.
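For a scalar state and input, the Kronecker cross-term reduces to a simple product, which makes the idea easy to see. A sketch with a hypothetical bilinear plant whose coefficients the regression recovers exactly:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scalar plant with a multiplicative input effect:
# x' = a*x + b*u + c*x*u  (a, b, c unknown to the modeler).
a, b, c = 0.8, 0.3, 0.5
x = rng.uniform(-1, 1, 200)
u = rng.uniform(-1, 1, 200)
x_next = a * x + b * u + c * x * u

# Regressor with state, input, and the cross-term x⊗u (for vectors this
# would be np.kron of the lifted state with the input; for scalars it is
# just the product x*u).
Phi = np.vstack([x, u, x * u])
coeffs = x_next @ np.linalg.pinv(Phi)
```

Because the plant is exactly linear in these three features, the least-squares fit recovers $(a, b, c)$ to machine precision; a real plant would add truncation error on top.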
While control is about changing a system, science is about understanding it. Here, too, EDMD provides a revolutionary new lens. It allows us to distill the essential dynamic modes—the fundamental patterns of behavior—from complex, high-dimensional data.
Let's start at the smallest of scales, in the world of nanomechanics. Imagine an Atomic Force Microscope (AFM), where a tiny, vibrating cantilever "feels" a surface. The force between the tip and the sample is a highly nonlinear function of their separation, making the cantilever's motion complex. By choosing a dictionary of observables that consists of polynomials of the tip's position and velocity, we can use EDMD to transform this nonlinear oscillation into a linear system in a higher-dimensional space. The eigenvalues of our learned Koopman matrix then reveal the true frequencies and damping rates of the system, including the subtle shifts caused by the nonlinear tip-sample interaction. This works because, as mathematicians like Weierstrass taught us, smooth nonlinear functions can be wonderfully approximated by polynomials. EDMD leverages this principle to "linearize" the dynamics, but it also teaches us a lesson in humility: our model is only as good as our data. If the cantilever never actually touches the surface during our experiment, the data will contain no information about the contact forces, and our model will be blind to them, no matter how clever our dictionary is.
Now, let's scale up from a single vibrating tip to a network of interacting nodes—be it a network of neurons, a power grid, or a social network. Consider a simple two-node network where the dynamics are nonlinear. A standard linear analysis might fail to see any connection between the nodes. However, EDMD, equipped with a nonlinear dictionary, might reveal a hidden pathway. For instance, if the state of node 1 depends on the square of the state of node 2, adding $x_2^2$ to our dictionary of observables suddenly makes this connection visible in our lifted linear model. But this also reveals a profound limitation. If we can only observe $x_2^2$, we can never know the sign of $x_2$. Two different initial states, $x_2$ and $-x_2$, will be indistinguishable. Our Koopman lens can reveal hidden structures, but it can also project different realities onto the same image. Understanding what is lost in this projection is as important as understanding what is gained.
For truly massive networks, a generic polynomial dictionary is too clumsy. Here, we must be more creative, tailoring our observables to the very structure of the problem. If we are studying a system on a graph, why not use observables derived from the graph itself? By using graph-structured observables, such as those based on heat diffusion patterns or the graph's own "Fourier modes" (the eigenvectors of its Laplacian), we can tune our analysis to the geometry of the network. This allows us to identify localized phenomena—like a fault propagating through a small section of a power grid—that would be washed out in a global analysis. It is the ultimate expression of choosing the right "goggles": we shape our lens to match the contours of the world we wish to see.
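A sketch of this idea on a toy path graph: the dictionary is the graph Fourier basis (eigenvectors of the Laplacian), and for linear heat diffusion the lifted EDMD model becomes diagonal, with Koopman eigenvalues $1 - \epsilon\lambda_i$. The graph, step size, and dynamics are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# A small path graph: adjacency W and Laplacian L = D - W.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

# The graph's "Fourier modes" are the Laplacian's eigenvectors.
lap_eigs, U = np.linalg.eigh(L)

# Heat diffusion on the graph: x' = (I - eps*L) x.  In the dictionary of
# graph Fourier modes, psi(x) = U^T x, these dynamics are diagonal, so
# EDMD recovers the Koopman eigenvalues 1 - eps*lambda_i.
eps = 0.1
X = rng.standard_normal((n, 50))          # random node-state snapshots
Y = (np.eye(n) - eps * L) @ X

Psi_X, Psi_Y = U.T @ X, U.T @ Y
K = Psi_Y @ np.linalg.pinv(Psi_X)
koop_eigs = np.sort(np.linalg.eigvals(K).real)
```

Each Koopman eigenvalue pairs with one spatial pattern on the graph, so slowly decaying modes can be traced back to the regions of the network they live on.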
The true triumph of a fundamental idea is its ability to bridge disparate fields, to show that the same principles govern the behavior of a machine and a living organism. EDMD achieves just this, providing a common framework for understanding complex dynamics everywhere.
In modern engineering, the concept of a Digital Twin is paramount. The goal is to have a living, breathing software model that mirrors a real-world asset like a jet engine or a wind turbine. EDMD is central to this vision. We can build the entire twin from scratch using data, creating a complete operational pipeline: raw sensor data is ingested and synchronized, a Koopman model is continuously calibrated on a sliding window of recent behavior, and this model is used in a real-time prediction-and-update loop (much like a Kalman filter) to track the health of the physical asset and forecast its future.
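The sliding-window calibration in that pipeline can be caricatured with a scalar plant whose dynamics drift as the asset ages (all numbers are illustrative): a model refit on a recent window tracks the drift, while a once-calibrated static model falls behind:

```python
import numpy as np

rng = np.random.default_rng(4)

# A toy asset whose dynamics drift slowly as it ages:
# x' = a_t * x + u, with a_t sliding from 0.9 down to 0.5.
T = 200
a_t = np.linspace(0.9, 0.5, T)
u = rng.uniform(-1.0, 1.0, T)             # known excitation input
x = np.empty(T + 1)
x[0] = 0.0
for t in range(T):
    x[t + 1] = a_t[t] * x[t] + u[t]

def fit_a(xs, ys, us):
    # Least-squares fit of a in y = a*x + u (the input is known).
    return ((ys - us) @ xs) / (xs @ xs)

# Static twin: calibrated once on the full history.
a_static = fit_a(x[:-1], x[1:], u)

# Sliding-window twin: recalibrated each step on the last W samples.
W = 20
err_static = err_rolling = 0.0
for t in range(W, T):
    a_roll = fit_a(x[t - W:t], x[t - W + 1:t + 1], u[t - W:t])
    err_rolling += (x[t + 1] - (a_roll * x[t] + u[t]))**2
    err_static += (x[t + 1] - (a_static * x[t] + u[t]))**2
```

The rolling model's one-step prediction error stays far below the static model's, which is the whole argument for continuous recalibration in a digital twin.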
Even more powerfully, EDMD can be used to augment, rather than replace, our existing knowledge. Often, we have a physics-based model of a system that is good, but not perfect. It captures the main behavior but misses subtle nonlinear effects or unmodeled disturbances. We can use EDMD not to model the whole system, but to model the residual: the difference between what our physics model predicts and what the real world does. This creates a hybrid model that combines the strength of first principles with the flexibility of data-driven learning, resulting in a digital twin that is both physically grounded and astonishingly accurate.
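A minimal residual-learning sketch, with an invented toy system: the "physics" model knows only the linear part, and EDMD-style regression on the dictionary $\{x, x^2\}$ learns the leftover quadratic correction:

```python
import numpy as np

rng = np.random.default_rng(5)

# "Truth": x' = 0.9x + 0.05x^2.  Our physics model only knows the linear
# part; the data-driven piece models what the physics misses.
truth = lambda x: 0.9 * x + 0.05 * x**2
physics = lambda x: 0.9 * x

x = rng.uniform(-1.0, 1.0, 100)
x_next = truth(x)
residual = x_next - physics(x)            # what physics fails to predict

# Fit the residual in the dictionary {x, x^2}.
Phi = np.vstack([x, x**2])
w = residual @ np.linalg.pinv(Phi)        # learned correction weights

# Hybrid model: first principles plus learned correction.
hybrid = lambda s: physics(s) + w[0] * s + w[1] * s**2

x_test = np.linspace(-1.0, 1.0, 21)
err_physics = np.max(np.abs(truth(x_test) - physics(x_test)))
err_hybrid = np.max(np.abs(truth(x_test) - hybrid(x_test)))
```

The correction is tiny and structured, which is exactly why residual learning is easier and safer than modeling the whole plant from scratch.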
And now, for the most astonishing leap. Let's take this same mathematical machinery and point it at the fundamental processes of life itself. In computational systems biology, scientists collect vast datasets from single-cell experiments. Each cell's state can be represented as a point in a high-dimensional space of gene expression. By using techniques like RNA velocity to infer which states lead to which, we get snapshot pairs of cellular dynamics—exactly the kind of data EDMD thrives on.
By applying EDMD to this data, we can reconstruct the underlying "dynamical landscape" that governs cell development. The Koopman eigenvalues tell a profound story: eigenvalues with magnitude near one correspond to slow, long-lived processes—the stable cell types, or attractors, in which cells ultimately settle—while eigenvalues of smaller magnitude capture the fast, transient dynamics of differentiation along the way.
By choosing our dictionary wisely—for example, using localized functions like radial basis functions to separate different attractor basins—we can draw a map of cellular destiny, identifying the paths cells take and the choices they make, all from passively observing their gene expression. It is a breathtaking application, showing that the modes and eigenvalues of a Koopman operator can encode the very logic of life.
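A sketch of the radial-basis-function idea on a one-dimensional bistable toy map with attractors at $\pm 1$ (a cartoon of two competing cell fates; the centers, widths, and dynamics are all illustrative). The check is simply that the localized dictionary lets a linear lifted model reproduce one-step evolution accurately across both basins:

```python
import numpy as np

rng = np.random.default_rng(6)

# A bistable 1-D toy map, x' = x + 0.2*x*(1 - x^2), with attractors at
# x = -1 and x = +1: a cartoon of two competing cell fates.
step = lambda x: x + 0.2 * x * (1.0 - x**2)

# Gaussian radial basis functions tiled across the state space.
centers = np.linspace(-1.5, 1.5, 15)
width = 0.3
lift = lambda x: np.exp(-(x[:, None] - centers[None, :])**2
                        / (2.0 * width**2)).T     # shape (15, n_samples)

X = rng.uniform(-1.5, 1.5, 300)
Y = step(X)

Psi_X, Psi_Y = lift(X), lift(Y)
K = Psi_Y @ np.linalg.pinv(Psi_X)

# A localized dictionary resolves both basins; check that the lifted
# linear model reproduces one-step evolution accurately.
rel_err = np.linalg.norm(Psi_Y - K @ Psi_X) / np.linalg.norm(Psi_Y)
```

Because each RBF "lights up" only in its own neighborhood, the fitted modes can separate the two basins instead of averaging over them, which is what a global polynomial basis would tend to do.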
From the practicalities of control, through the deep questions of physics, to the blueprint of biology, Extended Dynamic Mode Decomposition offers more than just a method. It offers a perspective—a way of seeing the simple, linear order that lies hidden just beneath the chaotic surface of the nonlinear world.