
In the world of scientific computing, many of the most challenging problems—from forecasting weather to designing aircraft—boil down to solving enormous systems of linear equations. The Generalized Minimal Residual (GMRES) method stands as one of the most powerful and robust iterative algorithms designed for this task. It offers a theoretically perfect path to a solution, but this perfection comes at an often-untenable price: a voracious appetite for computer memory. This practical constraint forces a critical compromise, giving rise to the workhorse algorithm known as restarted GMRES.
This article delves into the dual nature of restarted GMRES, exploring both its utility and its inherent weaknesses. We will navigate the trade-offs between computational resources and algorithmic performance, revealing why a simple modification to save memory can lead to a frustrating dead end. First, the chapter on Principles and Mechanisms will unpack the core dilemma of memory versus perfection, explain the dangerous phenomenon of stagnation through a clear example, and introduce the elegant strategies—such as preconditioning, flexibility, and recycling—developed to overcome it. Following this, the chapter on Applications and Interdisciplinary Connections will showcase how this versatile 'black-box' solver becomes the engine for tackling complex nonlinear and time-dependent problems across a multitude of scientific and engineering domains.
Imagine you're trying to solve an incredibly complex puzzle, a system of millions of interwoven equations describing something like the airflow over a wing or the heat distribution in a processor. The Generalized Minimal Residual (GMRES) method is a masterful strategy for such puzzles. At its heart, it's an iterative process of refinement. You start with a guess, check how wrong it is (this is called the residual), and then intelligently choose a direction to correct your guess. You repeat this, getting closer and closer to the true answer.
The "full" version of GMRES is, in a sense, perfect. With each step, it builds an ever-expanding library of search directions, a so-called Krylov subspace. Each new direction is generated by seeing how the system's governing matrix, let's call it A, transforms the previous direction. After k steps, GMRES has a library of k directions. It then looks at every possible combination of these directions to find the single best update, the one that minimizes the error in a way no other combination can. This method is so powerful that, in a world of exact arithmetic, it is guaranteed to find the exact solution in at most n steps, where n is the number of equations.
But here lies a great dilemma, a classic clash between theoretical perfection and practical reality.
Each of those "search directions" isn't just a number; it's a list of millions of values, a vector in a high-dimensional space. Storing the entire, ever-growing library of these vectors consumes a colossal amount of computer memory. For a problem with a million variables, taking just 300 steps would require storing 301 vectors of this size. In double precision, that could easily gobble up over 2.4 gigabytes of memory for the library alone, not to mention the storage for the puzzle matrix itself. In many real-world scenarios, we simply don't have that much memory to spare, or the cost of accessing it becomes a bottleneck.
So, we are forced to make a compromise. This compromise is called restarted GMRES, or GMRES(m). The idea is brilliantly simple, if a little brutal. We decide on a fixed library size, say m = 50. We run the GMRES process for m steps, building our library of 50 directions. We find the best possible correction using this limited library. Then, to keep from running out of memory, we do something drastic: we throw the entire library away. All we keep is our new, improved guess. Then we start the process all over again from that new position, building a fresh library for the next m steps.
This trade-off is stark and quantifiable. That same problem that required 2.4 GB for 300 steps of full GMRES would, with GMRES(50), require only about 400 MB for its library, regardless of how many restart cycles are run. This is a massive, roughly six-fold reduction in memory, often making the difference between a problem being solvable or completely intractable. We've traded the guarantee of perfection for the possibility of a solution. But this "amnesia"—this act of forgetting our hard-won search directions at every cycle—comes with a hidden and sometimes terrible price.
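These figures are easy to re-derive. The quick sanity check below assumes one million double-precision unknowns, which is what the 2.4 GB and 400 MB numbers above imply:

```python
# Memory for the Krylov "library", assuming one million double-precision unknowns
n = 1_000_000
bytes_per_vector = 8 * n                      # 8 bytes per double

full_gb = 301 * bytes_per_vector / 1e9        # full GMRES: 300 steps keep 301 vectors
restarted_gb = 51 * bytes_per_vector / 1e9    # GMRES(50): at most 51 vectors, ever

print(full_gb, restarted_gb)                  # ~2.4 GB versus ~0.4 GB
```

The restarted figure is independent of how long the solver runs, which is the whole point: the memory footprint is fixed by m, not by the total number of iterations.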
What happens when the information we discard is precisely the information we need to move forward? This can lead to a frustrating phenomenon called stagnation, where the algorithm spins its wheels, making virtually no progress cycle after cycle.
This is especially true for certain "nasty" matrices, called non-normal matrices, which frequently appear in problems involving flow, like our convection-diffusion example. Think of a normal matrix as a simple grid of streets; moving north always takes you north. A non-normal matrix is more like a city with powerful, swirling winds; trying to go north might initially push you east before you start heading in the right direction. The short-term behavior can be counter-intuitive and complex.
Let's look at a wonderfully simple, almost devious, puzzle that lays this problem bare. Consider a system of four equations governed by a matrix A that simply shuffles the components of a vector: component 1 moves to position 2, 2 to 3, 3 to 4, and 4 back to 1.
Suppose we want to find a solution x so that A x equals the vector b = (1, 0, 0, 0); the answer is x = (0, 0, 0, 1), since shuffling it forward brings the 1 back to the first position. We start with a guess of x0 = 0. The initial error, or residual, is just r0 = b, the first coordinate vector e1. Now we apply GMRES(2), meaning our library size is just two.
The first direction in our library is the normalized residual, v1 = e1. The second direction is what we get by applying our shuffling matrix A, which gives v2 = A e1 = e2. So our library, {v1, v2}, spans the first two coordinate directions. GMRES seeks a correction by looking at where these library vectors are sent by A. The matrix sends e1 to e2 and e2 to e3. The "search space" for the correction is therefore spanned by the second and third coordinate directions.
Here's the trap. Our current residual, e1, points entirely along the first coordinate axis. The algorithm is looking for a correction whose effect lies in the space of the second and third axes. These two spaces are completely orthogonal! The best possible update it can find is... zero. The solution doesn't improve. At all. After the cycle, we restart. But since our solution hasn't changed, the new residual is identical to the old one. We run the next cycle, build the same library, look in the same fruitless space, and get stuck again. The algorithm has stagnated completely, and the residual will remain e1 forever.
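This complete stagnation is easy to reproduce. The sketch below writes one GMRES(2) cycle out by hand in NumPy (a teaching toy, not a production solver): it builds the two-vector library, computes the minimal-residual correction by least squares, and shows the residual norm refusing to budge across restarts:

```python
import numpy as np

# The 4x4 "shuffle" matrix: component 1 -> 2, 2 -> 3, 3 -> 4, 4 -> 1.
A = np.roll(np.eye(4), 1, axis=0)
b = np.array([1.0, 0.0, 0.0, 0.0])   # so r0 = b = e1 for the guess x0 = 0
x = np.zeros(4)

norms = []
for cycle in range(3):               # three restart cycles of GMRES(2)
    r = b - A @ x
    v1 = r / np.linalg.norm(r)       # first library vector
    v2 = A @ v1                      # second library vector (already unit length)
    V = np.column_stack([v1, v2])
    # Minimal-residual update: least-squares fit of r over span{A v1, A v2}
    y, *_ = np.linalg.lstsq(A @ V, r, rcond=None)
    x = x + V @ y                    # y turns out to be zero: no progress at all
    norms.append(np.linalg.norm(b - A @ x))

print(norms)                         # the residual norm stays at 1.0 every cycle
```

The least-squares step finds y = 0 because r = e1 is orthogonal to span{e2, e3}, exactly as the text describes.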
This isn't just a theoretical curiosity. This kind of behavior can happen in real problems. By discarding its library, the algorithm forgets the "history" or "transient behavior" that would reveal the true path forward. Geometrically, the new search space it builds can be almost entirely orthogonal to the one it just threw away, representing a near-total loss of useful information.
How do we escape this trap? The most direct approach is to increase the library size, m. A larger library gives the algorithm a longer memory, making it less likely to be fooled by short-term transient effects. (In our shuffle example, GMRES(4) would find the exact solution in its first cycle.) But this brings back our original memory problem. A more elegant solution is to change the puzzle itself.
This is the art of preconditioning. Instead of solving the difficult system A x = b, we solve a related, but easier, system. With right preconditioning, we find an auxiliary matrix M (the preconditioner) and solve the system A M⁻¹ y = b, recovering x = M⁻¹ y at the end. The idea is to choose M so that the new matrix, A M⁻¹, is much better behaved—less non-normal, with its crucial properties (eigenvalues) clustered nicely away from zero. A good preconditioner is like putting on a pair of glasses that makes the distorted, swirling puzzle appear clear and straightforward.
One of the beautiful aspects of right preconditioning is that it doesn't cheat. The GMRES algorithm applied to the preconditioned system minimizes the norm of the preconditioned residual, ‖b − A M⁻¹ y‖. By substituting the original variable back in (x = M⁻¹ y), this is mathematically identical to ‖b − A x‖, the norm of the true residual. We are solving an easier problem while still tracking our progress on the real one.
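The identity is pure substitution, so it can be confirmed in a few lines. The sketch below uses an arbitrary well-conditioned test matrix and a simple diagonal (Jacobi) preconditioner, both chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
A = np.eye(n) * 3 + rng.standard_normal((n, n)) / np.sqrt(n)
M = np.diag(np.diag(A))              # a simple diagonal (Jacobi) preconditioner
b = rng.standard_normal(n)

y = rng.standard_normal(n)           # any iterate of the preconditioned system
x = np.linalg.solve(M, y)            # the corresponding original variable x = M^-1 y

precond_res = np.linalg.norm(b - A @ np.linalg.solve(M, y))
true_res = np.linalg.norm(b - A @ x)
print(precond_res - true_res)        # zero: the two norms are the same quantity
```

Whatever progress GMRES reports on the preconditioned system is, by this identity, honest progress on the original one.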
But what if our "glasses" are magical and change their prescription from moment to moment? Some of the most powerful preconditioners, like certain multigrid methods, are not fixed linear operators; they can be nonlinear or change at every single step. Standard GMRES, which relies on the repeated application of a fixed operator to build its library, breaks down completely.
This requires a brilliant adaptation: Flexible GMRES (FGMRES). FGMRES embraces the variability. Instead of assuming the library is built from a fixed operator, it explicitly stores the direction vectors that result from applying the variable preconditioner at each step; call them z1, z2, and so on. The final solution is then built as a combination of these vectors. It requires a bit more storage to save the z's, but in return, it regains the precious residual-minimizing property, even with a shifty, changing preconditioner. It's a testament to the robustness of the core idea: if you can't rely on a fixed structure, just store what you actually did and optimize over that.
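A compact, unoptimized sketch of one FGMRES(m) cycle makes the mechanism concrete (assumptions: x0 = 0, no breakdown handling, and a deliberately step-dependent toy preconditioner invented for the demo):

```python
import numpy as np

def fgmres(A, b, precond, m):
    """One FGMRES(m) cycle: Arnoldi on A, but the preconditioned directions
    z_j = precond(v_j, j) are stored explicitly so the preconditioner may
    change at every step. Assumes x0 = 0 and no breakdown."""
    n = b.size
    V = np.zeros((n, m + 1)); Z = np.zeros((n, m))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(m):
        Z[:, j] = precond(V[:, j], j)        # variable preconditioning step
        w = A @ Z[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m + 1); e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)   # minimize the residual norm
    return Z @ y                                  # solution built from the z's

rng = np.random.default_rng(0)
n = 60
A = np.eye(n) * 3 + rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
# A "shifty" preconditioner: a Jacobi-like scaling that changes every step.
precond = lambda r, j: r / (3.0 + 0.1 * (j % 3))
x = fgmres(A, b, precond, 40)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))   # small relative residual
```

Note the one structural difference from standard GMRES: the solution is assembled from the stored Z, not from V, which is exactly the "store what you actually did" idea.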
This brings us to a final, beautiful synthesis. We started by restarting to save memory, but found it caused amnesia. Preconditioning helps, but is there a way to restart without being so forgetful?
The answer is yes, through a family of methods that practice Krylov subspace recycling. Instead of throwing away the entire library at the end of a cycle, we perform a sort of intellectual triage. We intelligently identify the most important "books" in the library—the search directions that correspond to the most troublesome, slowly converging aspects of our puzzle—and carry them over to the next cycle.
How does the algorithm know which directions are important? Amazingly, the information is already there, hidden in plain sight. During its m steps, GMRES builds a small "summary" matrix, the Hessenberg matrix H. This small matrix is a coarse-grained portrait of the enormous matrix A. By analyzing H, we can compute special vectors called harmonic Ritz vectors, which are wonderfully accurate approximations of the directions that cause stagnation.
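The flavor of this can be seen with plain Ritz values (harmonic Ritz vectors are extracted in the same spirit, from the same Hessenberg matrix). In the sketch below, 20 Arnoldi steps on a 300-by-300 test matrix, built with one deliberately planted outlier eigenvalue, produce a 20-by-20 Hessenberg summary whose eigenvalues already pin that outlier down:

```python
import numpy as np

def arnoldi(matvec, b, m):
    """m Arnoldi steps: returns the basis V (n x (m+1)) and Hbar ((m+1) x m)."""
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):          # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(0)
n, m = 300, 20
A = np.diag(np.linspace(1.0, 10.0, n))
A[0, 0] = 100.0                          # one planted outlier eigenvalue
b = rng.standard_normal(n)

V, H = arnoldi(lambda v: A @ v, b, m)
ritz = np.linalg.eigvals(H[:m, :m])      # Ritz values: eigenvalues of the summary
print(np.max(ritz.real))                 # very close to the outlier, 100
```

The huge matrix never has to be diagonalized; its troublesome spectral features leak into the small summary for free, which is what recycling methods exploit.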
The next GMRES cycle then begins with an augmented search space: this small, curated collection of "recycled" problem-vectors, plus a fresh set of new directions generated in the usual way. This is known as deflated or augmented GMRES. The effect is profound. The algorithm is "deflating" the problem by explicitly handling the most difficult parts with the recycled vectors. The iterative part is then free to work on the remaining, much easier, part of the puzzle. It no longer wastes precious cycles re-learning the same difficult information over and over again. It builds on its past knowledge.
Krylov subspace recycling elegantly bridges the gap between the theoretical perfection of full GMRES and the practical constraints that force us to restart. It is a story of acknowledging a limitation and turning it into a strength, creating a method that is both memory-efficient and intelligent—an algorithm that, finally, learns from its past.
After our journey through the inner workings of the Generalized Minimal Residual method, you might be left with a question that animates all of science: "That's very clever, but what is it good for?" The answer, in the case of GMRES, is wonderfully broad. Because of its unique design, GMRES is not just another tool in the numerical toolbox; it is a master key, capable of unlocking problems across a breathtaking range of scientific and engineering disciplines. Its power stems from a single, profound principle that we must appreciate before we see it in action.
Most numerical methods for solving a linear system require you to have the matrix in your hands, to see its entries, and to manipulate them. For the enormous problems at the frontier of science, this is often an impossible demand. Imagine trying to model the airflow over a new aircraft wing. The "matrix" that connects the velocity and pressure at millions of points on a grid might have trillions of entries, far too many to store on any computer. Or, in some problems in quantum mechanics, the action of the operator on a vector is the result of another complex simulation, not a simple table of numbers.
Here is where the genius of Krylov subspace methods, and GMRES in particular, shines. GMRES doesn't need to know the matrix A. It only needs a function—a "black box"—that, when you give it a vector v, returns the product A v. This is a revolutionary shift in perspective. The problem is no longer "what the matrix is," but "what the matrix does." This matrix-free approach liberates us. As long as we can simulate the action of our physical system on some state, we can wrap GMRES around it and try to find a solution. This makes GMRES a universal engine for discovery, applicable to any process that can be modeled as a linear operator.
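In SciPy, this black-box contract is expressed by a LinearOperator: you hand the solver a matvec function, never a matrix. A minimal sketch, where the "physics" is an assumed periodic diffusion-like stencil chosen just to keep the demo well-conditioned:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

n = 100
def matvec(v):
    """The 'black box': apply (I + L) where L is a periodic 1-D Laplacian.
    No matrix is ever stored, only this rule v -> A v."""
    Lv = 2.0 * v - np.roll(v, 1) - np.roll(v, -1)
    return v + Lv

A = LinearOperator((n, n), matvec=matvec, dtype=np.float64)
rng = np.random.default_rng(0)
b = rng.standard_normal(n)

x, info = gmres(A, b, restart=20)        # GMRES only ever calls matvec
print(info, np.linalg.norm(b - matvec(x)) / np.linalg.norm(b))
```

info == 0 signals convergence; the same wrapper works even when matvec internally runs a whole simulation rather than a stencil.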
The real world is rarely linear. The flow of water, the bending of steel, the reactions in a chemical soup—these are all fundamentally nonlinear phenomena. We typically model them with a nonlinear system of equations, F(x) = 0. One of the most powerful techniques for solving such systems is Newton's method, where we start with a guess x_k and iteratively improve it by solving a linear approximation:

J(x_k) d_k = -F(x_k),   followed by the update   x_{k+1} = x_k + d_k.

Here, J(x_k) is the Jacobian matrix, which represents the best linear approximation of the system around the current guess, and d_k is the update step. At each step of this "outer" Newton iteration, we must solve a new, large linear system. This is the perfect job for an "inner" iterative solver. This combination is called a Newton-Krylov method, and it is the workhorse of modern computational fluid dynamics, structural analysis, and countless other fields.
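SciPy packages this combination as scipy.optimize.newton_krylov, which never forms the Jacobian: its inner Krylov solver only needs Jacobian-vector products, approximated by finite differences of F. A toy sketch on an invented cubic system (chosen purely for illustration):

```python
import numpy as np
from scipy.optimize import newton_krylov

# Toy nonlinear system F(x) = x + 0.1 x^3 - b = 0, solved matrix-free:
# the inner GMRES sees J(x) only through finite-difference products J(x) v.
b = np.linspace(0.0, 1.0, 20)
F = lambda x: x + 0.1 * x**3 - b

sol = newton_krylov(F, np.zeros_like(b), method="gmres", f_tol=1e-10)
print(np.max(np.abs(F(sol))))            # nonlinear residual driven near zero
```

Each outer Newton step triggers an inner GMRES solve, which is exactly where the restart-parameter trade-offs discussed next come into play.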
GMRES is a star player in this role. The Jacobians that arise from discretizations of transport phenomena, like convection-diffusion equations, are often strongly nonsymmetric and non-normal. For these difficult matrices, the smooth, monotonic residual reduction of GMRES provides a robustness that more flighty methods like BiCGSTAB may lack. However, this partnership reveals the delicate art of using restarted GMRES. If the restart parameter m is too small, the inner GMRES solver may stagnate and fail to find a good enough step, causing the outer Newton method to grind to a halt and lose its celebrated fast convergence. Conversely, choosing m too large consumes memory and, due to the accumulation of rounding errors in finite precision, can lead to its own numerical instabilities. The computational scientist must act as a skilled navigator, choosing a restart length that balances robustness, memory, and accuracy—a central challenge in the practical application of these powerful methods.
Many of the most exciting simulations are not static pictures but movies—journeys through time. Consider modeling the weather, the spread of a pollutant, or the vibrations in a bridge. At each tiny step forward in time, we must solve a linear system A_n x_n = b_n. The beauty is that the system at time step n is only slightly different from the one at step n - 1.
A naive approach would be to solve each system from scratch. But that's like waking up every morning with amnesia. Why not use what we learned yesterday? The simplest way is a "warm start": use the solution from the previous time step, x_{n-1}, as the initial guess for the current one. This often works beautifully. However, in some problems, like those dominated by advection, where features are physically transported, the old solution can be a poor guess for the new one. A more sophisticated initial guess might involve physically shifting the old solution to where you expect it to be now.
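A warm start is usually a one-argument change in a solver library. The sketch below (an assumed well-conditioned random test matrix standing in for a time-stepping operator) counts inner GMRES iterations for a cold start versus a warm start after a small change in the right-hand side:

```python
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 200
A = np.eye(n) * 4 + rng.standard_normal((n, n)) / np.sqrt(n)
b1 = rng.standard_normal(n)
b2 = b1 + 1e-4 * rng.standard_normal(n)   # "next time step": nearly the same system

def solve(b, x0):
    count = []                             # callback fires once per inner iteration
    x, info = gmres(A, b, x0=x0, restart=20,
                    callback=lambda _: count.append(1))
    return x, len(count)

x1, _ = solve(b1, np.zeros(n))             # step 1, cold start
_, cold_iters = solve(b2, np.zeros(n))     # step 2 from scratch, for comparison
_, warm_iters = solve(b2, x1)              # step 2 warm-started from x1
print(cold_iters, warm_iters)              # warm start needs far fewer iterations
```

Because x1 already nearly solves the perturbed system, the warm-started solve begins with a tiny residual and finishes in a fraction of the iterations.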
But we can do even better. GMRES, in the process of solving the system at step n, discovers the "hard parts" of the problem—the components of the solution that are slow to converge. These are often associated with approximate invariant subspaces of the operator. When we restart GMRES, or move to the next time step, this "wisdom" is typically thrown away. Subspace recycling methods are a brilliant innovation designed to prevent this. They extract information about these hard-to-solve components and "recycle" it, feeding it into the GMRES solver for the next time step. This allows the solver to focus its efforts on the truly new parts of the problem, dramatically accelerating simulations of transient phenomena.
This same logic applies across disciplines. In computational chemistry, the Polarizable Continuum Model (PCM) is used to study how a solvent affects a molecule. Depending on the specific mathematical formulation chosen—for example, a Galerkin versus a collocation approach—the resulting matrix can be symmetric positive definite or nonsymmetric. In the first case, the elegant Conjugate Gradient method is the tool of choice. In the second, the system is a perfect candidate for GMRES. Similarly, in computational mechanics, certain advanced discretization techniques or block preconditioning strategies for complex problems like Stokes flow can take an originally symmetric problem and produce a nonsymmetric preconditioned system, again calling for GMRES to step in. This shows us that the need for GMRES doesn't just come from the underlying physics, but often from the sophisticated mathematical tools we build to study it.
Perhaps the most profound applications of restarted GMRES are not when it works alone, but when it partners with other powerful algorithms, each compensating for the other's weaknesses.
One of the most elegant examples of this synergy is the pairing of GMRES with multigrid methods. Imagine trying to paint a detailed mural. A multigrid preconditioner is like a painter who is brilliant with both broad brushes and fine-tipped brushes. It can quickly lay down the overall structure of the image (the low-frequency error components) and fill in the tiny details (the high-frequency error components). However, it might struggle with a few awkward, medium-sized features. These stubborn error components are the "outliers" that slow the whole process down. This is where GMRES comes in. GMRES, as a Krylov method, is exceptionally good at hunting down and eliminating a small number of specific error modes.
When they work together, the result is astonishing. The multigrid preconditioner is applied at each step of a GMRES iteration. Multigrid eliminates the vast majority of the error, leaving a problem that is "almost solved." The spectrum of the preconditioned operator consists of a large cluster of eigenvalues near 1 (the "easy" part handled by multigrid) and just a few outlier eigenvalues corresponding to the stubborn modes. GMRES, acting as an "accelerator," then needs only a small Krylov subspace—and thus a small restart parameter m—to build a residual polynomial that annihilates these few remaining outliers. It's a perfect division of labor, creating some of the fastest known solvers for elliptic partial differential equations.
This theme of "divide and conquer" also appears in solvers for stiff differential equations, both ordinary and stochastic. In Implicit-Explicit (IMEX) schemes, a problem is split into a "stiff" part that is difficult but structurally simple, and a "non-stiff" part that is complex but less challenging. The scheme treats the stiff part implicitly (requiring a linear solve) and the non-stiff part explicitly. GMRES is often the ideal solver for the linear system arising from the stiff part, which may be, for instance, a simple diffusion operator for which excellent preconditioners exist. This strategy can be vastly more efficient than a fully implicit method that lumps the easy and hard parts together into one much more difficult linear system at every step.
As powerful as GMRES is, it is not a magic bullet. As scientists push to solve ever-larger problems on massive supercomputers, the algorithm itself faces new challenges and is constantly being adapted.
One of the biggest hurdles is communication. On a parallel computer with a million processor cores, the time it takes to send messages between cores can far exceed the time spent doing arithmetic. A key part of the GMRES algorithm, the Gram-Schmidt orthogonalization process, requires repeated global "dot product" calculations. Each dot product forces all million cores to compute a local sum and then participate in a global communication to add up the results. This global synchronization is a major performance bottleneck. The cost of these communications, which grows with the restart parameter m, limits the parallel scalability of GMRES and has spurred a whole field of research into "communication-avoiding" algorithms.
Another frontier is the adaptation of GMRES to modern hardware like Graphics Processing Units (GPUs). These devices offer incredible performance, particularly for low-precision arithmetic like half precision (FP16). A revolutionary idea is to design mixed-precision solvers. The strategy is to perform the heavy, memory-intensive work—the sparse matrix-vector product—using fast FP16 arithmetic. However, the numerically sensitive parts of the algorithm, like the vector updates and the orthogonalization, are kept in stable double precision (FP64). This is like using a power sander for the bulk of a woodworking project but switching to fine-grit sandpaper by hand for the finishing touches. Of course, using low precision for the matvec means the result is not perfectly accurate. To solve this, the entire mixed-precision GMRES solver can be wrapped in an "iterative refinement" loop. This outer loop computes the true residual in high precision and uses the fast inner solver to find an approximate correction. This two-level approach can achieve the full accuracy of a double-precision solver while harnessing most of the speed of low-precision hardware.
From the black-box modeling of complex physics to the engine room of nonlinear solvers, from tracking transient phenomena to synergistic partnerships with other algorithms, restarted GMRES has proven itself to be one of the most versatile and important ideas in computational science. It is not a static relic but a living algorithm, constantly being refined to tackle new challenges on the frontiers of computing.
Yet, for the scientist or engineer facing a new problem, a choice must be made. Which solver should I use? GMRES? BiCGSTAB? Something else? There is no universal answer. The true mark of an expert is not knowing a single "best" tool, but having a robust strategy for finding one that works. A pragmatic approach, born of experience, is to start with what is most robust. Begin with restarted GMRES, using the largest restart parameter your memory budget allows. If it converges, your job is done. If it stagnates, then it is time to try a different approach, perhaps a short-recurrence method like BiCGSTAB or IDR(s), which may succeed where GMRES(m) failed. This tiered strategy—a practitioner's compass—guides us through the complex landscape of numerical solvers, allowing us to focus on what truly matters: finding the solution and understanding the science it reveals.