The Havel-Hakimi Algorithm

SciencePedia

Key Takeaways

The Havel-Hakimi algorithm is a definitive, step-by-step test to determine if a simple network can be constructed from a given list of connection counts (a degree sequence).
Its core principle is recursive: it repeatedly connects the most demanding node to other high-degree nodes and solves the resulting smaller problem.
The algorithm's validity rests on the theorem that a sequence is graphic if and only if its reduced sequence is also graphic.
A graphic sequence can have multiple valid structures (realizations), which may possess different properties like connectedness, planarity, or the presence of a Hamiltonian cycle.

Introduction

How do you know if a blueprint for a network—be it a social network, a computer system, or a molecule—is even possible to build? This fundamental question lies at the heart of graph theory and network science. Given a simple list of desired connections for each node, known as a degree sequence, we need a reliable way to determine if a corresponding simple graph can be constructed. While basic rules like the Handshaking Lemma can quickly rule out some impossible designs, they fall short of providing a complete answer, leaving a critical knowledge gap for network architects and scientists. This article bridges that gap by providing a comprehensive exploration of a powerful and elegant solution. In the following chapters, you will delve into the recursive logic and formal steps of the Havel-Hakimi algorithm, and then explore its diverse applications, from engineering feasibility studies to the chemical world of molecular isomers. We begin by examining the core principles that make this systematic verification possible.

Principles and Mechanisms

Imagine you're at a party. A simple question comes to mind: how many hands were shaken in total? You might not know the exact number, but you know one thing for sure. If you go around and ask everyone how many hands they shook, and then add up all those numbers, the final sum must be even. Why? Because every handshake is a pact between two people. It adds one to Alice's count and one to Bob's count, contributing a total of two to the grand sum. You can't have an odd number of total handshakes any more than you can have a lone hand shaking itself in a crowd.

This simple, almost trivial observation is a cornerstone of graph theory, known as the Handshaking Lemma. In the language of networks, where people are "vertices" and handshakes are "edges," the number of connections a vertex has is its degree. The lemma states that the sum of the degrees of all vertices in a graph is always an even number—exactly twice the number of edges.

This gives us our first powerful, if blunt, tool for a fascinating problem. Suppose a network architect gives you a list of desired connection counts for a new network—say, a list of integers like $S = (d_1, d_2, \dots, d_n)$ . Can such a network actually be built? This list is called a degree sequence, and if a simple network (with no self-loops or multiple edges between the same two nodes) can be constructed with these exact degrees, the sequence is called graphic.

Our Handshaking Lemma is a quick first check. A sequence like $(4, 3, 3, 2, 1)$ can be dismissed immediately. The sum of degrees is $4+3+3+2+1 = 13$ , an odd number. No such network can exist; it's a physical impossibility. We also have a common-sense constraint: in a network of $n$ nodes, no single node can be connected to more than the $n-1$ other nodes. So a sequence like $(6, 5, 4, 3, 2, 2)$ for a 6-node network is also impossible, because one node is demanding 6 connections when only 5 other nodes are available.

But these simple tests aren't enough. A sequence like $(4, 4, 4, 1, 1)$ for a 5-node network passes both checks: the sum is 14 (even), and no degree exceeds $5-1=4$ . Yet, as it turns out, this sequence is not graphic. Something more subtle is at play. We need a deeper, more systematic way to know for sure.

The Architect's Dilemma: A Recursive Solution

Let's put ourselves in the shoes of the network architect. We have our sorted list of desired connections, $S = (d_1, d_2, \dots, d_n)$ , with the most ambitious node, $v_1$ , wanting $d_1$ connections. The core of the problem is interdependence; we can't satisfy $v_1$ 's needs without affecting the needs of its partners.

This is where a beautifully simple, recursive idea comes to the rescue. Let's focus on that most demanding node, $v_1$ . It needs to be connected to $d_1$ other nodes. So... let's just do it. Let's connect $v_1$ to $d_1$ of its peers. Once we've done that, its needs are met. We can, in a sense, remove $v_1$ from our list of worries.

What about the nodes it just connected to? Their own connection quotas have been partially filled. Each of the $d_1$ nodes that we connected to $v_1$ now needs one fewer connection. So, we can create a new, smaller problem: we now have $n-1$ nodes, and their required degrees have been updated. If we can solve this smaller problem, then we can simply add $v_1$ back in with its connections, and we will have solved our original problem!

This is the essence of recursion: breaking a large, complicated problem into a smaller, slightly simpler version of the very same problem. We can keep applying this logic, step by step, reducing the size of our network plan each time. The process stops when we reach a state that is trivially easy to verify. For example, if we are left with a list of all zeros, we know it's graphic—it represents a network with no connections left to make. The job is done. Conversely, if at any step we are asked to create an impossible situation, like a node needing a negative number of connections, we know our original plan must have been flawed from the start.

The Havel-Hakimi Algorithm: An Elegant Recipe

This recursive logic is formalized in a delightful procedure called the Havel-Hakimi algorithm. It provides a concrete recipe for our "just do it and see what's left" strategy. But it adds one crucial, brilliant ingredient. When we pick $d_1$ nodes for our most-connected vertex to shake hands with, which ones should we choose? Does it matter?

The amazing insight from Václav Havel and S. L. Hakimi is this: if the sequence is graphic at all, then it must be realizable by connecting the vertex with the highest degree, $d_1$ , to the vertices with the next $d_1$ highest degrees. We don't need to test every possible combination of partners; this single, greedy choice is sufficient.

Let's see this recipe in action. Consider the proposed design $S_0 = (4, 4, 3, 3, 2, 2)$ .

The sequence is sorted. The highest degree is $d_1=4$ . We "satisfy" this node by connecting it to the next four nodes.
We remove the first '4' and subtract 1 from the next four degrees: $(4-1, 3-1, 3-1, 2-1)$ , leaving the final '2' untouched.
Our new list of remaining degrees is $(3, 2, 2, 1, 2)$ . To continue, we must re-sort it into non-increasing order: $S_1 = (3, 2, 2, 2, 1)$ .
Now we have a new, smaller problem. We repeat the process with $S_1$ . The highest degree is 3. We remove it and subtract 1 from the next three degrees: $(2-1, 2-1, 2-1)$ , leaving the '1' untouched.
The new list is $(1, 1, 1, 1)$ , which we can call $S_2$ . This is already sorted.
We can continue, but we might recognize $S_2=(1, 1, 1, 1)$ as the degree sequence of a graph with two separate edges. It's clearly graphic. Since the reduced sequence $S_2$ is graphic, the algorithm guarantees that $S_1$ was graphic, and therefore our original sequence $S_0$ is graphic as well. If we did continue, we would eventually reach a sequence of all zeros. Success!

The algorithm can also fail spectacularly. Take the sequence $(5, 5, 4, 2, 1, 1)$ .

Remove the first '5' and subtract 1 from the next five degrees: $(5-1, 4-1, 2-1, 1-1, 1-1)$ .
This gives the new sequence $(4, 3, 1, 0, 0)$ .
Now, let's apply the process again. Remove the '4' and subtract 1 from the next four degrees: $(3-1, 1-1, 0-1, 0-1)$ .
The result is $(2, 0, -1, -1)$ . A negative number! This is a physical absurdity—you can't have a negative number of connections. The algorithm terminates and declares the original sequence is not graphic.

The Magic of "If and Only If": Why the Recipe Works

The Havel-Hakimi algorithm is more than just a clever heuristic; its power lies in the mathematical certainty of "if and only if." The Havel-Hakimi theorem states that a sequence $S$ is graphic if and only if the reduced sequence $S'$ is graphic (assuming $S'$ doesn't contain nonsensical values). This two-way logical street is what makes it a perfect test.

The "if" part is constructive: if you show me a graph for the smaller problem $S'$ , I can construct a graph for the original problem $S$ by adding back the first vertex and its connections.

The "only if" part is the gatekeeper: if the reduced sequence $S'$ is not graphic, then there was no hope for the original sequence $S$ to begin with. This is why a single failure, like a negative number appearing, is enough to doom the entire process. It tells us that our initial assumptions led to an inescapable contradiction.

To truly appreciate the elegance of connecting to the highest degree nodes, let's consider a tempting alternative. What if we tried to connect our most popular node to the $d_1$ least popular nodes? Let's call this the "Modified Havel-Hakimi" algorithm. Does it work?

Let's test it on the sequence $S=(3,3,1,1,1,1)$ . This sequence is genuinely graphic (imagine two central nodes, $u$ and $v$ , connected to each other, with $u$ also connected to two "leaf" nodes and $v$ connected to the other two leaves). Now, let's apply our modified algorithm.

The highest degree is 3. We connect this node to the three smallest available degrees, which are three of the 1s.
The remaining degrees are $(3, 1-1, 1-1, 1-1, 1)$ , which gives $(3, 1, 0, 0, 0)$ after sorting.
Now we apply the modified rule again. Highest degree is 3. We connect it to the three smallest: the three 0s.
The new remaining degrees are $(1, 0-1, 0-1, 0-1)$ , which is $(1, -1, -1, -1)$ . Failure!

Our modified algorithm incorrectly rejected a perfectly valid graphic sequence. This experiment reveals the subtlety of the original algorithm. By connecting the highest-degree nodes to each other, the Havel-Hakimi algorithm systematically reduces the graph's complexity in the most stable way possible. The modified version, by contrast, can create new, problematic structures that cause it to fail. The standard algorithm isn't just one possible recipe; it's a very special one that guarantees it will find a solution if one exists.

From Recipe to Reality: Mastering the Connections

Armed with this robust algorithm, we can tackle more nuanced questions about network design. Suppose an architect is designing a 7-node network with a proposed degree set of $\{6, 3, 3, 2, 2, 1, x\}$ , where one node's connectivity $x$ is flexible. What are the possible values of $x$ ? First, we apply our basic filters. The sum $17+x$ must be even, so $x$ must be odd. And $x$ must be between 0 and $n-1=6$ . So our candidates are $x \in \{1, 3, 5\}$ . How do we choose? We simply test each one with the Havel-Hakimi algorithm. For each case, we form the sequence, sort it, and run the recipe. As it turns out, all three values result in a graphic sequence, revealing a surprising degree of flexibility in the network's design. The sum of these values is $1+3+5=9$ .

We can even push the logic further. Instead of just asking if a sequence is graphic, we can ask if it can be realized with specific connections. Imagine a research institute with 8 researchers and a degree sequence of $(4, 4, 4, 4, 2, 2, 2, 2)$ . We know this is graphic. But can we build the network such that Researcher 1 (degree 4) is connected specifically to the team $\{5, 6, 7, 8\}$ ? The logic is a natural extension of the algorithm itself. We assume those connections are made. Researcher 1's requirement is met. The degrees of researchers 5, 6, 7, and 8 are each reduced by one. The new problem becomes: is the residual degree sequence for researchers 2 through 8, which is $(4, 4, 4, 1, 1, 1, 1)$ , graphic? We run this new sequence through the Havel-Hakimi algorithm, and we find that it is not. Therefore, that specific collaboration team is not viable. By testing other teams, like $\{2, 3, 7, 8\}$ , we can find which ones are viable, giving us a powerful tool for exploring the space of possible network structures.

Finally, what kind of network structure is the "hardest" for the Havel-Hakimi algorithm to unravel? That is, which sequence on $n$ vertices requires the maximum number of iterations? The algorithm reduces the number of vertices by one at each step. So for $n=8$ , the maximum number of steps is $7$ . To achieve this, the sequence must not terminate early. By working backward from the final $(0)$ sequence, we can construct the required precursor at each step. This reverse journey leads to a surprisingly simple and beautiful answer: the sequence that takes the longest to solve is $(n-1, n-1, \dots, n-1)$ . For $n=8$ , this is $(7,7,7,7,7,7,7,7)$ . This is the degree sequence of the complete graph, $K_8$ , where every researcher is connected to every other researcher. It is a moment of perfect symmetry: the most interconnected, densest possible network is also the one that our recursive algorithm must peel back, layer by single layer, for the longest possible time.

Applications and Interdisciplinary Connections

We have seen the beautiful, clockwork logic of the Havel-Hakimi algorithm. It provides a definitive answer to a seemingly simple question: given a list of desired connections for a set of nodes, can we actually build a simple network that satisfies it? This question, it turns out, is not just a mathematical curiosity. It is the starting point for a fascinating journey into the design, analysis, and understanding of networks all around us. The principles we've uncovered are not confined to the abstract world of vertices and edges; they extend into engineering, chemistry, and even the very structure of scientific problems themselves.

Let's begin with the most direct application: a feasibility study. Imagine you are designing a new social network simulation or a peer-to-peer communication system for a fleet of autonomous drones. You might start with a "wish list" of connections: this user should have five friends, that user should have three, and so on. Before you write a single line of code or launch a single drone, you must ask: is this configuration even possible? The Havel-Hakimi algorithm acts as your initial reality check. It prevents you from wasting resources trying to build an impossible structure, whether that structure is a virtual friendship graph or a physical drone communication network. It's the sober engineer sitting at the drawing board, checking if the architectural blueprint is structurally sound before the first brick is laid.

But science and engineering are rarely about just checking someone else's plan. They are about creating new ones. What if the blueprint isn't set in stone? Suppose you are a systems architect designing a server network where one primary server's connectivity, let's call it $k$ , is configurable. To minimize cost and complexity, you want to find the smallest possible value of $k$ that still results in a viable network. Here, the principles underlying graphic sequences become powerful design tools. You can systematically test values of $k$ , guided by fundamental rules like the handshaking lemma (the sum of degrees must be even), to find the most efficient and economical design that works. This shifts our perspective from that of an auditor to that of a creative designer, using the rules of graph theory as our guide. Similarly, if an initial network specification is found to be impossible, we can ask how to "fix" it with minimal change—a form of error correction or network debugging.

This is where the story gets truly interesting. The Havel-Hakimi algorithm doesn't just give a "yes" or "no" answer; it's constructive. It can actually build a graph for you. But here lies a crucial and subtle point: the graph it builds is only one of potentially many possible realizations for a given degree sequence. A single blueprint can be used to construct many different buildings.

This "one blueprint, many buildings" dilemma has profound consequences. For instance, you might run the algorithm on a valid degree sequence, and it might produce a graph that is disconnected—split into two or more separate networks. You might conclude that your network cannot be fully integrated. However, another valid construction for the very same degree sequence could result in a fully connected graph!. This is a critical lesson: the specific output of a constructive algorithm is not the whole story. The set of all possible realizations might contain graphs with vastly different properties.

This ambiguity extends to some of the most important features a network can have. Consider the famous "traveling salesman" problem, which asks if there's a tour that visits every node exactly once. In graph theory, this is known as a Hamiltonian cycle. A given degree sequence might have one realization that is Hamiltonian, allowing for a perfectly efficient tour of all nodes, and another non-Hamiltonian realization where no such tour is possible. The degree sequence alone is not enough to tell you. The same is true for a property called bipartiteness, which determines if the nodes can be split into two groups, $U$ and $V$ , where edges only connect a node in $U$ to a node in $V$ . This is fundamental to solving matching and assignment problems. Yet again, a single degree sequence can sometimes be realized as a bipartite graph and other times as a non-bipartite one. The simple list of degrees conceals a rich world of structural variety.

The connections of our topic now spread far and wide, bridging the gap between abstract mathematics and the physical world.

Perhaps the most tangible connection is to planarity. A degree sequence might be perfectly graphic, like $(5, 5, 5, 5, 5, 5)$ , which corresponds to six nodes all connected to each other (the complete graph $K_6$ ). You can certainly draw this on paper. But can you manufacture it on a circuit board without any wires crossing? The answer is no. There is a beautiful and deep result in graph theory, related to Euler's formula for polyhedra, that places a strict limit on the number of edges a planar graph on $n$ vertices can have: $E \le 3n - 6$ . Our $K_6$ graph has 15 edges, but the planar limit for 6 vertices is only $3(6)-6=12$ . Since $15 > 12$ , no physical realization of this network can exist on a flat plane without crossovers. A logically sound network blueprint may be physically impossible to build under certain constraints.

The connections to chemistry are even more profound. In chemical graph theory, molecules are modeled as graphs where atoms are vertices and chemical bonds are edges. The degree of a vertex corresponds to the valency of an atom (how many bonds it can form). A degree sequence, then, is like a list of atoms available to form a molecule. Different molecules that are composed of the same atoms but have different structures are called isomers. In our language, isomers are simply non-isomorphic realizations of the same graphic sequence! These different structures are not just academic curiosities; they have different physical and chemical properties, like boiling point or reactivity. Topological indices, such as the Zagreb index, are numerical values calculated from the graph structure that aim to predict these properties. It turns out that for a fixed degree sequence, the value of the Zagreb index can vary significantly between different realizations. The "one blueprint, many buildings" problem is the very reason isomers exist, and graph theory gives us the tools to explore and quantify their differences.

Finally, what happens when we relax the rules? A "simple graph" forbids loops (an edge from a vertex to itself) and multiple edges between the same two vertices. But the real world is full of such things: feedback loops in electronic circuits, multiple runway connections between two airport hubs. The core logic we've learned is robust enough to handle these generalizations. Consider a pseudograph, which allows loops and multiple edges. A loop at a vertex adds 2 to its degree. To see if a degree sequence is realizable as a pseudograph with a certain number of loops, we can simply subtract the degree contributions of the loops and then check if the remaining degrees can be realized by a loopless multigraph. This elegant extension shows the power of the underlying principle: it's all about accounting for where the "stubs" of each vertex's degree connect.

From a simple party puzzle, we have journeyed through network design, discovered the crucial subtlety of non-uniqueness, and forged connections to the physical constraints of circuit boards and the chemical diversity of molecules. The Havel-Hakimi algorithm and the theory of graphic sequences are far more than a mathematical tool; they are a lens through which we can appreciate the intricate and often surprising relationship between a simple list of parts and the complex wholes that can be built from them.