
How many ways can you group a set of distinct items? This simple question, whether it involves arranging people at party tables or sorting data on servers, leads to a rich and beautiful area of mathematics governed by Stirling numbers. Named after the 18th-century mathematician James Stirling, these numbers provide the fundamental language for counting partitions and permutations. While they arise from straightforward combinatorial problems, their influence extends far beyond, weaving together seemingly disparate fields and revealing the deep, underlying unity of mathematical thought. This article delves into the world of these remarkable numbers.
In the first chapter, "Principles and Mechanisms," we will explore the two distinct faces of Stirling numbers—those of the first and second kind. We will uncover their elegant growth laws through recurrence relations, understand their pivotal algebraic role as connection coefficients between polynomial bases, and witness the power of generating functions in capturing their properties.
Following this, the chapter on "Applications and Interdisciplinary Connections" will take us on a journey to see these numbers in action. We will discover their crucial role in probability and statistics, their surprising appearances in analysis and number theory, and their utility in solving problems in graph theory and even the abstract realm of mathematical logic.
Imagine you are at a party. The host, a rather eccentric mathematician, poses two different challenges. First, take a group of people and divide them into smaller, non-empty conversational circles. Second, take the same group of people and seat them at a set of identical round tables. The numbers of ways you can accomplish these tasks, seemingly simple party games, are counted by two related families of numbers known as the Stirling numbers. These numbers, named after the 18th-century mathematician James Stirling, form a fundamental bridge connecting different areas of mathematics, from combinatorics and algebra to calculus and even number theory. Let's peel back the layers and discover the beautiful machinery that governs them.
At the heart of the matter are two distinct ways of thinking about "grouping" things.
First, consider the task of partitioning a set of $n$ distinct objects into exactly $k$ non-empty, indistinguishable groups. Think of having $n$ different books and wanting to put them into $k$ identical boxes, ensuring no box is empty. The number of ways to do this is given by the Stirling numbers of the second kind, denoted as $S(n,k)$ or $\left\{n \atop k\right\}$.
For example, how many ways can we partition a set of 4 objects, let's call them $\{a, b, c, d\}$, into 2 non-empty subsets? Listing them out—$\{a\}\{b,c,d\}$, $\{b\}\{a,c,d\}$, $\{c\}\{a,b,d\}$, $\{d\}\{a,b,c\}$, $\{a,b\}\{c,d\}$, $\{a,c\}\{b,d\}$, $\{a,d\}\{b,c\}$—we find $S(4,2) = 7$.
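This small count is easy to verify by machine. A brute-force sketch (the recursive helper `set_partitions` is ours, written for illustration, not a library routine):

```python
def set_partitions(items, k):
    """Generate all partitions of `items` into exactly k non-empty blocks."""
    if k == 1:
        yield [list(items)]
        return
    if len(items) == k:
        yield [[x] for x in items]
        return
    first, rest = items[0], items[1:]
    # Case 1: `first` joins one of the k blocks of a partition of the rest
    for part in set_partitions(rest, k):
        for i in range(k):
            yield [blk + [first] if i == j else list(blk)
                   for j, blk in enumerate(part)]
    # Case 2: `first` forms its own block next to k-1 blocks of the rest
    for part in set_partitions(rest, k - 1):
        yield [[first]] + [list(blk) for blk in part]

parts = list(set_partitions(list("abcd"), 2))
print(len(parts))  # 7
```

The two recursive cases foreshadow the recurrence relation we derive below.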
Now for the second challenge: arranging $n$ people into $k$ non-empty, indistinguishable circular table arrangements. The key difference is that at a round table, the arrangement is cyclic—only who sits next to whom matters, not the specific chair. The number of ways to do this is given by the unsigned Stirling numbers of the first kind, denoted $c(n,k)$ or $\left[n \atop k\right]$. These numbers count permutations of $n$ elements that are composed of exactly $k$ disjoint cycles.
This interpretation might seem abstract, but it has a surprisingly intuitive cousin: counting left-to-right maxima in a permutation. A left-to-right maximum is an element in a sequence that is greater than all preceding elements. For the permutation 31524, the left-to-right maxima are 3 and 5. Amazingly, the number of permutations of $n$ elements with exactly $k$ left-to-right maxima is also $c(n,k)$. This unexpected link between cyclic arrangements and ordered sequences is a first hint at the deep unifying power of these numbers.
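The claim that left-to-right maxima and cycle counts share the same distribution can be checked by machine for small cases. A sketch that tallies both statistics over all permutations (in one-line notation):

```python
from itertools import permutations
from collections import Counter

def left_to_right_maxima(perm):
    """Count elements greater than everything before them."""
    count, best = 0, 0
    for x in perm:
        if x > best:
            count, best = count + 1, x
    return count

def cycle_count(perm):
    """Number of cycles of a permutation of {1..n} in one-line notation."""
    n = len(perm)
    seen, cycles = [False] * n, 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:
                seen[j] = True
                j = perm[j] - 1  # follow the permutation
    return cycles

n = 5
maxima = Counter(left_to_right_maxima(p) for p in permutations(range(1, n + 1)))
cycles = Counter(cycle_count(p) for p in permutations(range(1, n + 1)))
print(maxima == cycles)  # True: both distributions are c(n, k)
```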
How do these numbers grow as we increase $n$ and $k$? It turns out they follow simple, elegant rules. Let's think about adding a new person, the $n$-th person, to our arrangements.
For Stirling numbers of the second kind, $S(n,k)$, our new person has two choices: they can form a brand-new group all by themselves, which requires the other $n-1$ people to form the remaining $k-1$ groups (in $S(n-1,k-1)$ ways), or they can join any one of the $k$ existing groups formed by the others (in $k \cdot S(n-1,k)$ ways).
Adding these two cases gives the recurrence relation for the Stirling numbers of the second kind: $S(n,k) = S(n-1,k-1) + k\,S(n-1,k)$. This simple law allows us to build a whole table of these numbers starting from the basic facts that $S(n,n) = 1$ (everyone in their own group) and $S(n,1) = 1$ (everyone in one big group).
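A minimal sketch of building the table from the recurrence and its base cases:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling numbers of the second kind via S(n,k) = S(n-1,k-1) + k*S(n-1,k)."""
    if n == k:
        return 1          # everyone in their own group (covers S(0,0) = 1)
    if k == 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

# Row n = 4 of the table
print([S(4, k) for k in range(1, 5)])  # [1, 7, 6, 1]
```

Note $S(4,2) = 7$, matching the hand count of the partitions of a 4-element set into 2 subsets.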
For the Stirling numbers of the first kind, $c(n,k)$, a similar logic applies. Our new $n$-th person again has two choices: they can start a new table alone, leaving the other $n-1$ people to fill $k-1$ tables (in $c(n-1,k-1)$ ways), or they can squeeze in at an existing table, taking the seat immediately to the left of any one of the $n-1$ people already seated (in $(n-1) \cdot c(n-1,k)$ ways).
This gives the recurrence for the unsigned Stirling numbers of the first kind: $c(n,k) = c(n-1,k-1) + (n-1)\,c(n-1,k)$.
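The same table-building sketch works for the first kind, with only the factor changed:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def c(n, k):
    """Unsigned Stirling numbers of the first kind:
    c(n,k) = c(n-1,k-1) + (n-1)*c(n-1,k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

# Row n = 4: the 4! = 24 permutations, split by cycle count
print([c(4, k) for k in range(1, 5)])  # [6, 11, 6, 1]
```

As a sanity check, the row sums to $4! = 24$, since every permutation of 4 elements has some number of cycles.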
Notice the stunning similarity! The structure of the rules is identical; only a single factor differs—$k$ versus $n-1$. This is mathematics whispering to us that these two types of numbers, born from different problems, are deeply related, like two sides of the same coin.
The story takes an algebraic turn. Polynomials can be written in different "bases." The most common is the basis of powers: $1, x, x^2, x^3, \ldots$. Another very useful basis, especially in the calculus of finite differences, is the basis of falling factorials: $(x)_0 = 1$, $(x)_1 = x$, $(x)_2 = x(x-1)$, $\ldots$, where $(x)_n = x(x-1)(x-2)\cdots(x-n+1)$.
Stirling numbers emerge as the universal conversion factors, or connection coefficients, between these two fundamental bases.
The signed Stirling numbers of the first kind, $s(n,k)$, are the coefficients that convert falling factorials into standard powers: $(x)_n = \sum_{k=0}^{n} s(n,k)\,x^k$. For example, $(x)_3 = x(x-1)(x-2) = x^3 - 3x^2 + 2x$. The coefficients are $s(3,3) = 1$, $s(3,2) = -3$, and $s(3,1) = 2$.
Conversely, the Stirling numbers of the second kind, $S(n,k)$, perform the reverse transformation, converting standard powers into falling factorials: $x^n = \sum_{k=0}^{n} S(n,k)\,(x)_k$. For instance, we can check that $x^3 = (x)_3 + 3(x)_2 + (x)_1$. The coefficients are $S(3,3) = 1$, $S(3,2) = 3$, and $S(3,1) = 1$. The general relationship $\sum_{j} S(n,j)\,s(j,k) = \delta_{nk}$—the two families are inverse to each other—provides a powerful identity that can be used to evaluate complex sums.
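Because the identity $x^n = \sum_k S(n,k)(x)_k$ is a polynomial identity, it must hold for every value of $x$, which makes it easy to spot-check numerically. A sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling numbers of the second kind."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

def falling(x, k):
    """Falling factorial x(x-1)...(x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

# x^n == sum_k S(n,k) * (x)_k should hold identically
for x in range(-3, 7):
    for n in range(6):
        assert x ** n == sum(S(n, k) * falling(x, k) for k in range(n + 1))
print("power-to-falling-factorial identity verified")
```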
This dual role is profound. The Stirling numbers are not just for counting party arrangements; they are the gears that connect two different languages of algebra. This inverse relationship is the algebraic reflection of the combinatorial duality we saw earlier.
How can we study an entire infinite family of numbers all at once? The physicist's and mathematician's answer is a powerful tool: the generating function. The idea is to "package" an infinite sequence of numbers $a_0, a_1, a_2, \ldots$ into a single function, like $f(x) = \sum_{n \ge 0} a_n x^n$. The function now holds all the information about the sequence. Manipulating the function is equivalent to performing operations on the entire sequence.
For Stirling numbers, this approach yields breathtakingly elegant results. The bivariate exponential generating function for the Stirling numbers of the second kind packages all values into one compact expression: $\sum_{n,k \ge 0} S(n,k)\,y^k \frac{x^n}{n!} = e^{y(e^x - 1)}.$
This beautiful formula is a veritable factory for combinatorial results. By expanding it, you can extract any $S(n,k)$ you wish. For a fixed $k$, the generating function is $\sum_{n \ge k} S(n,k)\,\frac{x^n}{n!} = \frac{(e^x - 1)^k}{k!}$.
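We can confirm the fixed-$k$ formula by expanding $(e^x-1)^k/k!$ as a truncated power series in exact rational arithmetic and reading off $S(n,k) = n!\cdot[x^n]$. A sketch:

```python
from fractions import Fraction
from math import factorial

N = 10  # truncate all series at x^N

def series_mul(a, b):
    """Multiply two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

# e^x - 1 truncated: coefficients 0, 1/1!, 1/2!, ...
em1 = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, N + 1)]

k = 3
pw = [Fraction(1)] + [Fraction(0)] * N  # the constant series "1"
for _ in range(k):
    pw = series_mul(pw, em1)
egf = [coef / factorial(k) for coef in pw]  # (e^x - 1)^k / k!

# Coefficient of x^n is S(n,k)/n!; recover S(n,3) for n = 3..8
stirling_row = [int(egf[n] * factorial(n)) for n in range(3, 9)]
print(stirling_row)  # [1, 6, 25, 90, 301, 966]
```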
Now, what about the Stirling numbers of the first kind? Given their inverse relationship to the second kind, we might expect a related generating function. And indeed, the generating function for the signed numbers is: $\sum_{n \ge k} s(n,k)\,\frac{x^n}{n!} = \frac{(\ln(1+x))^k}{k!}.$
Look at this! One involves the function $e^x - 1$, and the other involves its compositional inverse, $\ln(1+x)$. This is no coincidence. It is the generating function's way of telling us about the deep inverse relationship between the two kinds of Stirling numbers.
These compact functions are not just for show. They are powerful computational engines. For example, using the generating function for $c(n,k)$, one can prove a remarkable result: the total number of left-to-right maxima, summed over all permutations of $n$ elements, is exactly $n!\,H_n$, where $H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$ is the $n$-th harmonic number. Trying to prove this by brute force would be a nightmare; with generating functions, it becomes an elegant derivation.
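For small $n$, the harmonic-number identity can at least be confirmed by exhaustive enumeration (a check, not the generating-function proof):

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

def lr_maxima(perm):
    """Count left-to-right maxima of a permutation of {1..n}."""
    best = count = 0
    for x in perm:
        if x > best:
            best, count = x, count + 1
    return count

for n in range(1, 7):
    total = sum(lr_maxima(p) for p in permutations(range(1, n + 1)))
    H_n = sum(Fraction(1, i) for i in range(1, n + 1))
    assert total == factorial(n) * H_n
print("total left-to-right maxima over all permutations = n! * H_n, for n <= 6")
```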
What happens when $n$ is enormous? Say, the number of atoms in the universe. We can't calculate $S(n,k)$ exactly, but we can find its approximate behavior. This is where the methods of statistical physics become invaluable. The Stirling numbers can be expressed as complex integrals, and for large $n$, these integrals can be approximated using the saddle-point method. The idea is that the value of the integral is overwhelmingly dominated by the contribution from a single point—the saddle point—where the integrand is maximal.
Applying this to $S(n,k)$, we find that its logarithm, $\ln S(n,k)$, which is related to the entropy of the partition, behaves in a predictable way that depends on the ratio $k/n$. For the Stirling numbers of the first kind, the analysis reveals something different. The number of cycles in a random permutation of $n$ elements tends to be close to $\ln n$. The saddle-point method gives us a precise formula for $c(n,k)$ when $k$ is near this average value, showing that the distribution looks like a bell curve: $\frac{c(n,k)}{n!} \approx \frac{1}{\sqrt{2\pi \ln n}}\,\exp\!\left(-\frac{(k - \ln n)^2}{2 \ln n}\right).$ This is a manifestation of the central limit theorem in the world of permutations.
The story has one final, astonishing twist. These numbers, which arise from counting and algebra, are also deeply sensitive to the properties of prime numbers. Consider the Bell number $B_p$, the total number of ways to partition a set with a prime number $p$ of elements. If we look at the remainder of $B_p$ when divided by $p$, we find a stunning regularity, a result known as Touchard's Congruence. For any prime $p$: $B_p \equiv 2 \pmod{p}$. This means that if you have 7 objects, the total number of ways to partition them, $B_7 = 877$, leaves a remainder of 2 when divided by 7 (since $877 = 125 \times 7 + 2$). The same holds for $B_3 = 5$ (since $5 = 1 \times 3 + 2$), $B_5 = 52$ (since $52 = 10 \times 5 + 2$), and every other prime.
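A quick computational check of the congruence, using the Bell triangle to generate Bell numbers:

```python
def bell_numbers(limit):
    """Bell numbers B_0 .. B_limit via the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(limit):
        new_row = [row[-1]]          # each row starts with the end of the last
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

B = bell_numbers(13)
print(B[7])  # 877
for p in (3, 5, 7, 11, 13):
    assert B[p] % p == 2  # Touchard: B_p leaves remainder 2 mod p
print("Touchard's congruence holds for p in (3, 5, 7, 11, 13)")
```

(For $p = 2$ the statement is still true, but trivially: $2 \equiv 2 \pmod 2$ just says both sides are even.)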
This is a piece of mathematical magic. The proof relies on another gem of number theory, Fermat's Little Theorem, and reveals a hidden arithmetic structure within the Stirling numbers. It shows that the world of combinatorics is not isolated; it is interwoven with the deepest properties of numbers, demonstrating the profound and often surprising unity of mathematics. From simple party games, we have journeyed through algebra, calculus, and physics, only to find the unmistakable fingerprints of the prime numbers.
We have spent some time getting to know our new friends, the Stirling numbers of both kinds. We have seen how they are born from simple questions about arranging things—permutations into cycles, and sets into subsets. It's a fine game of counting, and one could be perfectly happy leaving it there. But to do so would be to miss the real magic. The true wonder of these numbers is not in what they are, but in what they do. They are not museum pieces to be admired behind glass; they are tireless workers, appearing in the most unexpected corners of science, weaving together threads from seemingly distant fields. They are part of the deep grammar of the mathematical world.
Let’s go on a journey and see where these numbers show up. We will see that the simple act of partitioning a set is a fundamental pattern that nature, in its astonishing variety, uses over and over again.
Perhaps the most natural place to find Stirling numbers at work is in the realm of probability. So many questions about chance boil down to counting possibilities, and as we’ve seen, that is the Stirling numbers' home turf.
Imagine you are running a large-scale experiment, like throwing darts at a wall with $m$ numbered targets. If you throw $n$ darts and each lands on a uniformly random target, you might ask: what is the probability that exactly $k$ of the targets are hit at least once? This is a classic problem, a cousin of the famous "coupon collector's problem". The total number of ways the $n$ distinct darts can land on the $m$ distinct targets is simply $m^n$, with each outcome equally likely. To find our desired probability, we need to count the "favorable" outcomes.
First, which $k$ targets are we hitting? There are $\binom{m}{k}$ ways to choose them. Now, for that chosen set of $k$ targets, we must distribute our $n$ darts among them such that every single one is hit. This is precisely the problem of counting surjective (or "onto") functions from a set of $n$ darts to a set of $k$ targets. And we know the answer to that! It's $k!\,S(n,k)$, where $S(n,k)$ is the Stirling number of the second kind. Putting it all together, the probability is a beautiful, compact formula involving our familiar number: $P(\text{exactly } k \text{ targets hit}) = \frac{\binom{m}{k}\,k!\,S(n,k)}{m^n}$. The simple combinatorial idea of partitioning a set becomes the key to unlocking a probabilistic puzzle.
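Here is a brute-force verification of that probability formula for a small case ($n = 5$ darts, $m = 4$ targets, parameter values chosen only for illustration):

```python
from itertools import product
from functools import lru_cache
from math import comb, factorial
from fractions import Fraction

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling numbers of the second kind."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

n, m = 5, 4  # 5 darts thrown at 4 targets; all m^n outcomes equally likely
hits = [0] * (m + 1)
for outcome in product(range(m), repeat=n):
    hits[len(set(outcome))] += 1  # classify by number of distinct targets hit

for k in range(1, m + 1):
    formula = Fraction(comb(m, k) * factorial(k) * S(n, k), m ** n)
    assert Fraction(hits[k], m ** n) == formula
print("P(exactly k targets hit) = C(m,k) * k! * S(n,k) / m^n, verified")
```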
This same logic applies to more worldly problems. Consider a distributed data system with $m$ servers holding $n$ data packets. The statistical entropy of a state where exactly $k$ servers are empty depends directly on the number of ways to partition the $n$ packets among the $m - k$ non-empty servers. Boltzmann's famous principle, $S = k_B \ln W$, connects the physical quantity of entropy ($S$) to the number of microscopic arrangements ($W$). And how do we count $W$? With Stirling numbers, of course! They provide the crucial combinatorial factor that tells us how many ways the system can realize that particular state. From abstract counting to a tangible physical property, the path is paved by partitions.
But the role of Stirling numbers in statistics is more profound than just counting outcomes. They act as essential "translators" or "connection coefficients". In statistics, we are often interested in the moments of a probability distribution, like the mean ($E[X]$), variance, and higher-order terms like $E[X^n]$. Sometimes, calculating these "power moments" directly is a headache. However, it might be much easier to calculate what are called factorial moments, $E[(X)_k] = E[X(X-1)\cdots(X-k+1)]$. This is especially true for distributions like the binomial and Poisson distributions, which arise from counting successes.
So we have these two different "languages" for describing a distribution: the language of power moments ($E[X^n]$) and the language of factorial moments ($E[(X)_k]$). How do we translate between them? The Stirling numbers of the second kind are the answer! They provide the exact conversion formula: $X^n = \sum_{k=0}^{n} S(n,k)\,(X)_k$. By taking the expectation of both sides, $E[X^n] = \sum_{k=0}^{n} S(n,k)\,E[(X)_k]$, we can use the easily-computed factorial moments to find the power moments we were after. For instance, calculating the fourth moment of a binomial random variable becomes a straightforward application of this formula, using a few known values of $S(4,k)$. It's as if the Stirling numbers form a bridge, allowing us to walk from an easy calculation to a hard one.
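A sketch of the translation in action for $E[X^4]$ with $X \sim \mathrm{Binomial}(6, 1/3)$ (the parameter values are ours, chosen for illustration; for the binomial, the $k$-th factorial moment is $(n)_k\,p^k$):

```python
from functools import lru_cache
from fractions import Fraction
from math import comb

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling numbers of the second kind."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

def falling(x, k):
    """Falling factorial x(x-1)...(x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

n, p = 6, Fraction(1, 3)  # X ~ Binomial(6, 1/3)

def factorial_moment(k):
    # Easy for the binomial: E[(X)_k] = (n)_k * p^k
    return falling(n, k) * p ** k

# Translate factorial moments to the fourth power moment via Stirling numbers
fourth = sum(S(4, k) * factorial_moment(k) for k in range(5))

# Cross-check against a direct computation of E[X^4]
direct = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) * x ** 4
             for x in range(n + 1))
assert fourth == direct
print("E[X^4] =", fourth)  # 508/9
```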
This interplay can be taken a step further. What if we turn the combinatorial function itself into a random variable? Imagine a process that produces a random number of items, say $N$, following a Poisson distribution. We could then ask, what is the expected number of ways to partition these items into $k$ groups? We are asking for the value of $E[S(N,k)]$. This might seem like a strange and contrived question, but it shows up in the study of random structures. The calculation is a bit of a marvel: it combines the probability mass function of the Poisson distribution with the exponential generating function for the Stirling numbers, and out pops a beautifully simple closed-form answer. It is a testament to the deep and often surprising connections that generating functions reveal.
One of the joys of physics, and mathematics, is seeing a single, simple idea appear in wildly different contexts. It gives us a sense that there is an underlying unity to it all. Stirling numbers are masters of this kind of unexpected appearance.
Let's stay with generating functions for a moment. They are like a clothesline on which we hang a sequence of numbers, and by manipulating the clothesline (the function), we learn about the numbers. The generating function for the Stirling numbers of the second kind, , is a powerful tool. But what happens if we start to play with it? For example, what if we sum these generating functions over , but with a peculiar set of weights, like ? It seems like an arbitrary thing to do. Yet, if you carry out the sum, an astonishing thing happens. The infinite series of combinatorial terms collapses into an incredibly important and simple function: . This is none other than the exponential generating function for the Bernoulli numbers—numbers that are absolutely fundamental in number theory and analysis, appearing in everything from the Riemann zeta function to the Euler-Maclaurin formula. Why on earth should a weighted sum over set partitions be related to the Bernoulli numbers? The calculation shows it's true, but the "why" hints at a profound connection between the discrete world of combinatorics and the continuous world of analysis.
The Stirling numbers of the first kind are not to be outdone. They too have their generating function, related to powers of $\ln(1+x)$. And just as before, this allows them to sneak into problems in analysis. Suppose you wanted to find the Maclaurin series for a function like $\cos(\ln(1+x))$. This seems like a pure calculus problem. But when you expand the cosine series and substitute the series for the logarithm, you find that the coefficients of the resulting power series are given by a neat sum involving the unsigned Stirling numbers of the first kind $c(n,k)$. Once again, a question from one field (complex analysis) finds its answer in another (combinatorics).
These numbers also build bridges within discrete mathematics itself. Consider the problem of coloring a graph. The chromatic polynomial, $P(G, x)$, tells you how many ways you can properly color the vertices of a graph $G$ using $x$ colors. For some graphs, this polynomial has a special structure. Take a complete multipartite graph, which is formed by partitioning the vertices into several sets and connecting every vertex to every other vertex that is not in its own set. To color such a graph, all vertices in a given partition set can share colors, but any two vertices in different sets must have different colors. This forces us to partition the available colors into groups, one for each vertex partition. The problem of counting colorings becomes a problem of counting partitions of vertices and partitions of colors, and the coefficients of the chromatic polynomial in the falling factorial basis turn out to be expressed in terms of Stirling numbers of the second kind.
Perhaps the most abstract, and thus most surprising, connection is found in the depths of mathematical logic. Logicians study formal theories, like the theory of a dense linear order without endpoints (think of the rational numbers with their usual $<$ relation). They ask questions like: how many fundamentally different ways can you describe the relationship between variables $x_1, \ldots, x_n$? Each such complete description is called an "$n$-type". In this theory of dense orders, an $n$-type is determined by specifying, for every pair of variables $x_i$ and $x_j$, whether $x_i < x_j$, $x_i = x_j$, or $x_i > x_j$. This amounts to first partitioning the set of variables into equivalence classes (where variables in a class are equal) and then placing these classes into a strict linear order.
How many ways can we do this? For a fixed number of equivalence classes, say $k$, there are $S(n,k)$ ways to partition the $n$ variables and $k!$ ways to order the classes. Summing over all possible values of $k$, the total number of $n$-types is $\sum_{k=1}^{n} k!\,S(n,k)$. These are the "ordered Bell numbers". Look at this! A deep question from the foundations of logic—about what can be said in a formal language—is answered by a simple combinatorial count. This is the same count we would use to determine how many ways we can assign $n$ distinct projects to a variable number of newly created, distinct labs, where each lab gets at least one project. The abstract structure is identical.
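A minimal sketch computing these ordered Bell numbers (also known as Fubini numbers) from the formula:

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def S(n, k):
    """Stirling numbers of the second kind."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

def ordered_bell(n):
    """Number of n-types: partition the n variables (S(n,k)), order the classes (k!)."""
    return sum(factorial(k) * S(n, k) for k in range(1, n + 1))

print([ordered_bell(n) for n in range(1, 6)])  # [1, 3, 13, 75, 541]
```

For one variable there is a single type; for two variables there are three ($x_1 < x_2$, $x_1 = x_2$, $x_1 > x_2$), matching the formula.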
From probability to physics, from number theory to formal logic, the Stirling numbers appear, not as a mere curiosity, but as a fundamental part of the toolkit. They are a beautiful example of how a simple, intuitive concept—the act of partitioning—can ripple through the vast and interconnected landscape of scientific thought, revealing its inherent beauty and unity.