try ai
Popular Science
Edit
Share
Feedback
  • Fermat's theorem on sums of two squares

Fermat's theorem on sums of two squares

SciencePediaSciencePedia
Key Takeaways
  • A positive integer can be written as a sum of two squares if and only if every prime factor of the form 4k+34k+34k+3 in its prime factorization appears with an even exponent.
  • The problem is best understood by reframing it in the system of Gaussian integers (Z[i]\mathbb{Z}[i]Z[i]), where the sum of two squares, a2+b2a^2+b^2a2+b2, is the norm of the number a+bia+bia+bi.
  • A prime number ppp from the regular integers is a sum of two squares if and only if it is no longer prime (it "splits") in the ring of Gaussian integers, a condition met if and only if p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4).
  • This theorem has profound applications beyond pure mathematics, influencing fields like geometry (distances in crystal lattices), physics (frequencies of a vibrating torus), and cryptography (security of protocols).

Introduction

The world of numbers holds elegant patterns and deep secrets, often hidden in plain sight. One of the most classic of these is the question: which whole numbers can be expressed as the sum of two perfect squares? For example, 5 can be written as 12+221^2 + 2^212+22, but 7 cannot. Why is this so? This simple question, first answered by Pierre de Fermat, opens a door to a richer understanding of number theory, revealing a profound connection between arithmetic, algebra, and geometry. The gap in knowledge wasn't just what the rule was, but why it worked, a mystery that took over a century to unravel.

This article will guide you through this beautiful piece of mathematics. In the first chapter, "Principles and Mechanisms," we will explore the theorem's core logic, moving from simple modular arithmetic to the powerful framework of Gaussian integers, where the problem finds its natural solution. In the second chapter, "Applications and Interdisciplinary Connections," we will discover how this seemingly abstract theorem has far-reaching consequences in fields as diverse as crystal geometry, abstract algebra, and even modern cryptography. Prepare to see how a simple question about sums of squares illuminates a vast and interconnected mathematical landscape.

Principles and Mechanisms

Have you ever played with numbers? Not in the sense of balancing a checkbook or calculating a trajectory, but playing with them just to see what they do. It’s a game of patterns, of finding strange and beautiful rules in the abstract world of integers. One of the oldest and most beautiful of these games is figuring out which numbers can be written as the ​​sum of two squares​​. For instance, 5=12+225 = 1^2 + 2^25=12+22, and 25=32+4225 = 3^2 + 4^225=32+42 (or 52+025^2 + 0^252+02), but you will never find two integers whose squares add up to 3, 6, or 7. Why? What's the secret?

The answer is a journey that will take us from simple arithmetic to a new and wonderful landscape of numbers, revealing a deep connection between factoring, geometry, and the very definition of what a "prime" number is.

A Curious Pattern: The Rule of Four

Let's begin with a simple observation. Pick any integer, say kkk. What happens when you square it and divide by 4?

If kkk is an even number, we can write it as k=2mk=2mk=2m. Then its square is k2=(2m)2=4m2k^2 = (2m)^2 = 4m^2k2=(2m)2=4m2. When you divide this by 4, the remainder is 0.

If kkk is an odd number, we can write it as k=2m+1k=2m+1k=2m+1. Its square is k2=(2m+1)2=4m2+4m+1=4(m2+m)+1k^2 = (2m+1)^2 = 4m^2 + 4m + 1 = 4(m^2+m) + 1k2=(2m+1)2=4m2+4m+1=4(m2+m)+1. When you divide this by 4, the remainder is always 1.

So, any perfect square, when divided by 4, gives a remainder of either 0 or 1. That's it. No other possibility. Now what about a sum of two squares, n=a2+b2n=a^2+b^2n=a2+b2? We can figure out its possible remainders modulo 4 by simply adding the remainders of a2a^2a2 and b2b^2b2:

  • 0+0=00 + 0 = 00+0=0
  • 0+1=10 + 1 = 10+1=1
  • 1+0=11 + 0 = 11+0=1
  • 1+1=21 + 1 = 21+1=2

Notice what's missing? The remainder 3! This simple exercise in ​​modular arithmetic​​ gives us a powerful, ironclad rule: ​​an integer that leaves a remainder of 3 when divided by 4 can never be written as the sum of two squares.​​

Try it with a number like 199. Since 199=4×49+3199 = 4 \times 49 + 3199=4×49+3, we know instantly, without checking a single pair of squares, that it's impossible. We have discovered a necessary condition. But is it sufficient? If a number doesn't leave a remainder of 3, is it always a sum of two squares? The answer is more subtle, and to find it, we must first look at the building blocks of numbers: the primes.

A New World: The Gaussian Integers

The question of which primes can be written as a sum of two squares was answered by Pierre de Fermat around 1640. He claimed that an odd prime ppp can be written as a sum of two squares if and only if it leaves a remainder of 1 when divided by 4. (The prime 2 is a special case: 2=12+122 = 1^2+1^22=12+12.) For example, 5≡1(mod4)5 \equiv 1 \pmod 45≡1(mod4) and 5=12+225=1^2+2^25=12+22. Then 13≡1(mod4)13 \equiv 1 \pmod 413≡1(mod4) and 13=22+3213=2^2+3^213=22+32. But 7≡3(mod4)7 \equiv 3 \pmod 47≡3(mod4), and good luck trying to write it as a sum of two squares!.

For over a century, this was just a mysterious fact. The "why" remained elusive until another great mathematician, Carl Friedrich Gauss, had a brilliant insight. He realized that to understand sums of two squares, we have to expand our concept of "integer".

Imagine the familiar number line, populated by integers …,−2,−1,0,1,2,…\dots, -2, -1, 0, 1, 2, \dots…,−2,−1,0,1,2,…. Now, let's add a new dimension. Let's create a grid on the complex plane, where the points have integer coordinates. These are numbers of the form a+bia+bia+bi, where aaa and bbb are regular integers and iii is the imaginary unit, −1\sqrt{-1}−1​. Gauss studied this system so extensively that we now call these numbers the ​​Gaussian integers​​, denoted Z[i]\mathbb{Z}[i]Z[i].

Just like regular integers, you can add, subtract, and multiply Gaussian integers. For example: (2+3i)×(1−i)=2−2i+3i−3i2=2+i−3(−1)=5+i(2+3i) \times (1-i) = 2 - 2i + 3i - 3i^2 = 2 + i - 3(-1) = 5+i(2+3i)×(1−i)=2−2i+3i−3i2=2+i−3(−1)=5+i The true magic happens when we consider the "size" of a Gaussian integer z=a+biz=a+biz=a+bi. In the complex plane, its distance from the origin is a2+b2\sqrt{a^2+b^2}a2+b2​. Gauss worked with a related concept he called the ​​norm​​, which is the square of this distance: N(a+bi)=a2+b2N(a+bi) = a^2+b^2N(a+bi)=a2+b2 Look familiar? The sum of two squares is simply the norm of a Gaussian integer! Our entire problem can be rephrased: Which regular integers nnn are the norm of some Gaussian integer?

This reframing is incredibly powerful because the norm has a beautiful property: it's multiplicative. For any two Gaussian integers z1z_1z1​ and z2z_2z2​, the norm of their product is the product of their norms: N(z1z2)=N(z1)N(z2)N(z_1 z_2) = N(z_1)N(z_2)N(z1​z2​)=N(z1​)N(z2​). This isn't just a convenient trick; it's the famous Brahmagupta-Fibonacci identity in disguise, and it arises naturally from the structure of these new numbers.

When Primes Are No Longer Prime

Now we can state the central idea. An integer ppp is a sum of two squares, p=a2+b2p = a^2+b^2p=a2+b2, if and only if we can write it in terms of the norm: p=N(a+bi)p = N(a+bi)p=N(a+bi). But we can also write the norm as N(a+bi)=(a+bi)(a−bi)N(a+bi) = (a+bi)(a-bi)N(a+bi)=(a+bi)(a−bi).

So, the statement "ppp is a sum of two squares" is exactly the same as saying "ppp can be factored in the world of Gaussian integers". p=a2+b2  ⟺  p=(a+bi)(a−bi)p = a^2+b^2 \iff p = (a+bi)(a-bi)p=a2+b2⟺p=(a+bi)(a−bi) This is the moment of revelation. A prime number from our familiar world, like 5, might not be prime anymore when we view it as a Gaussian integer. It becomes ​​reducible​​. We have 5=(1+2i)(1−2i)5 = (1+2i)(1-2i)5=(1+2i)(1−2i). Neither 1+2i1+2i1+2i nor 1−2i1-2i1−2i is a trivial factor (a "​​unit​​" like 1,−1,i,1, -1, i,1,−1,i, or −i-i−i. They are new, smaller, "prime" Gaussian integers. We call them Gaussian primes.

On the other hand, a prime like 3 cannot be factored in this way. It remains a prime element in Z[i]\mathbb{Z}[i]Z[i], and so we call it ​​irreducible​​ or inert.

So, Fermat's theorem is really a statement about how regular primes behave in the larger system of Gaussian integers:

  • Primes p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4) (like 5, 13, 17...) split into a product of two distinct Gaussian primes, p=ππˉp = \pi \bar{\pi}p=ππˉ.
  • Primes p≡3(mod4)p \equiv 3 \pmod 4p≡3(mod4) (like 3, 7, 11...) remain inert—they are still prime in Z[i]\mathbb{Z}[i]Z[i].
  • The prime p=2p=2p=2 is special; it ramifies. 2=(1+i)(1−i)2 = (1+i)(1-i)2=(1+i)(1−i), but 1+i1+i1+i and 1−i1-i1−i are associates (differ only by a unit factor: 1−i=−i(1+i)1-i = -i(1+i)1−i=−i(1+i)). So 2=−i(1+i)22 = -i(1+i)^22=−i(1+i)2.

This classification tells us precisely which primes are sums of two squares: the ones that aren't inert!

The Secret of '1 mod 4'

We've reached the final "why". Why do primes that are 1(mod4)1 \pmod 41(mod4) split, while primes that are 3(mod4)3 \pmod 43(mod4) remain inert? The answer lies in a beautiful chain of logical equivalences that ties everything together.

Consider an odd prime ppp. The following statements are all equivalent—if one is true, they all are; if one is false, they all are.

  1. ​​ppp is NOT a prime in Z[i]\mathbb{Z}[i]Z[i] (i.e., it is reducible).​​
  2. ​​ppp can be written as the sum of two squares.​​
  3. ​​The equation x2≡−1(modp)x^2 \equiv -1 \pmod px2≡−1(modp) has a solution.​​

We've already seen why (1) and (2) are the same. But what about (3)? The existence of a number xxx whose square is −1(modp)-1 \pmod p−1(modp) is the key.

If such an xxx exists, then x2+1x^2+1x2+1 is a multiple of ppp. Writing this in Z[i]\mathbb{Z}[i]Z[i], we have that ppp divides the product (x+i)(x−i)(x+i)(x-i)(x+i)(x−i). Now, if ppp were a prime in Z[i]\mathbb{Z}[i]Z[i], it would have to divide one of the factors, either (x+i)(x+i)(x+i) or (x−i)(x-i)(x−i). But this is impossible! For ppp to divide x+ix+ix+i, we would need x+i=p(a+bi)=pa+pbix+i = p(a+bi) = pa + pbix+i=p(a+bi)=pa+pbi. Comparing the imaginary parts, we'd get 1=pb1 = pb1=pb, which is impossible for integers p>1p>1p>1 and bbb. Thus, ppp divides the product without dividing either factor, which proves that ppp is not a Gaussian prime!

So, the whole question boils down to: For which primes ppp does x2≡−1(modp)x^2 \equiv -1 \pmod px2≡−1(modp) have a solution? This is a classic result in number theory: a solution exists if and only if p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4).

And there it is—the complete, beautiful picture. The simple rule of four we started with is a shadow of a much deeper algebraic structure. The behavior of a prime ppp in the ring of Gaussian integers Z[i]\mathbb{Z}[i]Z[i]—whether the ideal (p)(p)(p) is maximal and hence if the quotient ring Z[i]/(p)\mathbb{Z}[i]/(p)Z[i]/(p) is a field—is perfectly dictated by its remainder modulo 4.

Building Numbers: The Art of Composition

What about composite numbers like 656565? We know that 65=5×1365 = 5 \times 1365=5×13. Both 5 and 13 are primes of the form 4k+14k+14k+1.

  • 5=12+22  ⟹  5=(1+2i)(1−2i)5 = 1^2 + 2^2 \implies 5 = (1+2i)(1-2i)5=12+22⟹5=(1+2i)(1−2i)
  • 13=22+32  ⟹  13=(2+3i)(2−3i)13 = 2^2 + 3^2 \implies 13 = (2+3i)(2-3i)13=22+32⟹13=(2+3i)(2−3i)

To find a representation for 65, we just multiply the Gaussian integers! 65=5×13=[(1+2i)(1−2i)]×[(2+3i)(2−3i)]65 = 5 \times 13 = [(1+2i)(1-2i)] \times [(2+3i)(2-3i)]65=5×13=[(1+2i)(1−2i)]×[(2+3i)(2−3i)] Let's group them differently: (1+2i)(2+3i)=(2−6)+(3+4)i=−4+7i(1+2i)(2+3i) = (2 - 6) + (3+4)i = -4+7i(1+2i)(2+3i)=(2−6)+(3+4)i=−4+7i The norm of this product is N(−4+7i)=(−4)2+72=16+49=65N(-4+7i) = (-4)^2 + 7^2 = 16+49=65N(−4+7i)=(−4)2+72=16+49=65. So we've found one representation! Alternatively, we could have used the conjugate: (1+2i)(2−3i)=(2+6)+(−3+4)i=8+i(1+2i)(2-3i) = (2 + 6) + (-3+4)i = 8+i(1+2i)(2−3i)=(2+6)+(−3+4)i=8+i The norm is N(8+i)=82+12=64+1=65N(8+i) = 8^2 + 1^2 = 64+1=65N(8+i)=82+12=64+1=65. A second representation! This composition, stemming from the multiplicative property of the norm, is a general principle.

A positive integer nnn can be written as a sum of two squares if and only if in its prime factorization, every prime of the form 4k+34k+34k+3 appears with an even exponent. If 333 divides nnn, 999 must also divide nnn for it to be a sum of two squares. Why? Because an inert prime q≡3(mod4)q \equiv 3 \pmod 4q≡3(mod4) can only contribute to a norm as part of q2=(q+0i)(q−0i)q^2 = (q+0i)(q-0i)q2=(q+0i)(q−0i), effectively hiding itself in the representation.

The Bigger Picture: A Universe of Forms

The story of sums of two squares is so perfect, so complete, you might wonder if it's a special case. It is. The beautiful correspondence—a prime splits if and only if it's represented by the norm form—is a direct consequence of the fact that Z[i]\mathbb{Z}[i]Z[i] has "unique factorization" of elements (it is a Principal Ideal Domain, or PID). Its ​​class group​​, which measures the failure of unique factorization, is trivial.

If we try to play the same game with other forms, say a2+5b2a^2+5b^2a2+5b2, we are now working in the ring Z[−5]\mathbb{Z}[\sqrt{-5}]Z[−5​]. This ring does not have unique factorization; its class group is non-trivial. As a result, there are primes (like 3) that "split" in the sense of their ideals, but are not representable by the form a2+5b2a^2+5b^2a2+5b2. The link is broken.

The simple question posed by Fermat becomes a gateway to one of the deepest and most profound areas of modern mathematics: algebraic number theory and class field theory. It's a stunning example of how a simple game of patterns with whole numbers can lead us to entirely new worlds, revealing a hidden unity and an elegance that is the true soul of mathematics.

Applications and Interdisciplinary Connections

Now that we have grappled with the inner workings of Fermat's beautiful theorem on sums of two squares, we might be tempted to file it away as a delightful, but perhaps isolated, gem of number theory. Nothing could be further from the truth. The question of whether a number can be written as the sum of two squares is not a mere intellectual puzzle; it is a fundamental structural property, a question that nature, in its own way, asks in a surprising variety of contexts. Its answer echoes in the patterns of crystal lattices, the tones of a vibrating drum, the very architecture of abstract algebra, and even the security of our digital communications. Let us embark on a journey to see just how far the ripples of this single idea can travel.

The Geometry of Squares: From Lattices to Sonic Fingerprints

Perhaps the most direct and intuitive place to see the theorem at work is in geometry. Imagine a vast, two-dimensional grid, like an infinite sheet of graph paper. This is the integer lattice, the set of all points (x,y)(x, y)(x,y) with integer coordinates. We can think of it as a simplified model of a perfect crystal, with atoms placed at each point. Now, what are the possible squared distances between any two atoms in this crystal? The distance formula tells us that the squared distance between (x1,y1)(x_1, y_1)(x1​,y1​) and (x2,y2)(x_2, y_2)(x2​,y2​) is (x1−x2)2+(y1−y2)2(x_1-x_2)^2 + (y_1-y_2)^2(x1​−x2​)2+(y1​−y2​)2. Since the coordinates are integers, their differences, let's call them aaa and bbb, are also integers. The squared distance is simply a2+b2a^2 + b^2a2+b2.

Suddenly, our geometric question about possible distances has transformed into a number-theoretic one: which integers can be written as the sum of two squares? And for that, we have a complete answer! A number can be a squared distance on the integer lattice if and only if all of its prime factors of the form 4k+34k+34k+3 appear with an even exponent. An integer like 777 or 111111 can never be the squared distance between two lattice points, nor can 2023=7×1722023 = 7 \times 17^22023=7×172, because of that lone factor of 7. It is a forbidden distance, a gap in the fabric of the lattice prescribed by a 17th-century theorem.

This connection between geometry and sums of squares reaches a beautiful crescendo in the world of physics and analysis. Imagine a drum shaped like a two-dimensional torus (the surface of a donut). When you strike this drum, it vibrates at a set of fundamental frequencies, its characteristic "sound." These frequencies are the eigenvalues of a mathematical operator called the Laplacian. For the flat torus, these eigenvalues are precisely the numbers of the form n2+m2n^2+m^2n2+m2 for integers nnn and mmm. To know the possible notes a torus drum can play, you must know which numbers are sums of two squares. The multiplicity of a given frequency—how many different vibrational modes produce the same note—is simply the number of ways its corresponding eigenvalue can be written as n2+m2n^2+m^2n2+m2. So, the 'timbre' of this geometric object is dictated by arithmetic.

A Deeper Unity: The World of Abstract Algebra

The true home of Fermat's theorem, where its power is most clearly revealed, is abstract algebra. The expression a2+b2a^2+b^2a2+b2 is practically begging to be factored—not in the world of integers, but in a larger world containing the imaginary unit iii. This is the world of Gaussian integers, numbers of the form a+bia+bia+bi. Here, the sum of two squares is revealed as a norm: a2+b2=(a+bi)(a−bi)a^2+b^2 = (a+bi)(a-bi)a2+b2=(a+bi)(a−bi).

In this new world, our familiar prime numbers can behave in surprising ways. It turns out that a prime ppp from our ordinary world remains prime in the Gaussian world if and only if p≡3(mod4)p \equiv 3 \pmod 4p≡3(mod4). If, however, p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4), it is no longer prime! It "splits" into a product of two distinct Gaussian primes, p=ππ‾p = \pi \overline{\pi}p=ππ. For example, the prime 292929, which is 1(mod4)1 \pmod 41(mod4), is not prime in Z[i]\mathbb{Z}[i]Z[i]. It factors into (5+2i)(5−2i)(5+2i)(5-2i)(5+2i)(5−2i), reflecting its identity as 52+225^2+2^252+22. Fermat's theorem is thus a statement about how primes behave when you change your number system.

This principle extends far beyond integers. It dictates the structure of more exotic algebraic systems. Consider the quaternions, an extension of complex numbers with multiple imaginary units. One can construct a system called a quaternion algebra over the rational numbers, denoted H(−1,p)QH(-1, p)_{\mathbb{Q}}H(−1,p)Q​. A key question for any such algebra is whether it is a "division ring," a well-behaved system where every non-zero element has a multiplicative inverse. It turns out that this property hinges entirely on our old friend, the modulus 4 condition. The algebra H(−1,p)QH(-1, p)_{\mathbb{Q}}H(−1,p)Q​ fails to be a division ring precisely when p=2p=2p=2 or p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4). A simple arithmetic property determines the fundamental nature of a sophisticated four-dimensional algebra.

The theorem's reach extends even to the grand hierarchies of Galois theory, which studies the symmetries of number fields. If you have a number field like Q(d)\mathbb{Q}(\sqrt{d})Q(d​), you might ask if it can be "embedded" within a larger, more structured field—specifically, a cyclic extension of degree 4. The answer, coming from the depths of class field theory, is remarkable: such an embedding is possible if and only if the positive, square-free integer ddd is a sum of two squares. The ability to be written as a2+b2a^2+b^2a2+b2 is not just a property; it's a passport that grants a number field entry into a specific family of algebraic extensions.

Modern Echoes: Cryptography and Asymptotic Landscapes

It is one thing for an old theorem to unify disparate branches of pure mathematics. It is another for it to have direct consequences for 21st-century technology. This is precisely what happens in cryptography. The security of many cryptographic systems, like the Diffie-Hellman key exchange, relies on the difficulty of a specific mathematical problem—the Discrete Logarithm Problem (DLP)—within a finite group.

A cryptographer might be tempted to implement this protocol in the group of units of the Gaussian integers modulo a prime ppp. But they must be careful! The structure of this group depends critically on p(mod4)p \pmod 4p(mod4).

  • If p≡3(mod4)p \equiv 3 \pmod 4p≡3(mod4), the group is the multiplicative group of a finite field, which is cyclic and robust for cryptography.
  • If, however, p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4), the algebraic structure splits. Because x2≡−1(modp)x^2 \equiv -1 \pmod px2≡−1(modp) has a solution, the ring of Gaussian integers modulo ppp decomposes into two copies of the integers modulo ppp. This allows an attacker to use the Chinese Remainder Theorem to break the single, hard DLP into two much smaller, easier DLPs. A system that is secure for one type of prime becomes catastrophically insecure for the other. The abstract splitting of primes we saw in Z[i]\mathbb{Z}[i]Z[i] becomes a tangible security vulnerability.

Finally, let us zoom out and look at the "big picture." How are the Gaussian primes distributed across the complex plane? Are they scattered randomly, like stars in the night sky? A beautiful problem in analytic number theory suggests a subtle order. If one sums the Gaussian primes in the first quadrant up to a certain size, one finds that the sum of their real parts and the sum of their imaginary parts are not equal. There is a systematic, predictable bias. The cause of this asymmetry? The primes p≡3(mod4)p \equiv 3 \pmod 4p≡3(mod4). These primes lie on the real axis, contributing to the real part of the sum but not the imaginary part. The primes that split, p≡1(mod4)p \equiv 1 \pmod 4p≡1(mod4), contribute complex prime factors (such as a+bia+bia+bi) whose real and imaginary parts are more evenly distributed. The inert primes, by contrast, lie strictly on the real axis and thus leave a lasting statistical fingerprint on the landscape of primes. Even in the realm of statistical asymptotics, the dichotomy of Fermat's theorem holds sway. As a final, playful illustration of this interplay, the representation p=a2+b2p=a^2+b^2p=a2+b2 even dictates the geometric dance of the powers of the complex number a+iba+iba+ib, determining whether its first four powers can land in four distinct quadrants of the plane—a condition which charmingly depends on the ratio a/ba/ba/b being less than 2−1\sqrt{2}-12​−1.

From solid state physics to the security of your bank transactions, Fermat's simple observation about sums of two squares has proven to be an astonishingly powerful and unifying principle. It is a testament to the interconnectedness of mathematics, where a single, elegant idea can illuminate a vast and varied landscape of inquiry.