
We use numbers every day, but rarely do we stop to consider the elegant system that gives symbols like "123" their meaning. This system, known as base representation or positional notation, is one of the most foundational concepts in mathematics, yet its power and flexibility are often taken for granted. This article addresses this gap by moving beyond simple arithmetic to explore the deep structure of our number systems. In the chapters that follow, we will first dissect the core "Principles and Mechanisms," understanding the machinery of place value, the universal algorithms for converting between bases, and the intriguing nature of fractional, negative, and even mixed-radix systems. Following this, the journey continues into "Applications and Interdisciplinary Connections," where we will uncover how the choice of base has profound, real-world consequences, influencing everything from computer programming and financial calculations to the analysis of complex algorithms and the study of abstract mathematical structures.
You might think you know what a number is. You see them every day: 10, 25, 3.14. But have you ever stopped to wonder about the idea behind the symbols? What makes the symbol "123" represent the number one hundred and twenty-three? Why isn't it just one, then two, then three? The answer lies in one of the most powerful inventions of the human mind: the positional number system, or base representation.
It's a concept of such profound simplicity and elegance that we teach it to children, yet it contains depths that power everything from quantum computers to theoretical mathematics. So, let's take a journey together, not just to review what we know, but to truly understand the machinery humming beneath the surface of the numbers we use every day.
Imagine you have a pile of pebbles. To count them, you could just make a scratch mark for each one. This is an additive system. The value is just the sum of the marks. Some ancient systems, like early Roman numerals, worked a bit like this: III is just $1 + 1 + 1$. The order doesn't matter; it's still three.
But what if you have thousands of pebbles? Your parchment would be full of scratch marks. The genius of a positional system is the invention of place value. Let's stick with our familiar base 10. The number $123$ isn't $1 + 2 + 3$. It's a shorthand for a polynomial:

$$123 = 1 \cdot 10^2 + 2 \cdot 10^1 + 3 \cdot 10^0$$
Each position, as we move from right to left, represents a higher power of the base, in this case, 10. The symbols in those positions, the digits $1$, $2$, and $3$, tell us how many of each power we have. We have one $10^2$, two $10^1$s, and three $10^0$s. The order is everything. Flip the digits to $321$ and you have a completely different number. This structure is the key. A number is not a string of symbols; it's a recipe for adding up powers of a base.
This structure also gives rise to a wonderfully simple, recursive property. If you have a number $N$, say $12$, and you append a new digit, say $3$, to make $123$, what have you actually done? You've shifted all the old digits one place to the left, which is the same as multiplying their value by the base, and then you've added the new digit. In other words, the new value is simply $N \cdot b + d$ for base $b$ and appended digit $d$. For our example, that's $12 \cdot 10 + 3 = 123$. This "shift and add" operation is the secret behind how computers effortlessly read streams of digits and convert them into numbers.
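The "shift and add" rule is a one-line loop. Here is a minimal Python sketch (the function name is ours, chosen for illustration):

```python
def shift_and_add(digits, base=10):
    """Parse a digit sequence by repeatedly multiplying by the base
    and adding the next digit -- the "shift and add" rule."""
    value = 0
    for d in digits:
        value = value * base + d  # shift left one place, then add
    return value

# The digits 1, 2, 3 in base 10 build up as 1, then 12, then 123:
assert shift_and_add([1, 2, 3]) == 123
```

The same loop, with `base=2`, is exactly how a binary string like `1101` becomes the integer 13.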
If our number system is just a recipe using powers of a base, what happens if we change the base? What does $123$ look like in, say, base 8? Or base 2 (binary), the language of computers? It might seem like a daunting translation, but the underlying principle gives us a beautiful and universal algorithm.
The method is one of repeated division, and it's like peeling an onion, one layer at a time. To convert an integer $n$ to base $b$, we rely on the fundamental truth of division: for any $n \ge 0$ and $b \ge 2$, we can always find a unique quotient $q$ and remainder $r$ such that $n = qb + r$, where the remainder is a valid digit, $0 \le r < b$.
Think about the base-$b$ expansion: $n = d_k b^k + d_{k-1} b^{k-1} + \cdots + d_1 b + d_0$. If we factor out a $b$, we get:

$$n = b \cdot (d_k b^{k-1} + \cdots + d_2 b + d_1) + d_0$$
Look familiar? It's exactly the form $n = qb + r$. The remainder is the least significant digit, $d_0$! And the quotient is the rest of the number, just waiting for its turn. So, we can find all the digits by just repeating the process: divide the number by the base, the remainder is your next digit (from right to left), and the new number to work with is the quotient. You keep going until the quotient becomes zero.
This elegant process is so robust that it works for any integer base $b \ge 2$. You could be asked to convert a colossal number into an absurdly large base, and the exact same, simple "peel-off-the-remainder" algorithm works flawlessly to find the digits.
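The peel-off-the-remainder algorithm takes only a few lines. A minimal Python sketch (the function name is ours):

```python
def to_base(n, b):
    """Convert a non-negative integer to its list of base-b digits
    (most significant first) by repeatedly peeling off remainders."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, r = divmod(n, b)  # n = q*b + r with 0 <= r < b
        digits.append(r)     # r is the next digit, right to left
    return digits[::-1]

# 123 in base 8: 123 = 1*64 + 7*8 + 3
assert to_base(123, 8) == [1, 7, 3]
```

Because the only requirement is integer division, the same code handles base 2 and a base with millions of digits equally well.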
Some bases are more closely related than others. You may have seen programmers use hexadecimal (base 16), with its strange digits A, B, C, D, E, F representing values 10 through 15. Why not just use binary (base 2)?
The answer is beauty and convenience. Because $16 = 2^4$, every single hexadecimal digit corresponds to a unique block of exactly four binary digits. For example, the digit E in hex is the value 14. In binary, 14 is $1110$. So, the hex digit 'E' is just a compact, human-friendly nickname for the binary string "1110".
This means you can convert between base 16 and base 2 without ever going through base 10. You just swap each hex digit with its corresponding 4-bit binary block. The same trick works for any bases that are powers of the same number, like base 4 and base 16 ($16 = 4^2$). To convert from hex to base 4, you can simply expand each hex digit into a pair of base-4 digits. This is no coincidence; it's a direct consequence of the polynomial structure of numbers, a beautiful piece of mathematical symmetry that makes digital design infinitely cleaner.
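The block-substitution conversion really is pure table lookup, with no arithmetic in base 10 at all. A Python sketch (names are ours):

```python
# Each hex digit is a nickname for a 4-bit block, so hex-to-binary
# conversion is pure substitution -- no base-10 detour needed.
HEX_TO_BITS = {d: format(i, "04b") for i, d in enumerate("0123456789abcdef")}

def hex_to_bin(s):
    bits = "".join(HEX_TO_BITS[c] for c in s.lower())
    return bits.lstrip("0") or "0"   # drop leading zeros of the top block

assert hex_to_bin("e") == "1110"
```

As a sanity check, the substitution agrees with going the long way round through base 10: `hex_to_bin("2af")` equals `bin(int("2af", 16))[2:]`.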
So far, we've only talked about integers. But the beauty of positional notation is that it extends seamlessly to fractions. We just keep the pattern going into negative powers of the base:

$$0.d_1 d_2 d_3 \ldots = d_1 b^{-1} + d_2 b^{-2} + d_3 b^{-3} + \cdots$$
This is where we encounter a familiar puzzle. In our base-10 world, the fraction $1/4$ is a nice, terminating decimal, $0.25$. But the fraction $1/3$ becomes the endlessly repeating $0.333\ldots$. Why?
The answer has nothing to do with $1/3$ being an intrinsically "difficult" number. It has everything to do with the base, 10. A fraction $p/q$ (in simplest form) has a terminating representation in base $b$ if and only if every prime factor of its denominator $q$ is also a prime factor of the base $b$.
Let's break that down. The prime factors of our base 10 are 2 and 5. The denominator $4 = 2^2$ is built entirely from those primes, so $1/4$ terminates; the denominator $3$ is not, so $1/3$ cannot.
Think of it like trying to measure something with a ruler. If your ruler is marked in repeated halvings of an inch, you can exactly measure lengths like $1/2$, $1/4$, or $1/8$ inches. But if you try to measure a third of an inch, you'll never land exactly on a mark. You'll get closer and closer, but you'll always have a little bit left over, leading to an infinite sequence.
This principle is not just a mathematical curiosity; it has profound real-world consequences. Imagine designing a digital signal processor that needs to represent the calibration coefficients $1/3$, $1/4$, and $1/5$ perfectly. If your processor uses a standard binary (base 2) or decimal (base 10) system, some of these will have repeating representations, leading to rounding errors that could accumulate and destabilize the whole system. To find the right base, you find the prime factors of all the denominators: $3$ is prime, $4 = 2^2$, and $5$ is prime.
To represent all of them without error, your base must have all these prime factors: 2, 3, and 5. The smallest integer base that satisfies this is $2 \cdot 3 \cdot 5 = 30$. In a base-30 system, all three of those tricky fractions would terminate beautifully.
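The termination criterion is easy to check mechanically: reduce the fraction, then strip from the denominator every prime factor it shares with the base; the fraction terminates exactly when nothing is left. A Python sketch (the function name is ours):

```python
from math import gcd

def terminates(p, q, b):
    """Does p/q have a finite base-b expansion? True iff, after
    reducing to lowest terms, every prime factor of q divides b."""
    q //= gcd(p, q)      # reduce the fraction to lowest terms
    g = gcd(q, b)
    while g > 1:         # strip prime factors q shares with b
        q //= g
        g = gcd(q, b)
    return q == 1

assert not terminates(1, 3, 10)   # 1/3 repeats in base 10...
assert terminates(1, 3, 30)       # ...but terminates in base 30
```

Running this for denominators 3, 4, and 5 confirms that base 30 handles all three, while bases 2 and 10 fail on $1/3$.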
What about those repeating expansions? Are they just a messy inconvenience? Far from it. They hide a beautiful mathematical structure: the geometric series.
A number with a repeating fractional part, like $0.\overline{12}$, which is shorthand for $0.121212\ldots$, can be written as an infinite sum:

$$0.121212\ldots = \frac{1}{10} + \frac{2}{10^2} + \frac{1}{10^3} + \frac{2}{10^4} + \cdots$$

This may look intimidating, but we can group the terms into a repeating block:

$$0.121212\ldots = \frac{12}{100} + \frac{12}{100^2} + \frac{12}{100^3} + \cdots$$

This is a geometric series with the first term $a = \frac{12}{100}$ and a common ratio $r = \frac{1}{100}$. Using the simple formula for the sum of such a series, $S = \frac{a}{1-r}$, we find that this infinite string of digits corresponds to the perfectly rational and finite number $\frac{12}{99} = \frac{4}{33}$. Every repeating decimal, in any base, is simply a geometric series in disguise, revealing a sublime order within the infinite.
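The geometric-series formula collapses to one line of exact arithmetic: a repeating block of length $k$ in base $b$ is worth $\text{block}/(b^k - 1)$. A Python sketch using exact rationals (the function name is ours):

```python
from fractions import Fraction

def repeating_to_fraction(block, base=10):
    """Exact value of 0.(block)(block)... : a geometric series with
    first term block/base**k and ratio 1/base**k, whose sum is
    block / (base**k - 1)."""
    k = len(block)
    return Fraction(int(block, base), base**k - 1)

assert repeating_to_fraction("12") == Fraction(4, 33)  # 0.121212...
```

The same function works in any base: `repeating_to_fraction("01", 2)` gives $1/3$, i.e. $0.\overline{01}$ in binary.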
Now for a bit of fun. We've always assumed our base must be a positive integer greater than 1. Why? What happens if we try to break that rule? Let's explore a truly strange and wonderful system: base $-2$, or negabinary.
The place values are now powers of $-2$: $(-2)^0, (-2)^1, (-2)^2, (-2)^3, \ldots$ This gives the alternating sequence $1, -2, 4, -8, 16, \ldots$ We still use the same digits, $0$ and $1$.
Let's try to write a number. What is $2$ in this system? The two-digit string $10$ now means $-2$, so we need three digits: $110_{-2} = 4 - 2 + 0 = 2$.
What about a negative number? Can we write $-1$? It turns out we can: $11_{-2} = 1 \cdot (-2) + 1 \cdot 1 = -1$.
This is astonishing! Using only the digits 0 and 1, and no minus sign, we can represent every integer, positive or negative. The sign is not an extra piece of information; it's woven directly into the fabric of the number's representation. The alternating powers provide the mechanism for both addition and subtraction, allowing the digits to build up to any target value.
The same "repeated division" algorithm we used for positive bases works here too, with a slight twist to ensure the remainders are always 0 or 1. Converting the familiar binary number $1101$ (thirteen) into negabinary, for instance, yields the representation $11101_{-2}$. This exploration of negative bases shows how the foundational principles of positional notation are even more general and powerful than we might have imagined.
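The "slight twist" is simply forcing each remainder into $\{0, 1\}$ and compensating in the quotient. A Python sketch (the function name is ours; note that Python's `divmod` may return a remainder of $-1$ with a negative divisor, which is exactly the case the twist fixes):

```python
def to_negabinary(n):
    """Convert any integer, positive or negative, to its base -2
    digit string, forcing each remainder into {0, 1}."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:        # divmod gave r = -1:
            r += 2       # ...bump the digit into {0, 1}
            n += 1       # ...and compensate in the quotient
        digits.append(str(r))
    return "".join(reversed(digits))

assert to_negabinary(13) == "11101"   # 16 - 8 + 4 - 0 + 1
assert to_negabinary(-1) == "11"      # -2 + 1
```

Evaluating any output string against the place values $1, -2, 4, -8, \ldots$ recovers the original integer, with no minus sign anywhere in the representation.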
The final generalization of our journey is to realize that the "place values" don't even have to be powers of a single base. A mixed radix system is one where the place values are products of a sequence of different moduli.
The most common example is time. A timestamp like "3 days, 4 hours, 30 minutes, 10 seconds" is a number in a mixed radix system. The value of each position is determined by the size of the unit to its right:

$$3 \cdot 86400 + 4 \cdot 3600 + 30 \cdot 60 + 10 = 275410 \text{ seconds}$$
The "place values" are $1$ (for seconds), $60$ (for minutes), $60 \cdot 60 = 3600$ (for hours), and $60 \cdot 60 \cdot 24 = 86400$ (for days). The representation is of the form $(d, h, m, s)$, where the moduli are $(24, 60, 60)$: hours roll over at 24, minutes and seconds at 60. This might seem like a niche application, but this idea is deeply connected to powerful results in number theory like the Chinese Remainder Theorem and has applications in cryptography and high-speed computation.
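In code, mixed radix is just the familiar shift-and-add and repeated-division algorithms with a multiplier that changes at each step. A Python sketch for the timestamp example (function names are ours):

```python
def to_seconds(d, h, m, s):
    """Mixed-radix 'shift and add': the base varies at each step
    (24 hours/day, 60 minutes/hour, 60 seconds/minute)."""
    return ((d * 24 + h) * 60 + m) * 60 + s

def from_seconds(total):
    """Mixed-radix 'repeated division': peel off seconds, then
    minutes, then hours; whatever is left is days."""
    total, s = divmod(total, 60)
    total, m = divmod(total, 60)
    d, h = divmod(total, 24)
    return d, h, m, s

assert to_seconds(3, 4, 30, 10) == 275410
assert from_seconds(275410) == (3, 4, 30, 10)
```

Replacing the moduli `(24, 60, 60)` with constant copies of a single base $b$ recovers ordinary base-$b$ notation, which is why the same two algorithms apply.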
From a simple count of pebbles to the exotic worlds of negative and mixed bases, the principle of positional representation is a thread that weaves through all of mathematics. It is a testament to how a single, elegant idea can create a universe of structure, order, and infinite possibility. The numbers haven't changed, but hopefully, your understanding of the beautiful machinery that brings them to life has.
We have spent some time understanding the machinery of base representation—the simple, yet profound, idea that a number's value can be built from powers of a chosen base. At first glance, this might seem like a mere bookkeeping exercise, a trivial choice of notation. Do we write ten as "10" (in base ten), "1010" (in base two), or "A" (in base sixteen)? Who cares? It's all the same number.
Ah, but this is where the fun begins. The choice of base is not just a choice of spelling; it is a choice of perspective. It is like choosing to look at a statue from the front, the side, or above. The statue doesn't change, but the features you notice, the patterns you see, and the connections you can make, change dramatically. By changing our numerical base, we can make hard problems easy, reveal hidden symmetries, and build bridges between seemingly disconnected fields of thought. Let us go on a journey and see where this simple idea takes us.
We are so accustomed to our base-ten world that we often mistake its convenient properties for universal truths of mathematics. Consider the simple tricks we learn in school: a number is divisible by 9 if the sum of its digits is divisible by 9. Is this a deep property of the number nine? Not at all! It is a property of the number nine in relation to the number ten. It's an accident of our anatomy that we have ten fingers and built our system around them.
The general principle is that divisibility by $b-1$ in any base $b$ is tested by summing the digits. This is because $b \equiv 1 \pmod{b-1}$, so every power $b^k$ is also congruent to $1$. The value of a number modulo $b-1$ is simply $d_k + d_{k-1} + \cdots + d_0$, the sum of its digits. So, in base twelve, the rule for divisibility by eleven is just to sum the digits!
What about divisibility by eleven in our own base ten? The rule is to take the alternating sum of the digits. Why? Because $10 \equiv -1 \pmod{11}$. So, the value of a number modulo $11$ becomes a sum of its digits weighted by alternating powers of $-1$: $d_0 - d_1 + d_2 - d_3 + \cdots$. This single insight unifies all these seemingly arbitrary rules: the divisibility test for a number $m$ is completely determined by the behavior of the base under the lens of modular arithmetic (specifically, by $b \bmod m$). The rules are not magic; they are shadows cast by the base.
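This unification can be written as a single loop: walk the digits, weighting each by the running power of $b$ reduced mod $m$. When $b \equiv 1 \pmod m$ the weights are all $1$ (plain digit sum); when $b \equiv -1 \pmod m$ they alternate (alternating sum). A Python sketch (the function name is ours):

```python
def digit_sum_mod(n, b, m):
    """Compute n mod m from its base-b digits, each weighted by
    b**i mod m -- the general form of every digit-sum divisibility rule."""
    total, weight = 0, 1
    while n > 0:
        n, d = divmod(n, b)
        total += d * weight
        weight = (weight * b) % m   # next place value, reduced mod m
    return total % m

assert digit_sum_mod(12345, 10, 9) == 12345 % 9    # casting out nines
assert digit_sum_mod(121, 10, 11) == 0             # 1 - 2 + 1 = 0
```

The function never computes `n % m` directly; it only ever adds weighted digits, yet it always agrees with the direct remainder.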
This perspective can even help us solve numerical puzzles that seem baffling at first. If someone tells you that a two-digit number written in base six plus another number in base seven equals 52, you are not just juggling symbols. You are translating between languages. By converting everything to a common tongue (base ten, our comfortable standard), the problem becomes a simple linear equation whose solution is readily found.
The choice of base even dictates large-scale features of numbers. If you were to ask how many zeros the colossal number $1000!$ ends with, you are really asking: "How many times can I divide this number by 10?" Since $10 = 2 \cdot 5$, this question is about the prime factors of the base. To form a 10, you need one 2 and one 5. A quick count reveals that the prime factorization of $1000!$ contains far more factors of 2 than 5. The number of 5s is the bottleneck. By counting the factors of 5, we are, in essence, discovering the "base-10 structure" of $1000!$. Legendre's formula provides the elegant tool for this, showing the number of trailing zeros is precisely $\lfloor 1000/5 \rfloor + \lfloor 1000/25 \rfloor + \lfloor 1000/125 \rfloor + \lfloor 1000/625 \rfloor = 200 + 40 + 8 + 1 = 249$. If we had asked the same question in, say, base 14 ($14 = 2 \cdot 7$), the answer would be limited by the number of factors of 7, a completely different calculation.
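Legendre's formula is a short loop, and once you have it, trailing zeros in any base fall out by taking the bottleneck prime. A Python sketch (helper names are ours):

```python
def legendre(n, p):
    """Exponent of prime p in n! (Legendre's formula):
    floor(n/p) + floor(n/p**2) + floor(n/p**3) + ..."""
    count, power = 0, p
    while power <= n:
        count += n // power
        power *= p
    return count

def trailing_zeros(n, base_factors):
    """Trailing zeros of n! in a base with the given
    {prime: multiplicity} factorization: the scarcest prime wins."""
    return min(legendre(n, p) // k for p, k in base_factors.items())

assert legendre(1000, 5) == 249          # 200 + 40 + 8 + 1
assert trailing_zeros(1000, {2: 1, 5: 1}) == 249   # base 10
```

Asking the same question in base 14 just swaps the factor dictionary: `trailing_zeros(1000, {2: 1, 7: 1})`.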
Nowhere is the choice of base more critical than in the heart of our digital world: the computer. At their most fundamental level, computers do not speak in tens. They speak in a language of on and off, yes and no, true and false. They speak in base two.
Every number, every character, every pixel on your screen is ultimately a long string of 0s and 1s. To translate our familiar base-ten numbers into this binary tongue, computers relentlessly apply a process rooted in the Division Algorithm. To convert a number to base 2, you divide it by 2; the remainder is the last digit. You take the quotient and repeat the process. The string of remainders, read backwards, is the number in base two. This simple, repetitive process is perfectly suited for an algorithmic mind and forms the basis of all base conversion routines in software.
But this translation is not always perfect. Just as some words in one language have no exact equivalent in another, some numbers that are simple for us are impossible for a computer to store perfectly. Consider the innocuous number $0.1$. In base ten, it is a simple fraction, $\frac{1}{10}$. To write this in base two, however, is to attempt the impossible. A fraction has a finite representation in base $b$ only if the prime factors of its denominator are also prime factors of $b$. The denominator here is $10 = 2 \cdot 5$. The prime factors of base two are, well, just 2. The factor of 5 is an outsider; it cannot be accommodated.
As a result, $0.1$ in binary is a non-terminating, repeating expansion: $0.0\overline{0011}$. When a computer stores $0.1$ in its standard [binary64](/sciencepedia/feynman/keyword/binary64) floating-point format, it must truncate this infinite sequence. The number it stores is not exactly $0.1$, but an incredibly close approximation, which happens to be the exact fraction $\frac{3602879701896397}{2^{55}}$.
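You can inspect the stored value yourself: constructing a `Fraction` from a Python float recovers the exact binary64 number, bit for bit.

```python
from fractions import Fraction

# The float literal 0.1 is already the rounded binary64 value, so
# Fraction(0.1) reveals exactly what the computer stores.
stored = Fraction(0.1)

assert stored != Fraction(1, 10)                      # not exactly 1/10
assert stored == Fraction(3602879701896397, 2**55)    # the true stored value
```

The denominator $2^{55}$ is no accident: every binary64 value is an integer times a power of two, which is precisely why the factor of 5 in $1/10$ can never fit.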
For many scientific applications, this tiny error is negligible. But in the world of finance, pennies matter. Imagine a bank calculating compound interest. The rate might be $0.5\%$, giving a growth factor of $1.005$. The balance might be \$1.01. Both $1.005$ and $1.01$ are simple decimal fractions, but because their denominators contain factors of 5, they become infinite repeating fractions in base two and must be rounded. When the bank's software multiplies these two approximations, the result is another approximation. This new, tiny error could be just enough to tip the final result to the wrong side of a half-cent rounding boundary, causing the final answer to be off by a penny. This is not a hypothetical fear; it is a genuine problem that led to the development of decimal64 floating-point standards, which perform arithmetic in base ten specifically to handle financial calculations with perfect accuracy. The choice of base is not academic; it has real financial consequences.
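Python's `decimal` module makes the contrast concrete. It is a software implementation of base-ten arithmetic, in the spirit of the decimal64 standard though not that exact hardware format:

```python
from decimal import Decimal

# In binary64, both operands are already approximations, and the
# error is visible even in the simplest sums:
assert 0.1 + 0.2 != 0.3

# Decimal arithmetic works in base ten, so decimal fractions are exact:
assert Decimal("0.10") + Decimal("0.20") == Decimal("0.30")

# One interest period at 0.5% on a balance, computed exactly:
balance = Decimal("1000.00") * (1 + Decimal("0.005"))
assert balance == Decimal("1005.000")
```

Passing the amounts as strings matters: `Decimal(0.1)` would faithfully capture the already-rounded binary value, while `Decimal("0.1")` is exactly one tenth.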
Beyond the practicalities of storing numbers, base representation is a powerful theoretical tool for designing and analyzing algorithms. In theoretical computer science, we often measure an algorithm's speed by counting the number of basic operations it performs. For multiplying very large integers, the standard schoolbook method is too slow. The Karatsuba algorithm provides a faster, "divide-and-conquer" approach.
To analyze its performance, we can do something clever. Instead of thinking of a number as a long string of binary digits (base 2), we can chop it up into large chunks, say of 64 bits each. We are now thinking of the number in base $2^{64}$. Each "digit" in this new base is a number our computer can handle in a single operation. By framing the problem in this large base, the analysis of Karatsuba's algorithm becomes a clean recurrence relation, revealing that its cost grows as $O(n^{\log_2 3}) \approx O(n^{1.585})$, where $n$ is the number of base-$2^{64}$ digits. This is significantly faster than the naive $O(n^2)$ complexity. Choosing the right base gives us the right lens to see the algorithm's true efficiency.
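The divide-and-conquer step can be sketched in a few lines. This is an illustrative Python version, not a production big-integer routine; the word size and splitting strategy are our choices:

```python
WORD = 64  # treat anything under 2**64 as a single "digit"

def karatsuba(x, y):
    """Multiply with 3 half-size products instead of 4, giving the
    O(n**log2(3)) ~ O(n**1.585) recurrence in the number of digits."""
    if x < (1 << WORD) or y < (1 << WORD):
        return x * y                        # base case: one small multiply
    n = (max(x.bit_length(), y.bit_length()) + 1) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)     # split x = xh*2**n + xl
    yh, yl = y >> n, y & ((1 << n) - 1)
    a = karatsuba(xh, yh)                   # high * high
    b = karatsuba(xl, yl)                   # low * low
    c = karatsuba(xh + xl, yh + yl) - a - b # the Karatsuba trick: cross terms
    return (a << (2 * n)) + (c << n) + b
```

The trick is the third line of products: $(x_h + x_l)(y_h + y_l) - a - b = x_h y_l + x_l y_h$, so two cross-multiplications come for the price of one.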
Perhaps the most ingenious use of base representation is in the field of computational complexity, which studies the fundamental difficulty of problems. To prove a problem is "hard," one often shows how to translate a known hard problem into it. In a famous reduction from VERTEX-COVER to SUBSET-SUM, numbers are constructed in base four for a very specific reason. The goal is to create a set of numbers that, when summed, can verify a logical property of a graph without any messy interference. By using base four, the sum of any three digits in a given position will never "carry over" into the next position. Each digit position becomes a private, independent counter for a specific part of the problem (one for each edge in the graph). The choice of base is a masterful trick, creating a firewall between different logical components of the proof, all encoded within the structure of integers.
The utility of base representation extends far beyond computation and into the purest realms of mathematics, revealing beauty and unity in unexpected places.
Consider the central binomial coefficient, $\binom{2n}{n}$, a number that appears everywhere in combinatorics and probability. Now consider $s_2(n)$, the sum of the digits of $n$ in its binary representation. What could these two possibly have to do with each other? In a stunning result known as Kummer's Theorem, it turns out that the number of times 2 divides $\binom{2n}{n}$ is exactly equal to $s_2(n)$. This connects the world of combinatorics (counting subsets) with the deepest arithmetic properties of numbers (their binary structure). The proof itself is a beautiful dance between Legendre's formula for the prime factorization of factorials and the properties of bit-shifting.
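The theorem is easy to check numerically: count the factors of 2 in $\binom{2n}{n}$ and compare with the binary digit sum of $n$. A Python sketch (the helper name is ours):

```python
from math import comb

def v2(n):
    """2-adic valuation: the exponent of 2 in n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k

# Kummer's theorem for the central binomial coefficient:
# the power of 2 dividing C(2n, n) equals the binary digit sum of n.
for n in range(1, 200):
    assert v2(comb(2 * n, n)) == bin(n).count("1")
```

For instance $n = 10 = 1010_2$ has digit sum 2, and indeed $\binom{20}{10} = 184756 = 4 \cdot 46189$ is divisible by $4$ but not $8$.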
Finally, let us gaze upon one of mathematics's most famous "monsters": the Cantor set. We construct it by taking the interval $[0, 1]$, removing the open middle third, then removing the middle third of the remaining two segments, and so on, ad infinitum. What is left is a "dust" of infinitely many points, so sparse that its total length is zero, yet so numerous that it contains as many points as the original interval. How can we describe the points in this bizarre set?
The answer is breathtakingly simple: use base three. A number is in the Cantor set if and only if it can be written in base three using only the digits 0 and 2. Removing the middle third is equivalent to forbidding all numbers whose base-three representation has a $1$ in the first position after the point. The next removals forbid a 1 in the second position, and so on. The entire complex, fractal geometry of the Cantor set is perfectly captured by this simple rule of representation. What's more, this description allows us to dissect the set's anatomy. The endpoints of the removed intervals are the numbers whose base-three expansion eventually ends in all 0s or all 2s. The "other" points, the vast majority, are those whose expansions contain an infinite scattering of both 0s and 2s.
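The digit rule can be tested directly with exact rational arithmetic. A Python sketch (the function name is ours) extracting base-three digits and applying it to $1/4$, a point of the Cantor set that is not an endpoint of any removed interval:

```python
from fractions import Fraction

def base3_digits(x, n):
    """First n base-3 digits of x in [0, 1), computed exactly."""
    digits = []
    for _ in range(n):
        x *= 3
        d = int(x)       # the next digit is the integer part
        digits.append(d)
        x -= d           # keep the fractional part and repeat
    return digits

# 1/4 = 0.020202..._3: only 0s and 2s, so 1/4 lies in the Cantor set.
assert base3_digits(Fraction(1, 4), 12) == [0, 2] * 6
```

Note the loop is the fractional-part twin of repeated division: multiply by the base and peel off the integer part, most significant digit first.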
From simple divisibility tricks to the foundations of computer science, from financial precision to the nature of algorithmic complexity, and from the secrets of binomial coefficients to the geometry of fractals, the concept of base representation is a golden thread. It teaches us that how we write a number can be as important as the number itself. By changing our point of view, we change what we can see, and in doing so, we uncover a deeper, more beautiful, and more unified mathematical world.