
PC-relative addressing

Key Takeaways
  • PC-relative addressing calculates a memory location by adding an offset to the current Program Counter, creating position-independent code that runs regardless of where it's loaded in memory.
  • This position independence is essential for modern computing, enabling memory-saving shared libraries and the crucial security feature, Address Space Layout Randomization (ASLR).
  • The limited "reach" of a PC-relative instruction is overcome by software toolchain tricks like linker relaxation, which creates "trampolines" to handle long-distance jumps.
  • The influence of PC-relative addressing extends across the entire system, from enabling dynamic loaders via the Global Offset Table (GOT) to optimizing CPU hardware like the Branch Target Buffer (BTB).

Introduction

In the dynamic landscape of modern computing, programs and their components are rarely loaded into the same memory location twice. This poses a significant challenge for a program's instructions, which constantly need to reference data and other parts of the code. Relying on fixed, absolute memory addresses is fragile and inefficient, akin to using a physical street address for a building that moves every day. This rigidity creates a knowledge gap that must be bridged for systems to be flexible, efficient, and secure.

PC-relative addressing emerges as the elegant solution to this problem. Instead of using a fixed address, it specifies a location based on its distance from the current point of execution. This article explores this pivotal concept. First, in "Principles and Mechanisms," we will dissect the core formula behind PC-relative addressing, understand how it enables position-independent code (PIC), and examine its limitations and the clever ways software works around them. Then, in "Applications and Interdisciplinary Connections," we will see how this single idea becomes the foundation for shared libraries, enhances system security through ASLR, and influences everything from operating system design to the silicon of the CPU itself.

Principles and Mechanisms

Imagine you are writing a letter. To mail it, you need an address. The most straightforward way is to write the full, absolute address: "1600 Pennsylvania Avenue, Washington, D.C." This works perfectly, as long as the White House doesn't move. But what if it did? What if the entire city of Washington D.C. was picked up and moved to a new location? Every single letter with that hard-coded address would suddenly be undeliverable. This is the essential dilemma of a computer program. An instruction often needs to find a piece of data or jump to another instruction. The simplest approach, ​​absolute addressing​​, is to bake the exact numerical memory address into the instruction itself. This is rigid and fragile. In the dynamic world of modern computing, where programs and their components are loaded into different memory locations every time they run, absolute addressing is like building a house of cards on a tablecloth that's about to be yanked.

The Liberation of the Relative

Nature, and computer architects who learn from it, often find a more elegant solution. Instead of specifying an absolute location, what if we gave directions relative to where we are now? "To get to the data, just walk 200 bytes forward from here." This is the beautiful, simple idea behind ​​Program Counter-relative addressing​​, or ​​PC-relative addressing​​.

The ​​Program Counter (PC)​​ is a special register in the CPU's heart that always knows the address of the instruction it's about to execute. It's the CPU's sense of "here and now." A PC-relative instruction doesn't contain a full address. Instead, it contains a small, signed number called an ​​offset​​ or ​​displacement​​. The CPU calculates the target address with a simple formula:

Target Address = Current PC + Offset

There's a subtle but common convention here. By the time an instruction is being executed, the PC has often already been updated to point to the next instruction in line. So, the "Current PC" in the formula is typically the address of the instruction following the current one. Let's say an instruction at address 0x00401000 wants to load data. The instruction itself is 4 bytes long, so the PC used for the calculation is already pointing to 0x00401004. If the instruction contains a signed offset of −792 bytes, the CPU computes the target address as 0x00401004 + (−792) = 0x00400CEC.
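The next-instruction PC convention and the worked example above can be sketched in a few lines of Python (the function name is ours, purely illustrative):

```python
def pc_relative_target(instr_addr: int, instr_len: int, offset: int) -> int:
    """Compute the effective address using the next-instruction PC convention."""
    pc = instr_addr + instr_len  # the PC already points past the current instruction
    return pc + offset           # a signed offset reaches forward or backward

# The worked example: a 4-byte instruction at 0x00401000 with offset -792.
target = pc_relative_target(0x00401000, 4, -792)
print(hex(target))  # 0x400cec
```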

The genius of this method is its resilience to change. If a program loader moves the entire block of code and its nearby data by, say, 0x10000 bytes, both the instruction's address and its target's address change by the same amount. The instruction will be at a new location, and the PC will have a new value. But the distance—the relative offset between them—remains perfectly constant. The instruction "walk 200 bytes forward" is still correct, no matter where the starting point is. This property is called ​​position independence​​.

The Dividend of Independence: Why We Care

This isn't just an academic curiosity; it's the bedrock of modern computing. Code that is position-independent, often called ​​Position-Independent Code (PIC)​​, doesn't need to be rewritten every time it's loaded into a new memory location. This has two monumental consequences.

First, it enables ​​shared libraries​​. Think of all the common code that programs use—for printing to the screen, handling files, or drawing windows. Instead of every single application having its own copy, the operating system can load one copy of a shared library into memory and have multiple applications use it simultaneously, each mapping it into their own virtual address space. Because the library is written using PC-relative addressing, it works correctly regardless of where it's loaded in each program's memory map. Without it, your computer's memory would fill up with thousands of redundant copies of the same code.

Second, it enhances security through ​​Address Space Layout Randomization (ASLR)​​. To thwart attackers, modern operating systems deliberately load a program's components—its main code, its libraries—at random memory locations each time it runs. If an attacker tries to exploit a bug by jumping to a fixed, known address, they will likely fail because the target is no longer there. ASLR is only practical because PIC allows the code to function correctly no matter where it's placed.

The efficiency gain is staggering. Imagine a program module with thousands of references to its own internal data and functions. With absolute addressing, the loader would have to perform a "fixup" for every single one of those references, reading the old address, adding the new base address, and writing it back. This takes time and clogs the memory bus. A PIC module using PC-relative addressing for its internal references needs no such fixups. The only fixups required are for references to data outside the module, which are often handled cleverly using a ​​Global Offset Table (GOT)​​. By consolidating external references, a module with thousands of data uses might only require a few dozen fixups in its GOT. This dramatically reduces the work the loader must do, speeding up application startup. For example, a hypothetical module that would require nearly a million cycles and over 25,000 bytes of memory traffic to relocate using absolute addresses might require only 10,000 cycles and a mere 160 bytes of traffic when compiled as PIC. The savings extend to the size of the program file itself, as the amount of relocation metadata that needs to be stored is drastically reduced.

A Leash of a Certain Length

Of course, there is no free lunch in physics or computer science. The offset in a PC-relative instruction is stored in a fixed number of bits—say, 12 or 16 bits—within the instruction itself. This means there's a limit to how far it can "reach." A 12-bit signed offset, for instance, can represent values from −2048 to 2047. If this offset is scaled by the instruction size (e.g., 4 bytes), the instruction can branch backward by up to 2048 × 4 = 8192 bytes and forward by up to 2047 × 4 = 8188 bytes. (The slight asymmetry is a charming quirk of two's-complement number representation.)
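The reach calculation generalizes to any field width and scale factor; a small sketch (our own helper, not any real ISA's encoding):

```python
def branch_reach(offset_bits: int, scale: int) -> tuple[int, int]:
    """Byte reach (backward, forward) of a signed offset field, scaled by instruction size."""
    min_off = -(1 << (offset_bits - 1))      # e.g. -2048 for a 12-bit field
    max_off = (1 << (offset_bits - 1)) - 1   # e.g. +2047 (two's-complement asymmetry)
    return min_off * scale, max_off * scale

print(branch_reach(12, 4))  # (-8192, 8188)
```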

This "leash" has direct consequences. For a while or for loop, the code ends with a conditional branch back to the top. The size of the loop body is limited by the backward reach of the branch instruction. A branch with a 10-bit signed instruction-count offset (−512 to +511) can support a loop body of at most 512 instructions. For most loops, this is more than enough. But what if it's not?

The Linker's Artful Dodge

Here we see a beautiful dance between the hardware's limitations and the software's ingenuity. The compiler optimistically assumes a branch target will be in reach. But what if the ​​linker​​, the tool that stitches all the code pieces together, discovers that a function call is to a target millions of bytes away?

The linker performs a trick called ​​linker relaxation​​. It replaces the out-of-range branch with a clever sequence of instructions. One common technique is to create a ​​trampoline​​. The linker replaces the far branch with a short, in-reach branch to a tiny snippet of newly generated code—the trampoline. This trampoline's sole job is to perform a long-distance, unconditional jump to the final destination, typically by loading the full, 32-bit or 64-bit target address into a register and then jumping to the address in that register. It's like taking a short hop to a teleporter that can send you anywhere.
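A toy model of that linker decision follows—the 21-bit field width, register name, and "instruction" strings are purely illustrative, a sketch of the idea rather than any real linker's output:

```python
def plan_branch(site: int, target: int, offset_bits: int = 21) -> list[str]:
    """Toy linker relaxation: emit a direct branch if the signed offset fits,
    otherwise a short hop to a trampoline that makes the long indirect jump."""
    offset = target - site
    lo = -(1 << (offset_bits - 1))
    hi = (1 << (offset_bits - 1)) - 1
    if lo <= offset <= hi:
        return [f"branch {offset:+#x}"]  # one short, in-reach hop
    # The trampoline is placed right after the branch, guaranteed in reach;
    # it loads the full absolute address into a register and jumps through it.
    return [
        "branch +8               ; short hop to trampoline",
        f"load  r12, {target:#x}  ; full 64-bit target address",
        "jump  r12               ; long-distance indirect jump",
    ]

print(plan_branch(0x1000, 0x1200))       # in reach: a single direct branch
print(plan_branch(0x1000, 0x40000000))   # out of reach: branch via trampoline
```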

When Relative Isn't Constant

The core magic of PC-relative addressing is the assumption that the instruction and its target are on the same "shifting tablecloth"—that their relative distance is invariant. But what happens if this assumption is violated?

Consider the case where a piece of code is relocated, but its target data is not. This happens in some advanced linking scenarios, like accessing a Global Offset Table that might be in a different, fixed-location memory segment. If an instruction at address P moves to P + Δ, but its target S stays put, the original offset becomes wrong: computing EA = (P + Δ) + offset_old no longer lands on S. To ensure the effective address EA still resolves to the correct, fixed target S, the linker must step in and compute a new offset: offset_new = S − (P + Δ) = offset_old − Δ. The offset must be adjusted to perfectly counteract the instruction's movement.
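The adjustment rule is easy to verify mechanically; a minimal sketch, with hypothetical addresses:

```python
def patched_offset(old_offset: int, delta: int) -> int:
    """When only the instruction moves by delta, shrink the offset to compensate."""
    return old_offset - delta

P, S = 0x1000, 0x3000
old = S - P                      # original offset: 0x2000
delta = 0x500                    # the code slides by 0x500; the target stays put
new = patched_offset(old, delta)
assert (P + delta) + new == S    # effective address still lands exactly on S
```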

This shows that PC-relative addressing is not a magic wand; it's a description of a geometric relationship. If the geometry changes—for instance, if a post-link tool inserts code between an instruction and its target—the description must be updated. Without a mechanism like a relocation table to allow a patcher to recalculate the offset, the instruction will fail, loading data from the wrong location. The principle holds true even for more complex addressing modes that add an index register to the calculation; the displacement part of the formula must always be adjusted to compensate for any change in the PC that is not matched by a corresponding change in the target's location. Architects must also be precise, as the simple choice of whether the PC in the formula refers to the current instruction or the next one will change the offset value an assembler must compute.

Whispers in the Silicon

The influence of this powerful idea runs so deep that it even shapes the processor's microarchitecture. Consider the ​​Branch Target Buffer (BTB)​​, a small, fast cache that stores the predicted target addresses of recently executed branches to keep the CPU's pipeline full and running fast.

In an older, absolute-addressing world, a BTB entry might store the branch's absolute PC and the target's absolute address. But in a PIC world, this is inefficient. The absolute addresses change every time the program runs! A much smarter design, made possible by PC-relative branching, is to have the BTB store the position-independent displacement. The tag used to identify the branch can then be simplified, as it no longer needs to concern itself with the shifting bits of the absolute target address. The move to PIC allows for a smaller, more efficient tag in the BTB, saving precious silicon space and power.

Here, we see the principle in its full glory: a high-level concept born from software needs—the need for relocatable, shareable code—echoes all the way down into the physical layout of transistors on the CPU die. This is the unity and inherent beauty of computer science, where a single, elegant idea can ripple through every layer of abstraction, from the operating system to the silicon itself.

Applications and Interdisciplinary Connections

Having grasped the principle of Program Counter-relative addressing—the simple but profound idea of specifying a location not by its absolute "street address" but by its position relative to "where we are now"—we can embark on a journey to see how this single concept blossoms into a cornerstone of modern computing. It is one of those wonderfully elegant ideas whose consequences ripple through every layer of a system, from the compiler's optimization puzzles to the silicon-level drama of the memory hierarchy, and even to the front lines of cybersecurity.

The Art of Position-Independent Code: A Foundation for Sharing

Imagine you're writing a program, and so are a thousand other people. Every program needs to perform basic tasks like printing to the screen or reading from a file. Does it make sense for every single compiled program to include its own copy of the code for printf? Of course not. That would be a colossal waste of disk space and, more importantly, system memory. The obvious solution is to have one central copy of this "standard library" that everyone can share.

But this raises a difficult question: where in memory do we put this shared library? If we hardcode its address, what happens when two different libraries want to occupy the same spot? And how can every program know in advance where the library will be?

PC-relative addressing provides the beautiful answer. By compiling the shared library as ​​Position-Independent Code (PIC)​​, we create a module that can be loaded anywhere in memory and still function correctly without a single byte of its instructions being changed. The magic lies in the simple mathematical invariance we discussed. If an instruction at address a needs to jump to a function at address s within the same library, the required displacement is d = s − a. If the operating system loads this entire library at some new base address B, the instruction is now at a′ = a + B and its target is at s′ = s + B. The relative distance remains unchanged: d′ = s′ − a′ = (s + B) − (a + B) = s − a = d. The original displacement d is still perfect!
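That invariance holds for any base address, which a quick brute-force check makes concrete (the call site and target addresses here are hypothetical):

```python
# Rebasing the whole module leaves the displacement d unchanged.
a, s = 0x7f001000, 0x7f001a40        # hypothetical call site and target in one library
d = s - a
for B in (0x0, 0x100000, 0x55550000):    # three different load bases
    assert (s + B) - (a + B) == d        # the displacement survives every rebase
print(hex(d))  # 0xa40
```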

This allows the operating system to place a single physical copy of the library's code in memory and map it into the virtual address space of hundreds of different processes. Because the code is position-independent, it works for everyone. Furthermore, since the code itself never needs to be modified, it can be marked as read-only. This is a huge security win, forming the basis of the ​​W^X (Write XOR Execute)​​ policy that prevents attackers from easily overwriting executable code with their own malicious instructions.

The Linker's Toolkit: Clever Tricks for a Bounded World

PC-relative addressing isn't without its limits. The displacement value, d, is stored in the instruction itself, and it only has a finite number of bits—say, 20 or 24. This means there's a maximum "reach" for any PC-relative jump. For a 20-bit signed displacement, you can only jump about half a megabyte forward or backward. What happens if your code needs to call a function that's millions of bytes away, outside this "near" radius?

This is where the software toolchain—specifically, the linker—gets wonderfully clever. If the linker detects that a PC-relative jump's target is too far, it doesn't give up. Instead, it synthesizes a small block of code called a ​​veneer​​ or ​​trampoline​​. This veneer is placed at a location that is within reach of the original jump. The linker then changes the original jump to target this veneer. The veneer's only job is to perform the long-distance jump. It's a two-hop process: a short, PC-relative hop to the trampoline, followed by a long-range, unrestricted jump from the trampoline to the final destination.

This long-range jump is often accomplished by loading a full 64-bit absolute address from a nearby table into a register and then performing an indirect jump through that register. This two-step mechanism—a limited PC-relative branch followed by a powerful indirect jump—beautifully illustrates how a constrained tool can be used to build a universal one.

This principle even extends to the data a program uses. Compilers strive to place constant data needed by a piece of code into a "literal pool" located nearby, so it can be accessed with a single, efficient PC-relative load. The optimal placement of this pool becomes a fascinating geometric problem, minimizing the total distance from all instructions that need to access it—a problem whose solution often involves finding the median of the instruction locations.

A Symphony of Indirection: The Global Offset Table

We've seen how PC-relative addressing works for references within a single module. But what about references between modules? How does your program call the printf function located in a completely separate shared library? At the time your code is compiled and linked, the final memory address of printf is unknown. Its relative position is not fixed.

The solution is another layer of elegant indirection: the ​​Global Offset Table (GOT)​​. Instead of trying to jump directly to printf, your code makes a PC-relative jump to an entry in a special table—the GOT—that is part of your own program's data segment. Think of it as a personal address book. The linker ensures that there is an entry in your address book reserved for printf.

When your program is first loaded, this GOT entry is just a placeholder. It's the job of the system's ​​dynamic loader​​ to find the actual address where the operating system placed the printf function in memory and then patch that address into your program's GOT entry. From then on, whenever your code needs printf, it follows the same two-step dance:

  1. A position-independent, PC-relative jump to the printf entry in its own GOT.
  2. An indirect jump from there, using the absolute address that the dynamic loader so kindly filled in.

This separation is the key: the code remains pure, shared, and read-only, while the messy, address-specific details are confined to a small, writable data table.
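The two-step dance can be modeled in a few lines—the table, symbol name, and runtime address below are purely illustrative, a sketch of the mechanism rather than a real dynamic loader:

```python
# Toy model of GOT indirection: code stays read-only, addresses live in a table.
GOT = {"printf": None}           # placeholder until the dynamic loader runs

def dynamic_loader_patch(symbol: str, runtime_addr: int) -> None:
    """The loader writes the resolved absolute address into the writable GOT."""
    GOT[symbol] = runtime_addr

def call_via_got(symbol: str) -> int:
    """Step 1: a PC-relative reach into our own GOT entry.
    Step 2: an indirect jump through the absolute address found there."""
    addr = GOT[symbol]           # the only mutable, address-specific state
    return addr                  # a real CPU would now jump to this address

dynamic_loader_patch("printf", 0x7f3a1000)   # done once, at load time
assert call_via_got("printf") == 0x7f3a1000  # every later call follows the table
```

The design choice to keep the writable state in one small table is exactly what lets the code pages themselves stay shared and read-only across processes.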

A Bedrock of System Security

This entire architecture of position-independent code, a prerequisite for which is PC-relative addressing, is not just about efficiency and modularity. It is a fundamental pillar of modern computer security. The fact that shared libraries and executables can be loaded at any address enables a crucial defense mechanism: ​​Address Space Layout Randomization (ASLR)​​.

With ASLR, the operating system loads your program, and all the libraries it uses, at a different, random base address every time it runs. This makes it incredibly difficult for an attacker to exploit a bug. Many attacks rely on knowing the address of a specific piece of code (a "gadget") they want to jump to. If that address is a constantly moving target, their attack will almost certainly fail, crashing the program harmlessly instead of compromising it. Without PIC, ASLR would be impossible to implement efficiently and securely, as it would require patching the code itself, breaking memory sharing and violating W^X policies.

We can take this even further. The limited range of PC-relative jumps can itself be a security feature. In a multi-tenant system where different users' code must be isolated, we can place large, unmapped "guard gaps" between them. If the hardware's maximum jump displacement is smaller than the gap, it becomes physically impossible for a single malicious instruction to jump from one tenant's region to another. This can be augmented with a software policy called ​​Control-Flow Integrity (CFI)​​, which acts like a runtime security guard, checking that every jump target is on a pre-approved list. This effectively creates even tighter bounds on where code can go, drastically reducing the attacker's freedom of movement.

The Dynamic World of Operating Systems and JIT Compilers

The power of relativity extends deep into the core of the operating system and the most advanced runtime environments.

When a hardware interrupt or an exception occurs, the processor must stop what it's doing and jump to an OS handler routine. Where are these handlers? They are stored in a vector table. On modern systems, this table is relocatable; the OS can move it by simply updating a special hardware register, the ​​Vector Base Register (VBR)​​. By writing the handlers themselves as position-independent code using PC-relative addressing, the OS can move its entire exception handling infrastructure to a new location without patching a single instruction in the handlers themselves.

This dynamic nature is also critical for ​​Just-In-Time (JIT) compilers​​, which are at the heart of high-performance languages like Java and JavaScript. A JIT compiler generates native machine code on the fly. As the program runs, the JIT might discover better ways to organize this code, moving it around in memory to improve performance. Every time it moves a block of code from base address B to B′, it must act as a mini-linker. For any PC-relative call within that block, it can't just adjust the old displacement; it must re-calculate a brand new displacement from scratch: d′ = T − (B′ + s + ℓ_call), where T is the absolute target address, s is the call's offset within the block, and ℓ_call is the length of the call instruction. This constant re-evaluation based on the code's current context is the very definition of relativity in action.
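The recomputation is a one-liner, which a short sketch can verify against the formula (all addresses here are hypothetical):

```python
def jit_rebased_displacement(T: int, new_base: int, s: int, call_len: int) -> int:
    """Recompute d' = T - (B' + s + l_call) after moving a code block to new_base.
    s is the call instruction's offset within the block; the PC convention is
    'address of the next instruction', matching the formula in the text."""
    return T - (new_base + s + call_len)

T = 0x5000                                        # absolute target (stays put)
d_old = jit_rebased_displacement(T, 0x1000, 0x20, 4)
d_new = jit_rebased_displacement(T, 0x8000, 0x20, 4)
assert 0x1000 + 0x20 + 4 + d_old == T
assert 0x8000 + 0x20 + 4 + d_new == T             # same target, brand-new displacement
```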

Down to the Silicon: A Dance with the TLB

Finally, let's see how this high-level software concept interacts with the low-level reality of the processor's hardware. Your CPU uses a ​​Translation Lookaside Buffer (TLB)​​ to cache recent translations from virtual to physical page addresses. A "unified" TLB holds translations for both code fetches and data loads/stores.

Consider an instruction located at the very end of a virtual page, say, at offset 4092 in a 4096-byte page. Now, imagine this instruction performs a PC-relative data load with a displacement of +64 bytes. The data address will be 4092 + 64 = 4156, which lies in the next virtual page. Because this is a new page, the data load will likely cause a TLB miss. The hardware fetches the translation for this new page and installs it in the TLB.

What happens next? The program counter increments to fetch the next instruction, which is at address 4092 + 4 = 4096—the very beginning of that same new page. When the CPU goes to fetch this instruction, it needs to translate the page address. But wait! The data load just a moment ago caused that exact translation to be loaded into the unified TLB. The result? The instruction fetch is now a blazing-fast TLB hit. This subtle and beautiful interaction shows how PC-relative addressing is intimately woven into the performance fabric of the entire memory system.
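The page arithmetic behind this scenario is simple to check; a minimal sketch assuming 4096-byte pages:

```python
PAGE = 4096

def page_of(addr: int) -> int:
    """Virtual page number containing the given address."""
    return addr // PAGE

instr_addr = 4092              # last word of page 0
data_addr = instr_addr + 64    # the PC-relative load, +64 bytes
next_instr = instr_addr + 4    # sequential fetch after a 4-byte instruction

assert page_of(instr_addr) == 0
assert page_of(data_addr) == 1                     # the load misses into page 1 first...
assert page_of(next_instr) == page_of(data_addr)   # ...so the next fetch hits in the TLB
```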

From enabling the vast ecosystem of shared libraries to forming a bedrock of system security and interacting with the deepest levels of hardware, PC-relative addressing is a testament to the power of a simple, elegant idea. It is the unassuming principle of relativity that makes the complex, dynamic world of modern software possible.