Paraxial Optics

SciencePedia

Key Takeaways

Paraxial optics simplifies optical calculations by assuming light rays make small angles with the optical axis, allowing for linear approximations ( $\sin \theta \approx \theta$ ).
The ray transfer matrix method provides a powerful tool to analyze complex optical systems by representing each component and transformation as a 2x2 matrix.
The principles of paraxial optics are fundamental not only for designing light-based instruments but also extend to other fields like electron microscopy.
Paraxial optics serves as the ideal first-order theory, providing the essential framework for understanding and correcting real-world lens imperfections known as aberrations.

Introduction

The path of light, while governed by precise laws, presents a formidable mathematical challenge for designing complex optical systems. How can we tame the non-linear complexities of refraction to create the sophisticated lenses in our cameras, telescopes, and microscopes? The solution is found in a brilliant approximation known as paraxial optics. By telling a "gentle lie"—assuming all light rays travel close to the optical axis at small angles—we transform a difficult problem into a beautifully linear one. This unlocks a framework of astonishing power, forming the bedrock of modern optical engineering. This article serves as a guide to this essential topic. We will begin by exploring the core Principles and Mechanisms, from the foundational small-angle approximation and Fermat's Principle to the powerful ray transfer matrix method. Subsequently, in Applications and Interdisciplinary Connections, we will see how these idealized rules are instrumental in designing real-world optical instruments, analyzing system tolerances, and forging connections between optics and fields as diverse as biology and electron microscopy.

Principles and Mechanisms

The Gentle Lie: Heart of the Paraxial World

Look at the world around you. Light streams from the sun, bounces off a tree, enters your eye, and forms an image on your retina. The path of that light is governed by beautifully precise laws, but they can be maddeningly complex. When a light ray hits the surface of a lens or the cornea of your eye, it bends according to Snell's Law, a relationship involving the sines of angles. The sine function, as you might know, is a wavy, non-linear thing. Trying to trace a ray through a dozen surfaces using exact trigonometry would be a nightmare of calculations, a thicket of mathematics that obscures the simple beauty of image formation.

So, what do we do? We do what physicists and engineers have always done: we find a brilliant approximation. We tell a small, gentle lie. We decide to only look at a special, simplified world. This is the paraxial world, a realm where all light rays travel very close to the central axis of our lens system and make very small angles with it. The word "paraxial" itself means "near the axis."

In this world, a wonderful simplification happens. For very small angles $\theta$ (measured in radians), the value of $\sin(\theta)$ is almost exactly equal to $\theta$ itself. The wavy sine function becomes a simple, straight line. Our complicated Snell's Law, $n_1 \sin(\theta_1) = n_2 \sin(\theta_2)$ , magically transforms into the beautifully simple linear relationship $n_1 \theta_1 \approx n_2 \theta_2$ . This is it. This is the single, foundational assumption of paraxial optics.

Of course, it is an approximation. A more accurate picture of reality would include more terms from the Taylor series expansion of the sine function. For instance, a better approximation for the refraction angle is not just linear, but includes a term proportional to $\theta_1^3$ . These higher-order terms are the mathematical origin of aberrations—the pesky imperfections like blurriness and distortion that lens designers constantly fight. But by agreeing to ignore them for a moment, we unlock a framework of astonishing power and elegance. We have traded perfect accuracy for profound insight.
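To get a feel for how gentle the lie really is, the exact and paraxial forms of Snell's Law can be compared numerically. The following is a minimal sketch; the function name `refraction_angles` and the air-to-glass indices (1.0 and 1.5) are illustrative assumptions, not values from the text.

```python
import math

def refraction_angles(theta1_deg, n1=1.0, n2=1.5):
    """Return (exact, paraxial) refraction angles in degrees.

    n1 and n2 are illustrative indices (air into a generic glass).
    """
    theta1 = math.radians(theta1_deg)
    exact = math.asin(n1 * math.sin(theta1) / n2)  # full Snell's law
    paraxial = n1 * theta1 / n2                    # linearized: n1*theta1 = n2*theta2
    return math.degrees(exact), math.degrees(paraxial)

for angle in (1, 5, 10, 30):
    exact, paraxial = refraction_angles(angle)
    rel_err = abs(paraxial - exact) / exact
    print(f"{angle:>2} deg: exact={exact:.4f}  paraxial={paraxial:.4f}  error={rel_err:.3%}")
```

At one degree the two answers agree to a few parts in a hundred thousand; at thirty degrees the paraxial value is off by a few percent, which is the regime where the $\theta^3$ term can no longer be ignored.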

The Equation of Seeing: A Deeper Principle

With our linear rule for bending light, we can now ask how a lens forms an image. We could derive the equations using tedious geometry, but there is a much more beautiful and profound way, one that hints at the deep unity of physics: Fermat's Principle of Least Time.

This principle states that light, in traveling between two points, takes a path of least time (more precisely, a path whose travel time is stationary with respect to small variations). For image formation, where many different rays leave one object point and all converge on the corresponding image point, this means that every one of those rays must take the exact same amount of time to make the journey.

Think about it. Consider a point object $S$ being imaged to a point image $I$ by a single, curved glass surface. One ray travels straight along the optical axis from $S$ to $I$ . Another ray leaves $S$ , travels at an angle to hit the curved edge of the glass, bends, and then continues to $I$ . The second ray travels a longer geometric distance. How can it possibly arrive at the same time? Because it travels for less time inside the slower medium (the glass) and more time in the faster medium (the air). For a perfectly focused image to form, the extra geometric path length must be perfectly compensated by the change in speed.

By writing down this condition—that the optical path length (the geometric distance multiplied by the refractive index) is the same for all paths—and applying our small-angle approximation, a single, universal equation emerges from the mathematics as if by magic. For a spherical surface of radius $R$ separating two media with refractive indices $n_1$ and $n_2$ , the relationship between the object distance $s_o$ and the image distance $s_i$ is:

$$\frac{n_1}{s_o} + \frac{n_2}{s_i} = \frac{n_2 - n_1}{R}$$

This isn't just an equation; it is the equation of imaging for a single surface, and every paraxial property of that surface follows from it. And it’s wonderfully consistent. What happens if the surface is not curved but perfectly flat, like the surface of a swimming pool? A flat surface is just a sphere with an infinite radius, $R \to \infty$ . In that limit, the right side of our equation becomes zero. The equation simplifies to $\frac{n_1}{s_o} + \frac{n_2}{s_i} = 0$ , so $s_i = -(n_2/n_1)\,s_o$ . The negative sign marks a virtual image on the same side as the object, and its magnitude is the famous formula for apparent depth, $|s_i| = (n_2/n_1)\,s_o$ . The profound general law contains the familiar specific case.
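The single-surface equation is easy to put to work. The sketch below solves it for the image distance, assuming the common sign convention in which $s_o$ is positive for a real object in front of the surface, $s_i$ is positive for a real image behind it, and $R$ is positive when the center of curvature lies on the image side. The function name `image_distance` and the numerical values are illustrative assumptions.

```python
import math

def image_distance(s_o, n1, n2, R=math.inf):
    """Solve n1/s_o + n2/s_i = (n2 - n1)/R for s_i.

    R = math.inf represents a flat surface.
    """
    power = 0.0 if math.isinf(R) else (n2 - n1) / R
    remainder = power - n1 / s_o   # this quantity equals n2 / s_i
    if remainder == 0:
        return math.inf            # object at the front focal point: image at infinity
    return n2 / remainder

# Curved surface: air to glass, R = +10 cm, object 50 cm in front of the surface.
print(image_distance(50.0, 1.0, 1.5, R=10.0))

# Flat surface: an object 2 m under water (n1 = 1.33) viewed from air (n2 = 1.0).
print(image_distance(2.0, 1.33, 1.0))
```

The second call returns a negative distance of magnitude $2/1.33 \approx 1.5$ m: a virtual image, shallower than the real object, just as the apparent-depth formula says.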

The Algebra of Lenses: The Matrix Method

The imaging equation is powerful, but using it to analyze a modern camera lens with ten or more surfaces would still be a chore. Each surface creates an image, which then becomes the object for the next surface, and so on. We need a more industrial-strength tool.

This is where the linearity of the paraxial world truly shines. Any process that is linear can be described by matrix algebra. We can represent the state of a light ray at any point by a simple two-component vector: its height $y$ above the optical axis and the product of its angle and refractive index, $n\theta$ .

$$\text{Ray State} = \begin{pmatrix} y \\ n\theta \end{pmatrix}$$

What happens to this ray as it travels through an optical system? It undergoes a series of simple transformations. When it travels a distance $d$ , its angle doesn't change, but its height does. When it crosses a curved surface, its height doesn't change, but its angle does. In the paraxial world, both of these transformations—translation and refraction—are linear. This means each can be represented by a simple $2 \times 2$ matrix.

The journey of a ray through an entire complex lens system is now reduced to a simple, orderly process: just multiply the ray's initial state vector by the matrix for each element, one after the other.

$$\begin{pmatrix} y_{out} \\ n_{out}\theta_{out} \end{pmatrix} = M_{total} \begin{pmatrix} y_{in} \\ n_{in}\theta_{in} \end{pmatrix} \quad \text{where} \quad M_{total} = M_{last} \cdots M_{2} M_{1}$$

This ray transfer matrix method is unbelievably powerful. An entire, complex lens system, no matter how many elements it has, can be boiled down to a single $2 \times 2$ matrix. The properties of the whole system—its focal length, its magnifying power, the location of its principal planes—are encoded right there in the four numbers of that final matrix. For example, by calculating the matrix for a solid glass ball, we can find the location of its principal planes, abstract concepts that tell us how the lens behaves as a whole. The calculation reveals, perhaps surprisingly, that they are located right at the geometric center of the sphere. The matrix machinery effortlessly reveals a hidden, elegant symmetry of the system.
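The glass-ball result can be checked in a few lines. This is a minimal sketch, assuming the $(y, n\theta)$ ray vector used above, a translation matrix with upper-right element $d/n$, a refraction matrix with lower-left element $-(n_2 - n_1)/R$, and $R$ positive when the center of curvature lies to the right of the surface. All function names and the sample values (diameter 10, index 1.5) are illustrative assumptions.

```python
def matmul(m2, m1):
    """2x2 product m2 * m1, where m1 acts on the ray first."""
    (a, b), (c, d) = m2
    (e, f), (g, h) = m1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def translation(d, n=1.0):
    """Propagation over distance d in a medium of index n."""
    return ((1.0, d / n), (0.0, 1.0))

def refraction(n1, n2, R):
    """Refraction at a spherical surface of radius R, going from n1 into n2."""
    return ((1.0, 0.0), (-(n2 - n1) / R, 1.0))

def system_matrix(*elements):
    """Compose elements listed in the order the ray meets them."""
    total = ((1.0, 0.0), (0.0, 1.0))
    for m in elements:
        total = matmul(m, total)
    return total

def ball_lens(diameter, n):
    r = diameter / 2
    return system_matrix(
        refraction(1.0, n, r),     # front surface, center of curvature to the right
        translation(diameter, n),  # through the glass
        refraction(n, 1.0, -r),    # back surface, center of curvature to the left
    )

(A, B), (C, D) = ball_lens(diameter=10.0, n=1.5)
efl = -1.0 / C        # effective focal length
h1 = (D - 1.0) / C    # first principal plane, measured from the front vertex
h2 = (1.0 - A) / C    # second principal plane, measured from the back vertex
print(efl, h1, h2)
```

With these numbers the focal length comes out as $nD/(4(n-1)) = 7.5$, and the two principal planes sit half a diameter to the right of the front vertex and half a diameter to the left of the back vertex. Both land on the center of the ball.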

Hidden Symmetries and Elegant Views

When we find a simple and powerful mathematical structure, it often points to deeper underlying principles and symmetries. In classical mechanics, the laws of motion lead to conservation of energy and momentum. Does our paraxial world have similar conserved quantities?

Indeed, it does. Consider any two distinct rays traveling through an optical system—for instance, a "marginal ray" from the center of an object and a "chief ray" from its edge. Let their heights be $h_m$ and $h_c$ , and their angles be $u_m$ and $u_c$ . The quantity $L = n(h_c u_m - h_m u_c)$ is an invariant. It is called the Lagrange-Helmholtz Invariant. Its value remains absolutely constant as the two rays propagate through any number of lenses, spaces, and mirrors. This is a profound statement about the structure of light propagation. Like a conservation law, it constrains what an optical system can and cannot do, connecting the properties of the light entering the system to the light leaving it. For a microscope, this invariant can be calculated at the object plane and is simply $n H u_m$ , where $n$ is the object-space refractive index, $H$ is the object height, and $u_m$ is the initial angle of the marginal ray. We instantly know this quantity's value everywhere else in the system, without tracing a single ray further.
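The invariance is easy to watch in action. The sketch below traces a marginal ray and a chief ray through a made-up train of two thin lenses in air and evaluates $n(h_c u_m - h_m u_c)$ after every step. The focal lengths, spacings, and starting rays are arbitrary illustrative values, and rays are stored as (height, $n\theta$) pairs.

```python
def propagate(ray, d, n=1.0):
    """Free propagation over distance d in a medium of index n."""
    y, n_theta = ray
    return (y + d * n_theta / n, n_theta)

def thin_lens(ray, f):
    """Thin lens in air: height unchanged, angle changes by -y/f."""
    y, n_theta = ray
    return (y, n_theta - y / f)

def lagrange_invariant(chief, marginal):
    """n*(h_c*u_m - h_m*u_c) for rays stored as (height, n*angle)."""
    h_c, nu_c = chief
    h_m, nu_m = marginal
    return h_c * nu_m - h_m * nu_c

# An arbitrary optical train (hypothetical values).
steps = [
    lambda r: propagate(r, 50.0),
    lambda r: thin_lens(r, 30.0),
    lambda r: propagate(r, 40.0),
    lambda r: thin_lens(r, -60.0),
    lambda r: propagate(r, 25.0),
]

marginal = (0.0, 0.10)   # starts on axis at the object plane
chief = (2.0, -0.02)     # starts at the edge of an object of height H = 2
history = [lagrange_invariant(chief, marginal)]
for step in steps:
    marginal, chief = step(marginal), step(chief)
    history.append(lagrange_invariant(chief, marginal))
print(history)
```

Every entry in `history` equals the object-plane value $n H u_m = 2 \times 0.10 = 0.2$ up to rounding, even though the individual heights and angles change at each step.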

Symmetries also allow us to change our point of view to make things simpler. The standard imaging equation measures distances from the surfaces of the lens. But the true "heart" of a lens is its focal points. What if we measure distances from there instead? Let $x_o$ be the object's distance from the first focal point, and $x_i$ be the image's distance from the second focal point. If we substitute this new coordinate system into the standard imaging equation, the algebra magically simplifies, and we are left with the stunningly beautiful and symmetric Newtonian imaging equation:

$$x_o x_i = f_1 f_2$$

This equation reveals a deep, reciprocal relationship between the object and image spaces, a harmony that was hidden within the more cumbersome standard form.
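For the simplest case the substitution takes only a few lines. As a sketch, assume a thin lens in air, so that both focal lengths equal $f$ and the standard imaging equation reads $1/s_o + 1/s_i = 1/f$ . Writing $s_o = x_o + f$ and $s_i = x_i + f$ gives

$$\frac{1}{x_o + f} + \frac{1}{x_i + f} = \frac{1}{f} \;\;\Longrightarrow\;\; f\,(x_o + x_i + 2f) = (x_o + f)(x_i + f) \;\;\Longrightarrow\;\; x_o x_i = f^2$$

which is the Newtonian form with $f_1 = f_2 = f$ .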

When the Lie Breaks Down: A Glimpse of Aberrations

We must now confess and come to terms with our "gentle lie." The paraxial model, for all its power and beauty, is an idealization. Real rays are not always infinitesimally close to the axis, and $\sin(\theta)$ is not exactly $\theta$ . The price we pay for using spherical lenses in the real world is that our perfect, point-like images become slightly blurred and distorted. These imperfections are called aberrations.

The paraxial theory is the first-order theory of optics. The first deviations from this perfect world are the third-order aberrations, also known as the Seidel aberrations. They arise from the next term in the power series expansion of the sine function, the one proportional to $\theta^3$ . There are five of them: spherical aberration, coma, astigmatism, field curvature, and distortion. Lens designers work tirelessly to balance and cancel these effects by combining multiple lens elements. If they succeed in eliminating all of them, the next set of imperfections to worry about are the even smaller fifth-order aberrations, and so on in an endless quest for perfection.

Chromatic aberration is a different and particularly intuitive imperfection, one that appears even for paraxial rays because it comes from the glass itself rather than from the $\theta^3$ term. The refractive index of glass is slightly different for different colors of light. This means a simple lens will have a slightly different focal length for red light than for blue light. If you try to focus light from a white-light source, the colors will separate, creating ugly color fringes around the image. The paraxial framework, however, gives us the tools to analyze this. We can relate the longitudinal spread of the different colored focal points ( $\delta_L$ ) to the transverse size of the color blur at any given focal plane ( $\delta_T$ ). The connection between these two views of the same problem is an elegant geometric relationship derived directly from our paraxial ray tracing rules.

So, is paraxial optics wrong? Not at all. It is the perfect starting point. It is the elegant blueprint of an ideal optical world. It gives us the fundamental principles, the powerful design tools like the matrix method, and the language to understand imaging. The more complex and fascinating study of aberrations is the study of the departure from that ideal blueprint. Paraxial optics is the foundation upon which all of practical lens design is built.

Applications and Interdisciplinary Connections

We have spent our time learning the rules of the game—the principles and matrices of paraxial optics. This is the essential grammar of light, the set of laws governing how rays bend and images form in a world of small angles. But learning grammar is only useful if you intend to write poetry or tell stories. Now, the real fun begins. We are going to see what stories paraxial optics can tell. You will find that this seemingly simple approximation is not a crutch, but a master key that unlocks the design of nearly every optical instrument we use to explore our world, revealing a beautiful unity across disparate fields of science and engineering.

The Foundations of Seeing: Designing Our Windows to the World

Let's start with one of humanity's oldest scientific quests: looking at the stars. To see faint, distant objects, you need a telescope. You might think of a telescope as a long, unwieldy tube, and for a simple refractor, you'd be right. But what if you need a very long focal length—for high magnification—in a compact, manageable package? This is where the cleverness of optical design comes in. The Cassegrain telescope uses a large concave primary mirror and a smaller convex secondary mirror. Light comes in, bounces off the primary, heads towards a focus, but is intercepted by the secondary, which then reflects it back through a hole in the primary. Using the matrix methods we've learned, we can treat this whole assembly as a single entity and calculate its effective focal length. We find that this two-mirror combination can achieve a focal length far greater than the physical length of the telescope tube, a beautiful piece of optical engineering that folds a long light path into a small space.
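A rough numerical sketch shows the folding at work. It treats the two mirrors as an unfolded pair of thin elements, with the concave primary given a positive focal length and the convex secondary a negative one, and uses the standard two-element combination formula. The function name and the sample focal lengths and spacing (in millimeters) are hypothetical.

```python
def two_element_system(f1, f2, d):
    """Effective focal length and back focal distance of two thin elements.

    f1, f2: focal lengths (concave mirror > 0, convex mirror < 0 when unfolded)
    d:      separation between the elements
    The back focal distance is measured from the second element.
    """
    power = 1.0 / f1 + 1.0 / f2 - d / (f1 * f2)
    efl = 1.0 / power
    bfd = efl * (f1 - d) / f1
    return efl, bfd

# Hypothetical Cassegrain: primary f1 = 1000, secondary f2 = -300, spacing 750.
efl, bfd = two_element_system(1000.0, -300.0, 750.0)
print(efl, bfd)
```

For these made-up values the effective focal length is 6000 mm, while the focus lies 1500 mm behind the secondary, that is, 750 mm behind the primary. A six-meter focal length fits in an instrument well under two meters long.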

But building an instrument is more than just getting the focal length right. When you look through a telescope or a camera, the brightness of the image and the field of view you can see are not accidental. They are determined by the physical sizes of the lenses and the placement of diaphragms, or "stops," within the system. The most important of these is the aperture stop: the opening that acts as the main gatekeeper, limiting the bundle of rays from an object point on the axis. But what an observer looking into the system actually sees is not the stop itself but its images, the pupils. The entrance pupil is the image of the aperture stop as seen from the front (the object side), and the exit pupil is its image as seen from the back (the image side). The size and location of the exit pupil are critically important; for instance, in a telescope, it must be small enough to fit within the pupil of your own eye. Paraxial optics gives us the simple tools of image formation to find these pupils and design instruments where the light is efficiently guided from the world to the detector, be it a CCD chip or your retina.

Often, a complex instrument like a microscope or a long camera lens isn't just one or two lenses. It might involve a relay system, whose job is to take an image formed by one part of the system and faithfully transfer it to another. A key challenge here is to ensure the pupils are properly matched, so the cone of light accepted by the next stage is the same as the one delivered by the previous one. By carefully choosing the spacing between the relay lenses, we can control the magnification of the pupil itself, ensuring that no precious light is lost along the way. A common configuration is the "4f" system, where two lenses of focal length $f$ are separated by $2f$ , which not only relays the image but also performs some remarkable tricks in signal processing, a topic for another day. The simple principle of setting the pupil magnification to one by adjusting lens spacing is a fundamental building block in modular optical design.
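The relay can be written out as a product of five matrices: a drift of one focal length, the first lens, a drift equal to the sum of the focal lengths, the second lens, and a final drift of one focal length. The sketch below assumes thin lenses in air and allows the two focal lengths to differ; the function names and sample values are illustrative.

```python
def matmul(m2, m1):
    """2x2 product m2 * m1, where m1 acts on the ray first."""
    (a, b), (c, d) = m2
    (e, f), (g, h) = m1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def drift(d):
    return ((1.0, d), (0.0, 1.0))

def lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))

def relay(f1, f2):
    """Front focal plane of lens 1 to back focal plane of lens 2."""
    total = ((1.0, 0.0), (0.0, 1.0))
    for element in (drift(f1), lens(f1), drift(f1 + f2), lens(f2), drift(f2)):
        total = matmul(element, total)
    return total

print(relay(100.0, 100.0))  # the classic 4f case
print(relay(100.0, 50.0))   # unequal focal lengths
```

In the equal-focal-length case the result is the negative identity matrix: a unit-magnification, inverted image, with the upper-right element zero (the planes are conjugate) and the lower-left element zero (parallel rays stay parallel). With unequal focal lengths the magnification becomes $-f_2/f_1$ .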

Of course, no real-world image is perfect. One common artifact you’ve surely seen in photographs is vignetting, where the corners and edges of an image appear darker than the center. This isn't necessarily a "flaw" but an inherent consequence of geometry. For an off-axis point, the cone of light entering the instrument might be partially blocked, or "clipped," by the edges of one of the lenses further down the line. Using paraxial ray tracing, we can precisely predict how the bundle of rays from an off-axis source gets trimmed by the apertures in the system, and we can quantify exactly how much of the image will be darkened. This understanding allows designers to make a conscious trade-off between a wide field of view and the uniformity of image brightness.

A more fundamental imperfection is chromatic aberration. Because the refractive index of glass is slightly different for different colors of light, a simple lens will focus blue light at a slightly different point than red light. This "rainbow curse" plagues simple optical systems. Designers have long sought to correct it by combining multiple lenses made of different types of glass. But paraxial theory reveals a deep and subtle limitation. For a system made of two separated thin lenses of the same glass, you can design it to eliminate one type of chromatic aberration—say, the lateral chromatic aberration ( $C_{II}$ ), which causes color fringing at the edges of the image. However, when you do this, you discover that the longitudinal chromatic aberration ( $C_I$ ), which blurs all points in the image, stubbornly remains. This illustrates a profound principle in optical design: you can't always have everything. The art of lens design is the art of managing these inescapable trade-offs.
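This trade-off can be seen numerically. The sketch below assumes two thin lenses of the same glass, each with power $(n-1)K_i$ where $K_i$ is a fixed geometric factor, separated by half the sum of their focal lengths. That spacing is the classic condition under which the combined focal length is stationary with respect to the index, to first order. The index values for the "blue" and "red" ends are invented for illustration.

```python
def doublet(n, K1, K2, d):
    """EFL and back focal length of two thin lenses of index n, separated by d in air."""
    p1, p2 = (n - 1.0) * K1, (n - 1.0) * K2
    power = p1 + p2 - d * p1 * p2
    efl = 1.0 / power
    bfl = (1.0 - d * p1) / power   # distance from the second lens to the focus
    return efl, bfl

n_design, n_blue, n_red = 1.500, 1.506, 1.496   # illustrative indices only
f1, f2 = 100.0, 50.0                            # focal lengths at the design index
K1 = 1.0 / ((n_design - 1.0) * f1)
K2 = 1.0 / ((n_design - 1.0) * f2)
d = (f1 + f2) / 2.0                             # spacing that makes the EFL stationary

efl_blue, bfl_blue = doublet(n_blue, K1, K2, d)
efl_red, bfl_red = doublet(n_red, K1, K2, d)
print("EFL spread:", abs(efl_blue - efl_red))
print("BFL spread:", abs(bfl_blue - bfl_red))
```

With these numbers the focal length, which sets the image size, differs between the two colors by only a few thousandths of a unit, while the position of the focus differs by about one unit. The lateral color is tamed and the longitudinal color is not.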

The Unity of Physics: Optics Beyond Light

One of the most profound ideas in physics is the unity of its principles. The elegant framework we've developed for light rays turns out to be far more general. Consider the Scanning Electron Microscope (SEM), an instrument that lets us see the world at the nanometer scale. Instead of photons, it uses a beam of electrons. These electrons are guided not by glass lenses, but by magnetic fields generated by coils of wire. A "thin magnetic lens" can focus an electron beam, and "deflection coils" can steer it.

Amazingly, in the paraxial approximation, the trajectories of these electrons obey a set of rules identical to those of light rays. We can apply our ray-tracing logic to design complex electron-optical systems. For example, a common technique in materials science called Electron Backscatter Diffraction (EBSD) requires the electron beam to pivot, or "rock," about a single point on the sample's surface. This is achieved using a double-deflection system with two coils. By precisely controlling the ratio of the currents in the two coils, we can make the electron beam tilt back and forth as if it were hinged to a fixed point in space. The ability to derive this current ratio using the familiar formulas of paraxial optics is a stunning demonstration that the same geometric logic applies, whether you are steering photons with glass or electrons with magnets.
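A sketch of that calculation, under simplifying assumptions that are not spelled out in the text: both coils are treated as thin deflectors, the upper coil tilts the beam by $\theta_1$ , the lower coil sits a distance $d$ downstream and adds a tilt $\theta_2$ , and the pivot point lies a further distance $L$ below the lower coil. Requiring the beam to cross the axis at the pivot gives

$$\theta_1 d + (\theta_1 + \theta_2)\,L = 0 \;\;\Longrightarrow\;\; \frac{\theta_2}{\theta_1} = -\left(1 + \frac{d}{L}\right)$$

If the two coils are identical, so that each deflection angle is proportional to its coil current with the same constant, the current ratio $I_2/I_1$ takes the same value. The beam then arrives at the pivot with a net tilt $\theta_1 + \theta_2 = -\theta_1 d / L$ .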

The Interdisciplinary Bridge: Optics in Science and Engineering

The power of paraxial optics truly shines when it becomes a tool for other disciplines, providing a quantitative bridge between physical principles and phenomena in biology, engineering, and beyond.

Take the eye itself. The camera-type eye, with a single lens focusing an image onto a light-sensitive retina, is a marvel of biological engineering. So effective is this design that it evolved independently in wildly different lineages, such as vertebrates (like us) and cephalopod mollusks (like the octopus)—a classic case of convergent evolution. We can model the crystalline lens of an eye as a thick lens, a more accurate model than our simple thin lens approximation. Using the paraxial formulas for a thick lens, we can calculate its focal length based on its curvature, thickness, and refractive index. This allows us to quantitatively compare the optical performance of eyes across different species, turning comparative anatomy into a problem in optical physics and providing insight into how evolution optimizes physical structures for their function.
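As a sketch of such a calculation, the function below evaluates the paraxial power of a thick lens immersed in a uniform medium, using the two surface powers and the reduced thickness $t/n_{lens}$ . The function name is an assumption, and the sample radii, thickness, and indices are hypothetical round numbers of roughly the right size for a crystalline lens, not measured data.

```python
def thick_lens_focal_length(n_lens, n_medium, R1, R2, t):
    """Effective focal length of a thick lens with the same medium on both sides.

    R1, R2: surface radii, positive when the center of curvature is to the right
    t:      axial thickness
    """
    P1 = (n_lens - n_medium) / R1          # power of the front surface
    P2 = (n_medium - n_lens) / R2          # power of the back surface
    power = P1 + P2 - (t / n_lens) * P1 * P2
    return n_medium / power

# Hypothetical eye-lens-like values, lengths in millimeters.
f = thick_lens_focal_length(n_lens=1.42, n_medium=1.336, R1=10.0, R2=-6.0, t=4.0)
print(f, "mm")
```

Changing the radii, thickness, or indices to those of another species' lens gives a directly comparable focal length, which is the kind of quantitative comparison described above.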

In the world of precision engineering, "perfect" exists only on paper. Real systems must be built, and they are subject to manufacturing tolerances and environmental disturbances. Paraxial optics is the essential tool for what engineers call tolerance analysis. What happens if a lens is not perfectly centered, but is displaced by a tiny amount $\Delta y$ ? Our ray-tracing equations can be easily modified to account for this decentering, allowing us to predict exactly how the final position and angle of a ray will be affected. This tells an engineer how precisely a system must be assembled to meet its performance specifications.
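One simple way to model the decentering, sketched here for a thin lens in air, is to measure the ray height relative to the displaced lens center when computing the bending. The function names and numbers are illustrative assumptions.

```python
def decentered_thin_lens(ray, f, dy=0.0):
    """Thin lens of focal length f whose center is displaced by dy from the axis.

    The ray is (height, angle); the lens bends it according to its height
    relative to the lens center, y - dy.
    """
    y, theta = ray
    return (y, theta - (y - dy) / f)

def propagate(ray, d):
    y, theta = ray
    return (y + d * theta, theta)

# An axial ray through a lens (f = 100) that is decentered by 0.1.
ray = decentered_thin_lens((0.0, 0.0), f=100.0, dy=0.1)
print(ray)                    # the ray picks up a small tilt of dy / f
print(propagate(ray, 100.0))  # position in the focal plane
```

The axial ray leaves with an angle $\Delta y / f$ and lands in the focal plane at height $\Delta y$ : the whole image shifts sideways by exactly the amount the lens was displaced.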

Similarly, high-performance optical systems must often operate in harsh environments. A primary mirror on a satellite telescope in orbit is subjected to extreme temperature swings as it moves in and out of Earth's shadow. A change in temperature causes the mirror material to expand or contract. This thermal expansion, governed by the material's Coefficient of Thermal Expansion ( $\alpha$ ), changes the mirror's radius of curvature. And since the focal length of a mirror is simply half its radius of curvature, the focal point shifts. A simple combination of thermal physics and paraxial optics allows engineers to calculate this focal shift precisely and either choose materials with near-zero thermal expansion or design active correction systems to keep the telescope in perfect focus.
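The arithmetic is short enough to sketch directly. If the radius scales as $R(1 + \alpha\,\Delta T)$ and $f = R/2$ , then the focal shift is $\Delta f = f\,\alpha\,\Delta T$ . This assumes uniform, isotropic expansion of the mirror; the function name and the numbers below are hypothetical.

```python
def mirror_focal_shift(f, alpha, delta_T):
    """Focal shift of a mirror whose radius of curvature expands uniformly.

    f:       nominal focal length (R / 2)
    alpha:   coefficient of thermal expansion, per kelvin
    delta_T: temperature change, in kelvin
    """
    R_new = 2.0 * f * (1.0 + alpha * delta_T)
    return R_new / 2.0 - f

# Hypothetical mirror: f = 2 m, alpha = 2.3e-5 per kelvin, 50 K temperature swing.
print(mirror_focal_shift(2.0, 2.3e-5, 50.0), "m")
```

For these made-up values the focus moves by about 2.3 mm, a very large shift by the standards of a precision telescope, which is why low-expansion materials or active correction are attractive.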

Perhaps the most subtle and powerful application is in the science of measurement, or metrology. It turns out that how you look at something can change what you see. A standard (entocentric) imaging system has perspective: objects farther away look smaller. But what if you could build a system with no perspective? An object-space telecentric system is designed, by placing the aperture stop at the back focal plane of the lens, such that the chief rays in object space are all parallel to the optical axis. This has the remarkable effect of making the magnification independent of the object's distance from the lens.

Consider using such a system to observe a physical phenomenon, like the birefringence induced in a lens by mechanical stress. The amount of birefringence depends on the angle at which the light ray travels through the material. An entocentric system collects rays from a range of angles depending on the object point's position. A telecentric system, however, ensures that the chief ray from every object point is parallel. This fundamental difference in how the object is "probed" leads to a dramatically different observed pattern of stress-induced retardation. This is a profound lesson: the design of our instruments is not passive. It is an active part of the measurement, and a deep understanding of paraxial optics is essential to ensure that we are measuring what we think we are measuring.

From the grand design of telescopes to the subtle art of measurement, the principles of paraxial optics are the common thread. They are simple, they are powerful, and they are everywhere. They are the quiet, indispensable language that allows us to build our windows to the universe and to truly understand what we see through them.