
Lower and Upper Sums

SciencePedia
Key Takeaways
  • Lower and upper sums create under- and over-approximations of an area by building rectangles that fit entirely below or extend entirely above a curve.
  • A function is defined as integrable if the gap between the lower and upper sums can be squeezed to zero by continually refining the interval partition.
  • This framework proves that all continuous and monotonic functions are integrable, guaranteeing that the "squeeze" process will succeed for these large classes of functions.
  • The theory of sums validates essential properties of integrals and underpins numerical methods by providing a rigorous way to handle complex or oscillating functions.

Introduction

The concept of finding the area under a curve is a cornerstone of calculus, but how do we precisely define and calculate this area for complex, arbitrary functions? While simple geometric formulas suffice for rectangles and triangles, a more robust and universal method is needed to handle the vast world of curves encountered in science and mathematics. This article addresses this fundamental gap by introducing the elegant theory of lower and upper sums. It provides a formal definition of the integral not through a formula, but through a process of approximation and squeezing. In the first chapter, "Principles and Mechanisms," you will learn how to construct these sums using partitions and see how they trap the true area, leading to a rigorous criterion for integrability. Subsequently, the "Applications and Interdisciplinary Connections" chapter will explore the profound implications of this theory, showing how it tames complex functions, validates algebraic rules of calculus, and forms the theoretical bedrock for modern computational methods.

Principles and Mechanisms

So, we have this marvelous idea of finding the area under a curve. But how do we actually do it? If the shape is a simple rectangle or triangle, you've known how since you were a child. But what about the graceful curve of a parabola, or the arc of a circle? There’s no simple formula for the area of such a shape. What we need is a universal strategy, a reliable machine that can take (almost) any well-behaved function and spit out the area underneath it.

The strategy, it turns out, is one of beautiful simplicity. It’s a game of "trap and squeeze." We will sneak up on the true area from two sides. We'll construct one approximation that we know is too small and another one that we know is too big. The true area, if it exists, must be trapped between them. Then, we’ll make the trap tighter and tighter until there's only one possible value left for the area to be.

Trapping an Elusive Area

Imagine the curve traced by the function $f(x) = x^2$ from $x = 0$ to $x = 2$, or perhaps the elegant quarter-circle given by $f(x) = \sqrt{1-x^2}$ on the interval $[0, 1]$. To begin our trapping game, we first chop the interval on the $x$-axis into smaller pieces. This set of chopping points is called a partition, which we can denote by $P = \{x_0, x_1, \ldots, x_n\}$. Each little piece, like $[x_{i-1}, x_i]$, is a subinterval.

Now, for each subinterval, let's look at the part of the curve that lives above it. What's the lowest value the function reaches in this little segment? We'll call this value the infimum, or $m_i$. And what's the highest value? We'll call that the supremum, or $M_i$. If the function is continuous, these are simply the minimum and maximum values on that segment.

With these values, we can build two types of rectangles on each subinterval. One has height $m_i$: the "short" rectangle, which fits snugly under the curve. The other has height $M_i$: the "tall" rectangle, which pokes out just enough to contain the curve completely.

Lower and Upper Sums: Building the Floor and Ceiling

If we add up the areas of all the short rectangles, we get what's called the lower Darboux sum, denoted $L(f, P)$. This sum is the area of a blocky shape that is entirely contained within the area we're trying to find. It's our guaranteed under-approximation:

$$L(f, P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1})$$

Similarly, adding up the areas of all the tall rectangles gives us the upper Darboux sum, $U(f, P)$. This is our guaranteed over-approximation; the true area is completely contained within it:

$$U(f, P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1})$$

So, for any partition $P$, we have successfully trapped the true area $A$:

$$L(f, P) \leq A \leq U(f, P)$$

Let's check this machine with the simplest possible case: a constant function, $f(x) = k$, over an interval $[a, b]$. On any subinterval, the function never changes. The lowest it gets is $k$, and the highest it gets is also $k$. So $m_i = M_i = k$ for all $i$. When we calculate the sums, we find:

$$L(f, P) = \sum_{i=1}^{n} k (x_i - x_{i-1}) = k \sum_{i=1}^{n} (x_i - x_{i-1}) = k(b-a)$$

$$U(f, P) = \sum_{i=1}^{n} k (x_i - x_{i-1}) = k \sum_{i=1}^{n} (x_i - x_{i-1}) = k(b-a)$$

The lower and upper sums are identical, and they give us the obvious answer: the area of a rectangle of width $(b-a)$ and height $k$. Our machine works perfectly in this trivial case.

For a more interesting function like $f(x) = x^2$ on $[0, 2]$, with a simple partition $P = \{0, 1, 2\}$, the function is increasing. On the first subinterval, $[0, 1]$, the lowest value is $f(0) = 0$ and the highest is $f(1) = 1$. On the second, $[1, 2]$, the lowest is $f(1) = 1$ and the highest is $f(2) = 4$. The sums come out to be $L(f, P) = 1$ and $U(f, P) = 5$. The true area is trapped somewhere between 1 and 5. This is a very wide trap, but it's a start! For a decreasing function like $f(x) = 1/x$, the logic is the same, but the infimum and supremum on each subinterval are found at the right and left endpoints, respectively.
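To make this concrete, here is a small sketch (Python, with illustrative names) that computes both sums for an increasing function, where the infimum and supremum on each subinterval sit at its left and right endpoints:

```python
def darboux_sums_increasing(f, partition):
    """Lower and upper Darboux sums for an increasing function f.

    For increasing f, the infimum on each subinterval [l, r] is f(l)
    and the supremum is f(r), so no searching is needed.
    """
    pieces = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pieces)
    upper = sum(f(r) * (r - l) for l, r in pieces)
    return lower, upper

# f(x) = x^2 on [0, 2] with partition {0, 1, 2}
print(darboux_sums_increasing(lambda x: x**2, [0, 1, 2]))  # (1, 5)
```

The endpoint shortcut is exactly the monotonicity argument from the text; for a general function one would need the true infimum and supremum on each piece.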

The Squeeze: How to Sharpen Your Guess

A trap between 1 and 5 isn't very useful. How do we make it better? We make our partition finer! Imagine we take an existing partition $P$ and add just one more point to create a new, refined partition $P'$. Say we add a point $c$ inside a subinterval $[x_{i-1}, x_i]$. This one subinterval becomes two: $[x_{i-1}, c]$ and $[c, x_i]$.

What happens to the sums? The suprema (the high points) in these two new, smaller intervals can only be less than or equal to the supremum of the original, larger interval. They can't possibly be higher! So, the upper sum either stays the same or, more likely, gets smaller. Symmetrically, the infima (the low points) in the new intervals can only be greater than or equal to the original infimum. So, the lower sum either stays the same or gets larger.

This is the key insight! Every time we refine the partition, the floor of our trap ($L$) rises and the ceiling of our trap ($U$) lowers. We have this beautiful chain of inequalities:

$$L(f, P) \leq L(f, P') \leq U(f, P') \leq U(f, P)$$

The trap can only get tighter. The process of integration, then, is the process of squeezing this trap. We imagine making the partition finer and finer, adding more and more points. If the lower sums and the upper sums both converge to the same single number, then we've done it: we've squeezed the trap down to a single point, and that point must be the true area. A function for which this squeezing process works is called integrable.
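One way to watch the squeeze happen is to refine a partition repeatedly and print the trap at each stage. A minimal sketch (Python, illustrative names), using the endpoint rule that is valid for increasing functions:

```python
def darboux_sums_increasing(f, partition):
    # For increasing f, infimum = left endpoint value, supremum = right.
    pieces = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pieces)
    upper = sum(f(r) * (r - l) for l, r in pieces)
    return lower, upper

f = lambda x: x**2
P = [0.0, 2.0]  # the coarsest possible trap
for _ in range(5):
    L, U = darboux_sums_increasing(f, P)
    print(f"{len(P) - 1:2d} pieces: L = {L:.6f}, U = {U:.6f}")
    # refine: insert the midpoint of every subinterval
    P = sorted(P + [(l + r) / 2 for l, r in zip(P, P[1:])])
```

Each refinement raises the floor and lowers the ceiling; both columns close in on the true area $8/3$.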

The Anatomy of the Gap: Oscillation and Integrability

The success of our method depends entirely on whether we can make the gap between the upper and lower sums, $U(f, P) - L(f, P)$, as small as we want. Let's look at this gap more closely. For each subinterval, the contribution to the gap is the area of a small "uncertainty rectangle" whose width is $\Delta x_k = x_k - x_{k-1}$ and whose height is $M_k - m_k$. This height, the difference between the supremum and the infimum on a subinterval, is called the oscillation of the function, $\omega_k$.

The total gap is simply the sum of the areas of these uncertainty rectangles:

$$U(f, P) - L(f, P) = \sum_{k=1}^{n} (M_k - m_k)\,\Delta x_k = \sum_{k=1}^{n} \omega_k\,\Delta x_k$$

So, the question of integrability becomes: can we find a partition so fine that the total area of these oscillation-rectangles is arbitrarily small? For some functions, the answer is a resounding yes. For others, it's a frustrating no.

A Rogues' Gallery of Functions

Let's meet some of the players in this game.

The Heroes: Integrable Functions

A huge and important class of heroes are the monotonic functions: those that are always increasing or always decreasing on an interval. For these functions, we can prove not just that the gap shrinks, but exactly how fast it shrinks. Consider a uniform partition of $[a, b]$ into $n$ pieces, each of width $\frac{b-a}{n}$. For an increasing function, the infimum on any subinterval $[x_{k-1}, x_k]$ is $f(x_{k-1})$ and the supremum is $f(x_k)$. The gap $U - L$ becomes a delightful telescoping sum:

$$U(f, P_n) - L(f, P_n) = \sum_{k=1}^{n} \bigl(f(x_k) - f(x_{k-1})\bigr) \frac{b-a}{n}$$

$$= \frac{b-a}{n} \bigl[(f(x_1)-f(x_0)) + (f(x_2)-f(x_1)) + \dots + (f(x_n)-f(x_{n-1}))\bigr]$$

$$= \frac{b-a}{n} \bigl(f(x_n) - f(x_0)\bigr) = \frac{b-a}{n} \bigl(f(b) - f(a)\bigr)$$

This is a remarkable result. The gap is inversely proportional to $n$. As we make the partition finer by increasing $n$, the gap goes to zero, guaranteed! This means every monotonic function is integrable. This is incredibly powerful. It doesn't even matter if the function has jumps, as long as it keeps going in one general direction. A function like $f(x) = x + \lfloor 3x \rfloor$, which has step-like discontinuities, is still monotonic (non-decreasing) and therefore perfectly integrable by the same argument.
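The telescoping formula is easy to check numerically, even for a monotone function with jumps. A small sketch (Python; the partition sizes are powers of two so the floating-point arithmetic here happens to be exact):

```python
import math

def gap_nondecreasing(f, a, b, n):
    """U - L over a uniform n-piece partition of [a, b], f non-decreasing.

    Infimum/supremum on each piece are the endpoint values, so the sum
    telescopes to (b - a)/n * (f(b) - f(a)).
    """
    w = (b - a) / n
    xs = [a + w * k for k in range(n + 1)]
    return sum((f(r) - f(l)) * w for l, r in zip(xs, xs[1:]))

f = lambda x: x + math.floor(3 * x)  # non-decreasing, with jump discontinuities
for n in [4, 16, 64]:
    print(n, gap_nondecreasing(f, 0, 2, n))  # 4.0, 1.0, 0.25, i.e. 16/n
```

The gap shrinks like $\frac{b-a}{n}\,(f(b)-f(a)) = \frac{16}{n}$, jumps and all, exactly as the telescoping argument predicts.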

Of course, all continuous functions on closed intervals are also integrable. Intuitively, continuity means that we can make the oscillation $\omega_k$ on any subinterval as small as we like simply by making the subinterval narrow enough.

We can also see that some simple operations don't spoil integrability. If you take an integrable function $f(x)$ and shift the whole graph up by a constant $C$ to get $g(x) = f(x) + C$, all you're doing is adding a rectangle of area $C(b-a)$ to your sums. Both the lower and upper sums increase by exactly this amount, so the gap between them is unchanged and the function remains integrable.

The Villain: A Non-Integrable Function

To truly appreciate the heroes, we must meet a villain. Consider the strange and wild Dirichlet function. Let's say it's defined to be $c_1$ if $x$ is a rational number and $c_2$ if $x$ is an irrational number, with $c_1 > c_2$. Now, try to trap the area under this beast. Take any subinterval, no matter how microscopically small. Because the rationals and the irrationals are both dense, this tiny interval will contain points where the function is $c_1$ and points where it is $c_2$.

What does this do to our sums? On every single subinterval, the supremum $M_k$ is $c_1$ and the infimum $m_k$ is $c_2$. The oscillation $\omega_k$ is always $c_1 - c_2$, so the gap doesn't depend on the partition at all:

$$U(f, P) - L(f, P) = (c_1 - c_2) \sum_{i=1}^{n} (x_i - x_{i-1}) = (c_1 - c_2)(b-a)$$

The gap is a fixed, positive constant. You can slice the interval a million, a billion, a trillion times, and the gap never shrinks. The lower and upper sums never meet. The trap never closes. This function is the classic example of a non-integrable function. It's a beautiful monster, because it shows us precisely what the criterion of integrability is for: it's a condition to tame chaos, to ensure that as we zoom in, the function's behavior becomes manageable, not infinitely ragged.

Applications and Interdisciplinary Connections

Now that we have thoroughly examined the inner workings of upper and lower sums—this beautiful machine of boxes tall and short—you might be wondering, "What is it all for?" Is it merely a pedantic exercise to satisfy the mathematicians, a way to prove what our intuition already tells us about area? The answer, I hope you'll come to see, is a resounding "no." This machinery is far more than a tool for finding area; it’s a powerful and subtle way of thinking that allows us to make sense of a messy, complicated, and continuous world. It provides the solid ground beneath the entire edifice of integral calculus, and its applications stretch across science, engineering, and even into the philosophical questions of what it means to measure something at all.

Let's take this machine for a spin and see what it can really do.

The Art of Taming Complexity

The first and most obvious use of our integral concept is to handle functions that are not simple straight lines or parabolas. For any "nice" continuous function, like the smooth curve of $f(x) = cx^3$, it's a straightforward, almost mechanical process to show that as we make our partition finer and finer, the gap between the upper and lower sums, $U(P_n, f) - L(P_n, f)$, shrinks down to nothing. This is the ideal case, like measuring the length of a perfectly straight road. But the real world is rarely so clean. What happens when our functions have glitches, jumps, or other strange behaviors? This is where the true power of our definition begins to shine.

Imagine you are a signal processor, looking at a stream of data. The signal is mostly smooth, but occasionally there's a sudden, instantaneous "spike"—a single point of error. Or perhaps you're a physicist modeling a thin wire with a few point-like masses attached. Your function describing the density is zero almost everywhere, but has a non-zero value at a handful of points. Can you still find a meaningful average or total mass? Our machinery answers with a confident "yes."

Let's consider a function that is zero everywhere except at a finite number of points, where it spikes up to some value $\alpha > 0$. Common sense might suggest these spikes could ruin everything. But think about the upper sum. The spikes only affect the rectangles that contain them. Since there are only finitely many spikes, we can choose a partition so fine that we "trap" each spike within an incredibly narrow rectangle. While the height of these few rectangles is $\alpha$, their total width can be made as small as we please, so the sum of the areas of these "spike rectangles" can be made arbitrarily close to zero. Meanwhile, the lower sum is always zero, because every interval, no matter how small, contains points where the function is zero. As our partition gets finer, the upper sum is squeezed down toward the lower sum, and they meet at zero. The integral is zero! The integral, in its wisdom, recognizes that a finite number of isolated points have no "substance" in a continuous world. They are ghosts in the machine.
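This squeezing of the spikes is easy to see numerically. A sketch (Python, illustrative names and spike positions chosen for the example):

```python
def upper_sum_spiky(spikes, alpha, a, b, n):
    """Upper Darboux sum over a uniform n-piece partition of [a, b]
    for a function that is 0 everywhere except at finitely many points,
    where it equals alpha > 0. A piece's supremum is alpha iff it
    contains a spike (closed subintervals, so a spike on a shared
    endpoint counts for both neighbours), and 0 otherwise."""
    w = (b - a) / n
    total = 0.0
    for k in range(n):
        left, right = a + k * w, a + (k + 1) * w
        if any(left <= s <= right for s in spikes):
            total += alpha * w
    return total

for n in [10, 100, 1000]:
    print(n, upper_sum_spiky([0.25, 0.5, 0.9], alpha=3.0, a=0, b=1, n=n))
# the upper sum shrinks toward the lower sum, which is already 0
```

With 3 spikes, at most 6 subintervals ever contribute, so the upper sum is at most $6\alpha/n$ and tends to zero as the partition is refined.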

What if the function is even wilder? Consider the function $f(x) = \sin(1/x^2)$ near the origin. It's a madhouse! As $x$ gets closer to zero, the function oscillates faster and faster, swinging between $1$ and $-1$ infinitely many times. It seems impossible to pin down. But again, our method is cleverer than the chaos. Instead of using a uniform partition, we can use a custom-built one. We create one tiny interval, say from $0$ to $\delta$, that quarantines all the infinite madness. Inside this tiny interval, the oscillation $M_i - m_i$ is at its maximum possible value of $2$, but its contribution to the total uncertainty, $U - L$, is only $2\delta$. We can make this contribution as small as we want just by choosing $\delta$ small enough! Outside the quarantine zone, from $\delta$ to $1$, the function is perfectly well-behaved and continuous. We can cover that part with a standard fine partition and make its contribution to the uncertainty small as well. By this strategy of "divide and conquer," we can prove the function is integrable. We have tamed an infinite beast, not by wrestling it, but by cleverly corralling it.
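The quarantine argument can even be made quantitative. On $[\delta, 1]$ the derivative of $\sin(1/x^2)$ is bounded by $2/\delta^3$, so on a uniform partition each oscillation is at most $2/\delta^3$ times the piece width. A sketch of the resulting bound on $U - L$ (Python, illustrative names; the specific $\delta, n$ pairs are just sample choices):

```python
def squeeze_bound(delta, n):
    """Upper bound on U - L for f(x) = sin(1/x^2) on [0, 1]:
    quarantine [0, delta] (oscillation at most 2, contribution 2*delta),
    then cover [delta, 1] with n uniform pieces whose oscillation is
    bounded via |f'(x)| = |2*cos(1/x^2)/x^3| <= 2/delta**3."""
    w = (1 - delta) / n
    quarantine = 2 * delta              # width delta, height gap <= 2
    tame = n * (2 / delta**3) * w * w   # n pieces, each gap <= (2/d^3) * w
    return quarantine + tame

for delta, n in [(0.1, 10**5), (0.01, 10**9), (0.001, 10**13)]:
    print(delta, n, squeeze_bound(delta, n))
# the bound can be pushed as close to 0 as we like
```

Shrinking $\delta$ costs nothing as long as $n$ grows fast enough to keep the tame part small, which is exactly the divide-and-conquer trade-off described above.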

The Algebra of Reality

Science is not about isolated facts; it's about relationships. We constantly combine quantities—scaling them, multiplying them, inverting them. A crucial test of any mathematical tool is whether it respects these relationships. Does our integral concept behave logically when we perform algebra on functions?

Suppose you have an integrable function $f$, and you create a new function by simply scaling it: $g(x) = c f(x)$. This could be as simple as changing units, from meters to feet. The upper and lower sums provide the answer immediately. When you scale a function by a positive constant $c$, you scale the suprema and infima by $c$, so the difference $U - L$ for the new function is just $c$ times the old difference. If $c$ is negative, the roles of supremum and infimum swap, but the magnitude of the difference still scales by $|c|$. So if you can make the uncertainty for $f$ arbitrarily small, you can do the same for $cf$. This is why you're allowed to pull constants out of an integral, a rule you've used since your first calculus class. It's not an arbitrary rule; it's a direct consequence of how the approximation boxes behave under stretching.
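The bookkeeping can be written out directly. A minimal sketch (Python, illustrative names) working from per-subinterval infima and suprema; note how a negative $c$ swaps their roles:

```python
def scaled_darboux_sums(infs, sups, widths, c):
    """Lower and upper sums of c*f, given f's per-subinterval infima,
    suprema, and subinterval widths. For c >= 0 the extremes scale in
    place; for c < 0 the old supremum becomes the new infimum."""
    if c >= 0:
        new_infs, new_sups = [c * m for m in infs], [c * M for M in sups]
    else:
        new_infs, new_sups = [c * M for M in sups], [c * m for m in infs]
    lower = sum(m * w for m, w in zip(new_infs, widths))
    upper = sum(M * w for M, w in zip(new_sups, widths))
    return lower, upper

# f(x) = x^2 on [0, 2], partition {0, 1, 2}: infima [0, 1], suprema [1, 4]
print(scaled_darboux_sums([0, 1], [1, 4], [1, 1], 3))   # (3, 15): 3 * (1, 5)
print(scaled_darboux_sums([0, 1], [1, 4], [1, 1], -1))  # (-5, -1): same gap of 4
```

In both cases the gap is $|c|$ times the original gap of 4, which is the whole content of the constant-multiple rule.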

More profoundly, what about products? In physics, power is the product of voltage and current, $P(t) = V(t)I(t)$. In fluid dynamics, the momentum flux is the product of density and velocity squared. If we know that $f$ and $g$ are integrable, can we be sure their product $fg$ is? The answer is yes, and the proof is a little jewel of an argument that hinges on our sums. The oscillation of the product $fg$ on a small interval can be cleverly bounded by the oscillations of $f$ and $g$ themselves. This guarantee is the foundation that allows us to confidently integrate countless physical quantities that are themselves products of other, more fundamental quantities.
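The hinge of that argument fits in one line. Integrable functions are bounded, so suppose $|f| \le K$ and $|g| \le K$ on the interval; then for any two points $x, y$ in the same subinterval (a sketch via the standard add-and-subtract trick, not the full proof):

```latex
\left| f(x)g(x) - f(y)g(y) \right|
  \le |f(x)|\,\left| g(x) - g(y) \right| + |g(y)|\,\left| f(x) - f(y) \right|
  \le K\,\omega_k(g) + K\,\omega_k(f)
```

Taking the supremum over $x, y$ gives $\omega_k(fg) \le K\bigl(\omega_k(f) + \omega_k(g)\bigr)$: if the oscillation sums of $f$ and $g$ can be made small, so can that of $fg$.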

Or consider reciprocals. If you have a function $R(t)$ representing the electrical resistance of a component over time, and you know it's always positive and bounded away from zero, you might be interested in its conductance, $G(t) = 1/R(t)$. If $R(t)$ is integrable, is $G(t)$? It is! As long as the function stays safely above some positive value $\delta$, meaning it never gets too close to zero, controlling the oscillations of $R$ is enough to control the oscillations of $1/R$. This stability under algebraic operations is what makes the Riemann integral a practical and trustworthy tool for building complex models of the world.
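The same one-line trick handles the reciprocal. Write $f$ for a function with $f(x) \ge \delta > 0$ everywhere; then for $x, y$ in the same subinterval:

```latex
\left| \frac{1}{f(x)} - \frac{1}{f(y)} \right|
  = \frac{\left| f(y) - f(x) \right|}{f(x)\,f(y)}
  \le \frac{\omega_k(f)}{\delta^2}
```

so $\omega_k(1/f) \le \omega_k(f)/\delta^2$: squeezing the oscillations of $f$ automatically squeezes those of $1/f$.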

The Bridge to the Infinite and the Digital

Perhaps one of the most important connections is between this abstract theory and the world of computation. We often encounter functions that are too complex to integrate analytically. We might have data from an experiment, or a function defined by a monstrously complicated formula. The universal strategy is to approximate it with a sequence of simpler functions, like polynomials or Fourier series (a sum of sines and cosines). This is the basis of numerical integration, a cornerstone of modern science and engineering.

But this whole enterprise rests on a critical question: if our sequence of functions $f_n$ gets closer and closer to the true function $f$, do the integrals of $f_n$ also get closer to the integral of $f$? The theory of uniform convergence, built on the scaffolding of our upper and lower sums, provides the answer. If the approximation is "uniform," meaning the worst-case error across the whole interval shrinks to zero, then the limit of the integrals is indeed the integral of the limit. This theorem is the license that allows us to trust our computers. It guarantees that when a simulation refines its grid and the numerical solution converges, the integrated quantities it calculates (like total energy, drag, or financial risk) are actually converging to the true values.
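A toy illustration of that license (Python; the integrals of the approximants are computed in closed form, so this is arithmetic rather than quadrature): the Taylor polynomials of $e^x$ converge uniformly on $[0, 1]$, and their integrals close in on $\int_0^1 e^x\,dx = e - 1$.

```python
import math

def integral_of_taylor_exp(n):
    # Integral over [0, 1] of the degree-n Taylor polynomial of exp:
    # each term x^k / k! integrates to 1 / ((k + 1) * k!).
    return sum(1 / ((k + 1) * math.factorial(k)) for k in range(n + 1))

target = math.e - 1
for n in [1, 3, 6, 10]:
    print(n, abs(integral_of_taylor_exp(n) - target))
# the error shrinks rapidly: the limit of the integrals is the
# integral of the limit
```

Uniform convergence is what justifies reading the printed errors as a promise about the limit, not just a lucky trend.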

Finally, what happens when we push our machine to its absolute limit? What kinds of functions can it not handle? This is often where the most interesting physics and mathematics lie. Consider a truly bizarre function, born from the strange properties of our number system. Let $f(x) = 1$ if the decimal expansion of $x$ contains the digit 7, and $f(x) = 0$ otherwise. Now try to integrate this on $[0, 1]$. Pick any subinterval, no matter how tiny: it contains a number with a 7 in its expansion and a number without one. For example, the interval $[0.41, 0.42]$ contains $0.417\ldots$, and also $0.4111\ldots$. This means that on every single subinterval of any partition, the supremum $M_i$ is $1$ and the infimum $m_i$ is $0$.

Think about what this does to our sums. The upper sum, adding up all the $M_i \Delta x_i$, is always $1$. The lower sum, adding up the $m_i \Delta x_i$, is always $0$. No matter how fine we make our partition, the upper and lower sums remain stubbornly fixed at $1$ and $0$. They will never meet. The gap refuses to close. Our machine breaks down. The function is not Riemann integrable.

This isn't a failure of mathematics. It's a profound discovery. It tells us that the Riemann integral, for all its power, is designed for functions that are, in some sense, "mostly" continuous. It cannot handle functions that are pathologically discontinuous everywhere. The existence of such functions forced mathematicians like Henri Lebesgue to invent a more powerful, more subtle theory of integration—one that could make sense of such chaotic behavior. Pushing a tool to its breaking point is how we discover the need for a better one.

From taming wild oscillations to justifying the algorithms that run our modern world, the simple idea of squeezing a function between upper and lower rectangles blossoms into a rich and powerful theory. It connects abstract mathematical rigor to the practical and computational problems we face every day, revealing the deep, underlying unity between how we reason and what we can build.