5.5 The mean value theorem
If $f$ is differentiable, then the derivative $f'$ encodes a lot of information about the original function $f$. Here we discuss a number of important results in this vein, which relate properties of $f'$ (and possibly higher-order derivatives) to properties of $f$.
Stationary points
Let $I \subseteq \mathbb{R}$ be an open interval and suppose $f \colon I \to \mathbb{R}$ is differentiable. Intuitively, it is clear that the tangent to the graph of $f$ should be horizontal at a point $x_0 \in I$ where the function attains its maximum: see Figure 5.10. The same holds for a point where $f$ attains its minimum. Thus, our intuition tells us that the derivative (which corresponds to the gradient of the tangent line) should vanish wherever we encounter an extreme value of the function (but we are not content with intuition: we need a formal proof!). We will make this precise in Theorem 5.34 below. In fact, this observation should still hold if, rather than necessarily considering a maximum for $f$, we consider a local maximum, defined as follows.
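Let $I \subseteq \mathbb{R}$ be an open interval, $f \colon I \to \mathbb{R}$ be a function and $x_0 \in I$.
(1) We say $x_0$ is a local maximum for $f$ if there exists some open interval $J \subseteq I$ such that $x_0 \in J$ and $f(x) \le f(x_0)$ for all $x \in J$;
(2) We say $x_0$ is a local minimum for $f$ if there exists some open interval $J \subseteq I$ such that $x_0 \in J$ and $f(x) \ge f(x_0)$ for all $x \in J$.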
We illustrate the concept of a local maximum in Figure 5.10. Note that if $x_0$ is a maximum for $f$, then it is also a local maximum (since we can just take $J = I$). Similarly, if $x_0$ is a minimum for $f$, then it is also a local minimum.
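The converse fails: a local maximum need not be a maximum. The following worked example is a sketch with a function of our own choosing, $f(x) = x^3 - 3x$ on $I = \mathbb{R}$; it does not come from the text above.

```latex
% Illustrative example (our own choice): f(x) = x^3 - 3x on I = \mathbb{R}, x_0 = -1.
We have $f(-1) = 2$ and
\begin{align*}
  f(x) - f(-1) = x^3 - 3x - 2 = (x+1)^2 (x-2) \le 0
  \qquad \text{for all } x < 2 .
\end{align*}
Hence $f(x) \le f(-1)$ for all $x$ in the open interval $J = (-2, 0)$, which
contains $-1$, so $x_0 = -1$ is a local maximum for $f$. However,
$f(3) = 27 - 9 = 18 > 2 = f(-1)$, so $-1$ is not a maximum for $f$ on $\mathbb{R}$.
```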
The following theorem makes precise the intuitive link between local extrema and derivatives. The key to this link is considering the zeros of the derivative $f'$, which we call the stationary points of the function $f$.
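Theorem 5.34 (Stationary point theorem). Let $I \subseteq \mathbb{R}$ be an open interval and $f \colon I \to \mathbb{R}$ be differentiable.
(1) If $x_0 \in I$ is a local maximum for $f$, then $f'(x_0) = 0$.
(2) If $x_0 \in I$ is a local minimum for $f$, then $f'(x_0) = 0$.
In particular, the maximum and minimum of $f$ (if they exist) must occur at stationary points of $f$.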
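Proof. We shall only prove (1), since (2) can be proved using a similar argument, or can be derived from (1) by replacing $f$ with $-f$.
Suppose $f$ has a local maximum at $x_0 \in I$, so that there exists some open interval $J \subseteq I$ with $x_0 \in J$ such that $f(x) \le f(x_0)$ for all $x \in J$. In particular, if $h \neq 0$ is such that $x_0 + h \in J$, then $f(x_0 + h) - f(x_0) \le 0$. Consequently, the difference quotient satisfies
\[ \frac{f(x_0 + h) - f(x_0)}{h} \le 0 \ \text{ if } h > 0 \qquad \text{and} \qquad \frac{f(x_0 + h) - f(x_0)}{h} \ge 0 \ \text{ if } h < 0. \]
Since $f$ is differentiable at $x_0$, the left and right derivatives exist at $x_0$ and are equal to $f'(x_0)$. Furthermore,
\[ f'(x_0) = \lim_{h \to 0^+} \frac{f(x_0 + h) - f(x_0)}{h} \le 0 \qquad \text{and} \qquad f'(x_0) = \lim_{h \to 0^-} \frac{f(x_0 + h) - f(x_0)}{h} \ge 0. \]
From these two inequalities we conclude that $f'(x_0) = 0$, as required. ∎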
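To see the sign pattern of the difference quotient in a concrete case, consider the following sketch. The function $f(x) = 1 - x^2$ with maximum at $x_0 = 0$ is our own illustrative choice.

```latex
% Illustrative example (our own choice): f(x) = 1 - x^2, x_0 = 0.
For $h \neq 0$,
\begin{align*}
  \frac{f(0 + h) - f(0)}{h} = \frac{(1 - h^2) - 1}{h} = -h ,
\end{align*}
which is negative for $h > 0$ and positive for $h < 0$. Letting $h \to 0^+$
gives $f'(0) \le 0$, and letting $h \to 0^-$ gives $f'(0) \ge 0$, so $f'(0) = 0$.
This agrees with the direct computation $f'(x) = -2x$.
```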
Sketch a figure to illustrate the ideas behind the proof of Theorem 5.34.
Theorem 5.34 is a useful tool for finding the maximum and minimum values of a function, since it limits the possibilities of where these values can occur. However, the following simple example shows that stationary points do not always correspond to local extrema.
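The converse of Theorem 5.34 does not hold. For instance, the function $f \colon \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^3$ is differentiable with $f'(x) = 3x^2$ for all $x \in \mathbb{R}$ and therefore has a stationary point at $x_0 = 0$. However, since $f(x) < 0 = f(0)$ for all $x < 0$ and $f(x) > 0 = f(0)$ for all $x > 0$, we see that $0$ is neither a local maximum nor a local minimum for $f$.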
So, extreme values must occur at stationary points, but stationary points are not always extrema. We can use information from second-order derivatives to try to further diagnose whether a stationary point is an extremum; we shall return to this topic later.
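As a sketch of how the stationary point theorem narrows the search for extreme values, consider the following worked example. The function $f(x) = x(1 - x)$ on the open interval $(0, 1)$ is our own choice for illustration.

```latex
% Illustrative example (our own choice): f(x) = x(1 - x) on I = (0, 1).
Since $f'(x) = 1 - 2x$, the only stationary point of $f$ in $(0,1)$ is
$x_0 = \tfrac12$. By Theorem 5.34, a maximum or minimum of $f$ on $(0,1)$,
if it exists, can only occur at $\tfrac12$. Completing the square,
\begin{align*}
  f(x) = \tfrac14 - \left(x - \tfrac12\right)^2 \le \tfrac14 = f\!\left(\tfrac12\right)
  \qquad \text{for all } x \in (0,1),
\end{align*}
so $\tfrac12$ is indeed the maximum. On the other hand, $f$ has no minimum on
$(0,1)$: a minimum would also have to occur at $\tfrac12$, which would force
$f$ to be constant.
```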
Rolle’s theorem
As a simple consequence of the stationary point theorem, we deduce the following result.
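Suppose $a, b \in \mathbb{R}$, with $a < b$. If $f \colon [a,b] \to \mathbb{R}$ is continuous on $[a,b]$, differentiable on $(a,b)$ and $f(a) = f(b)$, then there exists some $c \in (a,b)$ such that $f'(c) = 0$.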
Suppose $f \colon \mathbb{R} \to \mathbb{R}$ is differentiable. Rolle’s theorem tells us that between every pair of zeros of $f$ there is a stationary point of $f$, where the tangent line is horizontal. We illustrate this in Figure 5.11.
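A minimal worked instance of this statement is sketched below. The quadratic is our own illustrative choice and is not taken from the text.

```latex
% Illustrative example (our own choice): f(x) = x^2 - 3x + 2 on [1, 2].
The function $f(x) = x^2 - 3x + 2 = (x - 1)(x - 2)$ has zeros at $x = 1$ and
$x = 2$, so $f(1) = f(2) = 0$. Its derivative is
\begin{align*}
  f'(x) = 2x - 3 , \qquad \text{so } f'(c) = 0 \iff c = \tfrac32 ,
\end{align*}
and $c = \tfrac32$ lies in $(1, 2)$, strictly between the two zeros, as
Rolle's theorem predicts.
```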
Proof. By the extreme value theorem from Theorem 4.106, the function $f$ attains its minimum value $m$ and maximum value $M$ somewhere on $[a,b]$. If $m = M$, then $f$ is constant. In this case, we see from Definition 5.2 that $f'(x) = 0$ for all $x \in (a,b)$ and so the claim follows. Thus, we may assume $m < M$.
Since $f(a) = f(b)$ and $M$ is the maximum value of $f$, we know $f(a) = f(b) \le M$. Consider the case $f(a) < M$. Then the value $M$ is not attained by the function at either of the endpoints $a$ or $b$ of the interval, since $f(a) = f(b) < M$. However, we know that the value $M$ is attained by $f$ somewhere in the interval $[a,b]$, and so there exists some $c \in (a,b)$ such that $f(c) = M$. Thus, $c$ is a maximum for $f$ on $(a,b)$ and the stationary point theorem (Theorem 5.34) implies that $f'(c) = 0$ for $c \in (a,b)$, as required.
It remains to consider the case $f(a) = M$. Since $m < M$, it follows that $m < f(a) = f(b)$. We can now use exactly the same argument as in the previous case to show that there exists some minimum $c \in (a,b)$ for $f$ on $(a,b)$ and therefore $f'(c) = 0$ for $c \in (a,b)$, as required. ∎
Show that each hypothesis of Rolle’s theorem is necessary as follows.
(i) Show there exists some $f \colon [a,b] \to \mathbb{R}$ which is differentiable on $(a,b)$ and satisfies $f(a) = f(b)$ for which $f'(x) \neq 0$ for all $x \in (a,b)$.
(ii) Show that there exists some which is continuous on , differentiable on , and satisfies but for which for all .
Note that Rolle’s theorem asserts the existence of a stationary point, but does not say anything about uniqueness. In particular, there can be more than just one stationary point.
Consider the function given by . Observe that is continuous on , differentiable on and . Sketch the graph of and show that has precisely two stationary points in .
Sketch an example of a function satisfying the hypotheses of Rolle’s theorem with precisely stationary points. Here we are interested in a conceptional drawing: you do not need to derive a formula for the function, merely illustrate the concept.
The mean value theorem
Our next step is to prove an important and far-reaching upgrade of Rolle’s theorem.
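Theorem 5.41 (Mean value theorem). Suppose $a, b \in \mathbb{R}$ with $a < b$. If $f \colon [a,b] \to \mathbb{R}$ is continuous on $[a,b]$ and differentiable on $(a,b)$ then there exists some $c \in (a,b)$ such that
\[ \frac{f(b) - f(a)}{b - a} = f'(c). \tag{5.18} \]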
Exercise 5.42. Show that the mean value theorem implies Rolle’s theorem as a special case.
Before discussing the proof of Theorem 5.41, it’s helpful to spend some time developing intuition for what the result is telling us. We shall actually discuss three different interpretations of the mean value theorem: one here and two others in later sections. (Unfortunately, none of our interpretations will fully explain why Theorem 5.41 is called the ‘mean value’ theorem. The answer is that the expression (5.18) corresponds to the average (or ‘mean’) rate of change for the function $f$. However, this interpretation relies on integration theory and, in particular, the fundamental theorem of calculus. You will investigate these topics if you take the year 2 course Further Analysis and Several Variable Calculus.)
MVT Interpretation 1: Parallel lines.
There is a simple geometric interpretation of the mean value theorem, which is illustrated in Figure 5.12. The figure shows the secant line through the points $(a, f(a))$ and $(b, f(b))$ on the graph of $f$. The gradient of this secant line is given by
\[ \frac{f(b) - f(a)}{b - a}, \]
which corresponds to the left-hand side of (5.18). The mean value theorem tells us the following: there exists some $c \in (a,b)$ such that the gradient of the secant line is equal to the gradient $f'(c)$ of the tangent line to the graph of $f$ at $(c, f(c))$. In particular, the secant and the tangent line are parallel.
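The parallel-lines picture can be checked by hand in a simple case. The following sketch uses $f(x) = x^2$ on $[1, 3]$, a function and interval of our own choosing.

```latex
% Illustrative example (our own choice): f(x) = x^2 on [a, b] = [1, 3].
The gradient of the secant line through $(1, f(1)) = (1, 1)$ and
$(3, f(3)) = (3, 9)$ is
\begin{align*}
  \frac{f(3) - f(1)}{3 - 1} = \frac{9 - 1}{2} = 4 .
\end{align*}
Since $f'(x) = 2x$, the equation $f'(c) = 4$ has the unique solution $c = 2$,
which lies in $(1, 3)$. The tangent line at $(2, 4)$ is $y = 4x - 4$ and the
secant line is $y = 4x - 3$: both have gradient $4$, so they are parallel.
```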
In Exercise 5.42, we saw that the mean value theorem implies Rolle’s theorem. In fact, we can also use Rolle’s theorem to prove the mean value theorem (so the two results are equivalent)!
Proof. We shall only sketch the details of the proof here: you can fill in the details for yourself (see Exercise 5.43 below)!
The graph of the linear polynomial
\[ \ell(x) = f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \]
is precisely the secant line through $(a, f(a))$ and $(b, f(b))$. The idea behind the proof is to subtract this linear polynomial from $f$ in order to reduce to a situation where Rolle’s theorem can be applied. More precisely, consider the function $g \colon [a,b] \to \mathbb{R}$ given by
\[ g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}\,(x - a). \tag{5.19} \]
If we compare the graphs of $f$ and $g$ as in Figure 5.13, then intuitively the graph of $g$ is formed by sliding the graph of $f$ so that the secant line through $(a, f(a))$ and $(b, f(b))$ becomes the horizontal axis. This is exactly the situation where Rolle’s theorem applies.
Rolle’s theorem applied to $g$ tells us there exists some $c \in (a,b)$ such that $g'(c) = 0$. Applying this to the definition (5.19) of $g$ and rearranging, we obtain the conclusion of the mean value theorem. ∎
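The reduction to Rolle’s theorem can be seen in one concrete case without giving away the general argument. In the sketch below, $f(x) = x^2$ on $[1, 3]$ is our own choice. We write $g$ for $f$ minus the linear polynomial whose graph is the secant line, which is our reading of (5.19); the name $g$ is an assumption made for this illustration.

```latex
% Illustrative example (our own choice): f(x) = x^2 on [1, 3].
% Assumption: g denotes f minus the secant-line polynomial, as we read (5.19).
The secant line through $(1, 1)$ and $(3, 9)$ is the graph of
$x \mapsto 1 + 4(x - 1)$, so
\begin{align*}
  g(x) &= x^2 - 1 - 4(x - 1) = x^2 - 4x + 3 = (x - 1)(x - 3) , \\
  g(1) &= g(3) = 0 , \\
  g'(x) &= 2x - 4 = f'(x) - 4 .
\end{align*}
Rolle's theorem applies to $g$ on $[1, 3]$, and indeed $g'(c) = 0$ at
$c = 2 \in (1, 3)$. Rearranging gives $f'(2) = 4 = \dfrac{f(3) - f(1)}{3 - 1}$.
```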
Exercise 5.43. Prove the mean value theorem by applying Rolle’s theorem to the function defined in (5.19).