5.8 Taylor’s theorem
To conclude the course, we now turn to a simple and practical problem: how do we actually compute values of functions?
Practical computation
As an example, consider the function $\sin$. For certain $x$, the value of $\sin x$ is clear from the definition (in terms of points on the unit circle): for instance, we all know that $\sin 0 = 0$ and $\sin(\pi/2) = 1$.
However, given some generic $x$, how would you work out the value of $\sin x$? For instance,
How can we compute ?
Of course, we can simply plug into a calculator and see that
but where do these digits come from? Do you know a way to compute them by hand?
There are very few functions whose values we can compute directly. Typically, to compute $f(x)$ we are forced to work with some simpler function $g$ which approximates $f$ near $x$. The idea is to work with some $g$ whose values we actually can compute, and then relate this information back to $f$.
In this section, we investigate approximating smooth functions by polynomials. The key result is Taylor’s theorem, which tells us how close our approximation is to the true value of . Polynomials are great for practical computation: they only involve addition and multiplication, so their values are (relatively) easy to work out. Using Taylor’s theorem, we will develop a practical solution to the question posed in 5.66.
Approximation by constants
In order to understand Taylor’s theorem, we first return to the mean value theorem and give a third and final (as far as this course is concerned) interpretation of this result.
MVT Interpretation 3: Approximation by a constant function.
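Let $I \subseteq \mathbb{R}$ be an open interval, $f \colon I \to \mathbb{R}$ be differentiable and $a \in I$. For any $x \in I$ with $x > a$, the function $f$ satisfies the hypotheses of the mean value theorem on the interval $[a, x]$. If $x \in I$ and $x < a$, then the same holds for the interval $[x, a]$. From this, we deduce that for all $x \in I \setminus \{a\}$ there exists some $c$ lying between $a$ and $x$ such that
\[ f(x) - f(a) = f'(c)(x - a). \tag{5.25} \]
Here, when we say $c$ lies between $a$ and $x$, we simply mean either $a < c < x$ if $a < x$ or $x < c < a$ if $x < a$. That is, $\min\{a, x\} < c < \max\{a, x\}$.
Note that the identity (5.25) trivially holds in the case $x = a$ for any choice of $c$.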
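We rewrite (5.25) in new notation as
\[ f(x) = P_0(x) + R_0(x), \tag{5.26} \]
where
\[ P_0(x) := f(a) \quad \text{and} \quad R_0(x) := f'(c)(x - a). \]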
This notation seems a bit clunky right now, but it will be helpful for comparing with what comes later.
In our third interpretation, we think of $P_0(x) = f(a)$ as a constant function (which is the simplest kind of polynomial). The identity (5.26) can be interpreted as approximating $f$ by the constant function $P_0$. The function $R_0$ corresponds to the remainder (or error) of the approximation.
Graphically, the above corresponds to approximating the graph of $f$ by the horizontal line $y = f(a)$. We illustrate this in Figure 5.15. The remainder $R_0(x)$ tells us how far points on the graph of $f$ are from the approximating straight line.
Consider the function $f = \sin$ around the point $a = 0$. We wish to crudely approximate $\sin x$ by the constant function $P_0(x) = \sin 0 = 0$. From (5.26), the error in this approximation is given by $R_0(x) = \cos(c)\,x$. Since $|\cos c| \le 1$ for all $c \in \mathbb{R}$, we have
\[ |R_0(x)| \le |x|. \tag{5.27} \]
In particular, if $x$ is close to $0$, then we can guarantee the error in the crude approximation is reasonably small: it is at most $|x|$.
Approximation by linear polynomials
In general, the horizontal line $y = f(a)$ is not a good choice of approximating line for the graph of our function $f$. A better choice is to take the tangent line. Recall, the tangent line to the graph of $f$ at the point $(a, f(a))$ is the line which passes through $(a, f(a))$ and has gradient $f'(a)$. This is graphed by the function
\[ P_1(x) := f(a) + f'(a)(x - a). \]
Indeed, we can check this by noting that $P_1(a) = f(a)$ and $P_1'(x) = f'(a)$ for all $x \in \mathbb{R}$. The following lemma explores what happens when we approximate $f$ by $P_1$.
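Let $I \subseteq \mathbb{R}$ be an open interval and suppose $f \colon I \to \mathbb{R}$ is twice differentiable. Given $a \in I$ and $x \in I$, there exists some $c$ lying between $a$ and $x$ such that
\[ f(x) = f(a) + f'(a)(x - a) + R_1(x), \quad \text{where} \quad R_1(x) := \frac{f''(c)}{2}(x - a)^2. \]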
Let’s try to understand what Lemma 5.68 is telling us, and how it compares with what we know from the mean value theorem. The quantity $R_1(x)$ is the remainder when we approximate $f(x)$ by the linear polynomial $P_1(x) = f(a) + f'(a)(x - a)$. We can see this graphically in Figure 5.16(b). In particular, we can think of $|R_1(x)|$ as a measure of how accurately the tangent line approximates the graph of $f$.
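Continuing with the setup from Example 5.67, we now wish to approximate $\sin x$ by the linear polynomial
\[ P_1(x) = \sin 0 + \cos(0)\,x = x. \]
The error in this approximation is given by $R_1(x) = -\frac{\sin c}{2}\,x^2$. Since $|\sin c| \le 1$ for all $c \in \mathbb{R}$, we have
\[ |R_1(x)| \le \frac{x^2}{2}. \tag{5.28} \]
Thus, if $x$ is close to $0$, then we can guarantee the error in this approximation is very small: it is at most $x^2/2$.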
Comparing (5.27) and (5.28), if $x$ is close to $0$, then $x^2/2$ is much, much smaller than $|x|$. As we expect, for values of $x$ close to $0$, the tangent line $y = x$ provides a much better approximation to the graph of $\sin$ than the horizontal line $y = 0$. We can see the difference in size between the remainders $R_0$ and $R_1$ by comparing Figure 5.16(a) and Figure 5.16(b).
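The comparison between the two remainders can be checked numerically. The following is a minimal Python sketch, assuming the running example is $f = \sin$ around $a = 0$, so that the constant approximation is $0$, the tangent line is $y = x$, and the remainder bounds are $|x|$ and $x^2/2$. The function name `remainders` and the sample points are illustrative choices, not taken from the notes.

```python
import math

def remainders(x):
    """Return (|R_0(x)|, |R_1(x)|) for f = sin around a = 0 (assumed example)."""
    r0 = abs(math.sin(x) - 0.0)  # error of the constant approximation P_0(x) = 0
    r1 = abs(math.sin(x) - x)    # error of the tangent-line approximation P_1(x) = x
    return r0, r1

for x in (0.5, 0.1, 0.01):
    r0, r1 = remainders(x)
    # Check the bounds |R_0(x)| <= |x| and |R_1(x)| <= x^2 / 2.
    assert r0 <= abs(x) and r1 <= x * x / 2
    print(f"x = {x}: |R_0| = {r0:.2e}, |R_1| = {r1:.2e}")
```

As $x$ shrinks, the printed tangent-line error should fall off much faster than the constant error, in line with the quadratic bound.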
We won’t prove Lemma 5.68 right now. Instead, we’ll discuss and prove a more general statement (Taylor’s theorem) which includes Lemma 5.68 as a special case.
It is important to note that $P_1$ is only a better approximation to $\sin$ than $P_0$ for values of $x$ close to $0$. For instance, if we take (which we think of as being far from $0$), then
In particular, and , so the constant function is a better approximation than the linear polynomial at .
In light of Warning 5.70, we typically consider the problem of approximating a given function $f$ locally around a point $a$. That is, in practice we work with values of $x$ close to $a$.
Approximation by polynomials
The mean value theorem is about approximating $f$ by a constant and Lemma 5.68 is about approximating $f$ by a linear function, around some fixed point $a$. What happens if we try to approximate $f$ by a more general polynomial function?
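For example, suppose we want to approximate $f$ by a quadratic polynomial,
\[ P_2(x) = c_0 + c_1(x - a) + c_2(x - a)^2 \]
for some coefficients $c_0$, $c_1$, $c_2 \in \mathbb{R}$. Notice that we have chosen to express the polynomial in a particular form so that it is centred around $a$; this is natural, since we are trying to approximate $f$ around $a$. The first question is: how do we choose the coefficients $c_0$, $c_1$ and $c_2$?
The constant polynomial $P_0$ was chosen so that $P_0(a) = f(a)$. The linear polynomial $P_1$ was chosen so that $P_1(a) = f(a)$ and $P_1'(a) = f'(a)$. It therefore makes sense to continue this pattern, and choose our coefficients so that
\[ P_2(a) = f(a), \quad P_2'(a) = f'(a) \quad \text{and} \quad P_2''(a) = f''(a). \]
Since we have $P_2'(x) = c_1 + 2c_2(x - a)$ and $P_2''(x) = 2c_2$, this forces us to take $c_0 = f(a)$, $c_1 = f'(a)$ and $c_2 = f''(a)/2$, giving
\[ P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2. \tag{5.29} \]
This suggests that, if we wish to approximate $f$ by a quadratic polynomial around $a$, then (5.29) is our best bet for the approximant. We can push this idea further and consider higher degrees.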
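Let $I \subseteq \mathbb{R}$ be an open interval, $n \in \mathbb{N}_0$, $f \colon I \to \mathbb{R}$ be $n$-times differentiable and $a \in I$. The polynomial
\[ P_n(x) := f(a) + \sum_{k=1}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k \]
is called the Taylor polynomial of degree $n$ of $f$ at $a$. When $n = 0$, we interpret the sum as equal to $0$ and so $P_0(x) = f(a)$ as above.
It is often convenient to write
\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k, \]
where we adopt the conventions $f^{(0)} := f$, $0! := 1$ and $(x - a)^0 := 1$ (with the latter convention holding even if $x = a$).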
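The definition translates directly into a short computation. Below is a minimal Python sketch that evaluates the sum $\sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k$ from a list of derivative values. The function name `taylor_polynomial` and its argument `derivs_at_a` are hypothetical, and the illustration assumes $f = \sin$ at $a = 0$, whose derivatives at $0$ cycle through $0, 1, 0, -1$.

```python
import math

def taylor_polynomial(derivs_at_a, a, x):
    """Evaluate sum_{k=0}^{n} f^(k)(a) / k! * (x - a)^k.

    derivs_at_a is the list [f(a), f'(a), ..., f^(n)(a)] (a hypothetical input format).
    """
    return sum(d / math.factorial(k) * (x - a) ** k
               for k, d in enumerate(derivs_at_a))

# Illustration (assumed example): f = sin, a = 0, degree 5.
sin_derivs_at_0 = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
print(taylor_polynomial(sin_derivs_at_0, 0.0, 0.5))  # polynomial value
print(math.sin(0.5))                                 # library value, for comparison
```

Python evaluates `0.0 ** 0` as `1.0`, which matches the convention that the $k = 0$ power equals $1$ even at $x = a$.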
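Let $I \subseteq \mathbb{R}$ be an open interval, $f \colon I \to \mathbb{R}$ be $n$-times differentiable and $a \in I$.
(i) For , with , show that .
(ii) For , with , show that .
(iii) Use the above to conclude that the derivatives of the Taylor polynomial $P_n$ satisfy
\[ P_n^{(j)}(a) = f^{(j)}(a) \quad \text{for all } 0 \le j \le n. \]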
We now turn to Taylor’s theorem. As in Lemma 5.68, the idea is that the polynomial $P_n$ should give a good approximation to the function $f$ near the point $a$. Taylor’s theorem tells us how close we can expect the approximation $P_n(x)$ to be to the true value of $f(x)$.
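Let $I \subseteq \mathbb{R}$ be an open interval, $n \in \mathbb{N}_0$ and $f \colon I \to \mathbb{R}$ be $(n+1)$-times differentiable. For all $a \in I$ and $x \in I$ there exists a number $c$ lying between $a$ and $x$, which depends on $x$, $a$ and $n$, such that
\[ f(x) = P_n(x) + R_n(x), \quad \text{where} \quad R_n(x) := \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}. \]
Here $R_n(x)$ is the remainder when we approximate the value $f(x)$ by $P_n(x)$.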
We will return to prove Taylor’s theorem in the next section. For now we note some applications and consequences of the result.
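Continuing with the setup from Examples 5.67 and 5.69, fixing $n \in \mathbb{N}$ odd, we now wish to approximate $\sin x$ by the degree $n$ polynomial
\[ P_n(x) = \sum_{k=0}^{(n-1)/2} \frac{(-1)^k}{(2k+1)!}\,x^{2k+1}. \tag{5.30} \]
(We could also consider even degrees $n$, in which case the above formula is slightly different: we sum up to the index $n/2 - 1$ rather than the index $(n-1)/2$.) The error in this approximation is given by $R_n(x) = \frac{\sin^{(n+1)}(c)}{(n+1)!}\,x^{n+1}$. Since $|\sin^{(n+1)}(c)| \le 1$ for all $c \in \mathbb{R}$, we have
\[ |R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}. \tag{5.31} \]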
This generalises the bounds (5.27) and (5.28), which correspond to the $n = 0$ and $n = 1$ cases, respectively. As $n$ gets larger, we are approximating $\sin$ using a higher and higher degree polynomial $P_n$. Correspondingly, the right-hand side of (5.31) gets smaller and smaller. (At least for $|x| \le 1$. If $|x|$ is large, then the right-hand side of (5.31) may increase in $n$ for the first few values of $n$, but then will later decrease down towards $0$.) Thus, for large $n$ the polynomial $P_n$ gives an extremely good approximation to $\sin$ around $0$, and the approximation gets better as $n$ increases. We illustrate this in Figure 5.17.
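The behaviour of the bound can be explored numerically. The following minimal Python sketch assumes the function in this example is $\sin$ centred at $0$ and that the bound (5.31) reads $|x|^{n+1}/(n+1)!$. The names `sin_taylor` and `remainder_bound`, and the sample point $x = 2$, are illustrative choices rather than anything fixed by the notes.

```python
import math

def sin_taylor(x, n):
    """Taylor polynomial of sin at 0 of degree n: only odd powers up to n appear."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range((n + 1) // 2))

def remainder_bound(x, n):
    """The assumed bound |x|^(n+1) / (n+1)! on the size of the remainder."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

x = 2.0
for n in (1, 3, 5, 7, 9):
    error = abs(math.sin(x) - sin_taylor(x, n))
    # The true error should never exceed the bound from Taylor's theorem.
    assert error <= remainder_bound(x, n)
    print(f"n = {n}: error = {error:.3e}, bound = {remainder_bound(x, n):.3e}")
```

The upper limit `(n + 1) // 2` in `range` covers both parities: for odd $n$ the last index is $(n-1)/2$, and for even $n$ it is $n/2 - 1$.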
We can apply Taylor’s theorem to answer the question posed in 5.66.
We can use Example 5.75 to compute the first decimal digits of .
To illustrate the approach, we approximate by the Taylor polynomial , which can be computed using basic arithmetic. Observe that
The question is: how far is from the true value of ? From the bound (5.31) derived from Taylor’s theorem, the remainder satisfies
The fact that there are 5 zeros after the decimal point here indicates that our approximation is correct in its first 5 decimal digits: . Note that our earlier calculation (5.32) shows that
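The same strategy can be sketched in code: raise the degree until the remainder bound falls below a chosen tolerance, then evaluate the polynomial using only addition, multiplication and division. The Python sketch below assumes the function is $\sin$ centred at $0$ with remainder bound $|x|^{n+1}/(n+1)!$. The function name `sin_to_tolerance`, the input $x = 0.5$ and the tolerance $10^{-6}$ are hypothetical and are not the values used in the example above.

```python
import math

def sin_to_tolerance(x, tol):
    """Approximate sin(x) by a Taylor polynomial at 0 whose remainder bound is below tol.

    Returns the approximation together with the (odd) degree n that was used.
    """
    n = 1
    # Increase the odd degree until |x|^(n+1) / (n+1)! < tol.
    while abs(x) ** (n + 1) / math.factorial(n + 1) >= tol:
        n += 2
    value = sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
                for k in range((n + 1) // 2))
    return value, n

value, degree = sin_to_tolerance(0.5, 1e-6)  # hypothetical input and tolerance
print(degree, value, math.sin(0.5))
```

The library call `math.sin` appears only as a check on the result; the approximation itself does not rely on it.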
Taylor series expansion
If $f$ is an infinitely differentiable function, then we can define the Taylor polynomials $P_n$ for all degrees $n \in \mathbb{N}_0$. These form a sequence of polynomials which provide increasingly accurate approximations to the function $f$. It is therefore natural to ask whether $P_n(x) \to f(x)$ as $n \to \infty$. This question leads us to consider Taylor series. We illustrate this concept by considering the familiar example of the $\sin$ function.
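For all $x \in \mathbb{R}$, we have
\[ \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\,x^{2k+1}. \tag{5.34} \]
For $x \in \mathbb{R}$ and $m \in \mathbb{N}_0$, let $S_m(x) := \sum_{k=0}^{m} \frac{(-1)^k}{(2k+1)!}\,x^{2k+1}$ denote the $m$th partial sum of the series on the right-hand side of (5.34). From (5.30), we see that $S_m(x) = P_{2m+1}(x)$. As observed in Example 5.75, Taylor’s theorem ensures that
\[ |\sin x - S_m(x)| = |R_{2m+1}(x)| \le \frac{|x|^{2m+2}}{(2m+2)!}. \]
We have $\lim_{m \to \infty} \frac{|x|^{2m+2}}{(2m+2)!} = 0$; see Exercise 5.78. Thus, the sequence of partial sums satisfies $S_m(x) \to \sin x$ as $m \to \infty$, which is precisely the identity (5.34). ∎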
By considering the series $\sum_{n=0}^{\infty} \frac{|x|^n}{n!}$, show that $\lim_{n \to \infty} \frac{|x|^n}{n!} = 0$ holds for all $x \in \mathbb{R}$.
The series on the right-hand side of (5.34) is called the Taylor series of $\sin$ centred at $0$. More generally, we have the following definition.
Given an open interval $I \subseteq \mathbb{R}$, an infinitely differentiable function $f \colon I \to \mathbb{R}$ and $a \in I$, we define the Taylor series of $f$ centred at $a$ to be the formal series
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k. \tag{5.35} \]
We write formal here to indicate the fact that we do not know, in general, whether (5.35) converges. However, for certain familiar functions $f$, the Taylor series of $f$ does converge back to $f$.
The case $a = 0$ of a Taylor series is sometimes called a Maclaurin series, at least within the UK. Maclaurin was a professor of mathematics in Edinburgh in the eighteenth century and his grave can be found in Greyfriars Kirkyard.
Show that holds for all .
Compute the Taylor series of centred at . What do you notice about the series that you obtain?
Only very special functions can be expressed in terms of a Taylor series and, in general, many things can go wrong:
• The function $f$ may not be infinitely differentiable, so that its Taylor series is not defined;
• Even if $f$ is infinitely differentiable, so we can define the Taylor series, there is no guarantee that the Taylor series will converge for all values of $x$ in the domain of $f$;
• Even if the Taylor series does converge at a given point $x$ in the domain of $f$, there is no guarantee that the limit is equal to $f(x)$.
Some of these subtleties are illustrated in Example 5.84 below. Understanding convergence of Taylor series is a subtle problem, which is explored in year 3 analysis courses.
One can show that the Taylor series of $f(x) = \frac{1}{1 + x^2}$ centred at $0$ is given by
\[ \sum_{k=0}^{\infty} (-1)^k x^{2k}. \]
Computing the Taylor series by arguing directly from the definition is a little tricky (and not recommended), but it becomes a lot easier using some more advanced theory. You will learn more about this if you take the year 3 analysis courses. On the other hand, using what we already know about geometric series,
\[ \sum_{k=0}^{\infty} (-1)^k x^{2k} = \sum_{k=0}^{\infty} (-x^2)^k = \frac{1}{1 + x^2} \quad \text{for all } |x| < 1. \]
However, we can also see that the series diverges for all $x$ with $|x| \ge 1$ (for instance, by the $n$th term test: the sequence of terms $(-1)^k x^{2k}$ does not converge to $0$ when $|x| \ge 1$, so the series must diverge).
Thus, $f$ is an example of a function which is infinitely differentiable on the whole of $\mathbb{R}$, but the Taylor series only converges on the interval $(-1, 1)$. The reason for this behaviour becomes clear once we move to the complex plane: to understand it we need ideas from complex analysis.
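The contrast between convergence inside the interval and divergence outside it can be seen numerically. The Python sketch below assumes the function in this example is $f(x) = 1/(1 + x^2)$ with Taylor series $\sum_{k \ge 0} (-1)^k x^{2k}$ centred at $0$. The function name `partial_sum` and the sample points $0.5$ and $1.5$ are illustrative choices.

```python
def partial_sum(x, m):
    """Partial sum sum_{k=0}^{m} (-1)^k x^(2k) of the assumed Taylor series of 1/(1+x^2)."""
    return sum((-1) ** k * x ** (2 * k) for k in range(m + 1))

for x in (0.5, 1.5):
    target = 1.0 / (1.0 + x * x)
    # Distance between the partial sums and the function value for growing m.
    errors = [abs(partial_sum(x, m) - target) for m in (5, 10, 20)]
    print(f"x = {x}: errors = {errors}")
```

At $x = 0.5$ the errors should shrink rapidly, while at $x = 1.5$ they should grow without bound, even though the function itself is perfectly well behaved there.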