5.8 Taylor’s theorem

To conclude the course, we now turn to a simple and practical problem: how do we actually compute values of functions?

Practical computation

As an example, consider the function $\sin$ . For certain $a\in\mathbb{R}$ , the value of $\sin(a)$ is clear from the definition (in terms of points on the unit circle): for instance, we all know that

\sin(0)=0,\qquad\sin(\pi/4)=1/\sqrt{2},\qquad\sin(\pi/2)=1.

However, given some generic $a\in\mathbb{R}$ , how would you work out the value of $\sin(a)$ ? For instance,

Question 5.66.

How can we compute $\sin(1/2)$ ?

Of course, we can simply plug $\sin(1/2)$ into a calculator and see that

\sin(1/2)=0.4794255386\dots,

but where do these digits come from? Do you know a way to compute them by hand?

There are very few functions whose values we can compute directly. Typically, to compute $f(a)$ we are forced to work with some simpler function $p$ which approximates $f$ near $a$ . The idea is to work with some $p$ whose values we actually can compute, and then relate this information back to $f$ .

In this section, we investigate approximating smooth functions by polynomials. The key result is Taylor’s theorem, which tells us how close our approximation is to the true value of $f$ . Polynomials are great for practical computation: they only involve addition and multiplication, so their values are (relatively) easy to work out. Using Taylor’s theorem, we will develop a practical answer to Question 5.66.

Approximation by constants

In order to understand Taylor’s theorem, we first return to the mean value theorem and give a third and final (as far as this course is concerned) interpretation of this result.

MVT Interpretation 3: Approximation by a constant function.

Let $I\subseteq\mathbb{R}$ be an open interval, $f\colon I\to\mathbb{R}$ be differentiable and $a\in I$ . For any $x\in I$ with $x>a$ , the function $f$ satisfies the hypotheses of the mean value theorem on the interval $[a,x]$ . If $x\in I$ and $x<a$ , then the same holds for the interval $[x,a]$ . From this, we deduce that for all $x\in I\setminus\{a\}$ there exists some $c_{x}\in I$ lying between $a$ and $x$ (that is, $c_{x}\in(a,x)$ if $x>a$ or $c_{x}\in(x,a)$ if $x<a$ ; equivalently, $c_{x}\in(\min\{a,x\},\max\{a,x\})$ ) such that

(5.25)

f(x)=f(a)+f^{\prime}(c_{x})(x-a).

Note that the above identity trivially holds in the case $x=a$ for any choice of $c_{a}\in I$ .

We rewrite (5.25) in new notation as

(5.26)

f(x)=P_{0}^{f,a}(x)+R_{0}^{f,a}(x)

where

P_{0}^{f,a}(x):=f(a)\quad\text{for all $x\in\mathbb{R}$}\qquad\text{and}\qquad R_{0}^{f,a}(x):=f^{\prime}(c_{x})(x-a)\quad\text{for all $x\in I$.}

This notation seems a bit clunky right now, but it will be helpful for comparing with what comes later.

In our third interpretation, we think of $P_{0}^{f,a}\colon\mathbb{R}\to\mathbb{R}$ as a constant function (which is the simplest kind of polynomial). The identity (5.26) can be interpreted as approximating $f$ by the constant function $P_{0}^{f,a}$ . The function $R_{0}^{f,a}$ corresponds to the remainder (or error) of the approximation.

Graphically, the above corresponds to approximating the graph of $f$ by the horizontal line $y=f(a)$ . We illustrate this in Figure 5.15. The remainder $R_{0}^{f,a}(x)$ tells us how far points on the graph of $f$ are from the approximating straight line.

Figure 5.15: Approximating $\sin$ using the constant polynomial $P_{0}^{\sin,0}$ . The remainder $R_{0}^{\sin,0}(x)$ is the difference in heights between the graph of $\sin$ and the graph of the approximating function.

Example 5.67.

Consider the function $\sin\colon\mathbb{R}\to\mathbb{R}$ around the point $a:=0$ . We wish to crudely approximate $\sin$ by the constant function $P_{0}^{\sin,0}(x):=0$ . From (5.26), the error in this approximation is given by $R_{0}^{\sin,0}(x)=\sin^{\prime}(c_{x})(x-0)$ . Since $|\sin^{\prime}(c)|=|\cos(c)|\leq 1$ for all $c\in\mathbb{R}$ , we have

(5.27)

\left|R_{0}^{\sin,0}(x)\right|\leq|x|.

In particular, if $x$ is close to $0$ , then we can guarantee the error in the crude approximation is reasonably small: it is at most $|x|$ .

Approximation by linear polynomials

In general, the horizontal line $y=f(a)$ is not a good choice of approximating line for the graph of our function $f$ . A better choice is to take the tangent line. Recall, the tangent line to the graph of $f$ at the point $(a,f(a))$ is the line which passes through $(a,f(a))$ and has gradient $f^{\prime}(a)$ . This is graphed by the function

P_{1}^{f,a}\colon\mathbb{R}\to\mathbb{R},\qquad P_{1}^{f,a}(x):=f(a)+f^{\prime}(a)(x-a).

Indeed, we can check this by noting that $P_{1}^{f,a}(a)=f(a)$ and $(P_{1}^{f,a})^{\prime}(x)=f^{\prime}(a)$ for all $x\in\mathbb{R}$ . The following lemma explores what happens when we approximate $f$ by $P_{1}^{f,a}$ .

Lemma 5.68 (Taylor’s theorem: linear case).

Let $I\subseteq\mathbb{R}$ be an open interval and suppose $f\colon I\to\mathbb{R}$ is twice differentiable. Given $a\in I$ and $x\in I\setminus\{a\}$ , there exists some $c_{x}\in I$ lying between $a$ and $x$ such that

f(x)=f(a)+f^{\prime}(a)(x-a)+R_{1}^{f,a}(x)\qquad\text{where}\qquad R_{1}^{f,a}(x):=\frac{f^{\prime\prime}(c_{x})}{2}(x-a)^{2}.

Let’s try to understand what Lemma 5.68 is telling us, and how it compares with what we know from the mean value theorem. The quantity $R_{1}^{f,a}(x)$ is the remainder when we approximate $f(x)$ by the linear polynomial $P_{1}^{f,a}(x):=f(a)+f^{\prime}(a)(x-a)$ . We can see this graphically in Figure 5.16(b) . In particular, we can think of $\left|R_{1}^{f,a}(x)\right|$ as a measure of how accurately the tangent line approximates the graph of $f$ .

(a) The constant polynomial $P_{0}^{\sin,0}$ .

(b) The linear polynomial $P_{1}^{\sin,0}$ .

Figure 5.16: Approximating $\sin$ using low degree polynomials. The remainders correspond to the difference in heights between the graph of $\sin$ and the graph of the approximating function. We see that $R_{1}^{\sin,0}(x)$ is much smaller than $R_{0}^{\sin,0}(x)$ .

Example 5.69.

Continuing with the setup from Example 5.67, we now wish to approximate $\sin$ by the linear polynomial

P_{1}^{\sin,0}(x):=\sin(0)+\cos(0)x=x.

The error in this approximation is given by $R_{1}^{\sin,0}(x)$ . Since $|\sin^{\prime\prime}(c)|=|\sin c|\leq 1$ for all $c\in\mathbb{R}$ , we have

(5.28)

\left|R_{1}^{\sin,0}(x)\right|\leq\frac{|x|^{2}}{2}\qquad\text{for all $x\in\mathbb{R}$.}

Thus, if $x$ is close to $0$ , then we can guarantee the error in this approximation is very small: it is at most $|x|^{2}/2$ .

Comparing (5.27) and (5.28), if $x$ is close to $0$ , then $|x|^{2}/2$ is much, much smaller than $|x|$ . As we expect, for values of $x$ close to $0$ , the tangent line provides a much better approximation to the graph of $\sin$ than the horizontal line $y=0$ . We can see the difference in size between the remainders $R_{0}^{\sin,0}(x)$ and $R_{1}^{\sin,0}(x)$ by comparing Figure 5.16(a) and Figure 5.16(b).
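The two bounds can also be checked numerically. The following is a minimal sketch in Python, not part of the course material: it uses the library function `math.sin` as a stand-in for the true value of $\sin$ , and the helper name `remainders` is our own choice.

```python
import math

def remainders(x):
    """Return (R_0, R_1) for sin centred at 0: the errors of P_0(x) = 0 and P_1(x) = x."""
    r0 = math.sin(x) - 0.0   # R_0^{sin,0}(x) = sin(x) - P_0^{sin,0}(x)
    r1 = math.sin(x) - x     # R_1^{sin,0}(x) = sin(x) - P_1^{sin,0}(x)
    return r0, r1

for x in [0.5, 0.1, 0.01]:
    r0, r1 = remainders(x)
    # Bounds (5.27) and (5.28): |R_0| <= |x| and |R_1| <= |x|^2 / 2.
    print(f"x={x}: |R_0|={abs(r0):.2e} (bound {abs(x):.2e}), "
          f"|R_1|={abs(r1):.2e} (bound {x * x / 2:.2e})")
```

As $x$ shrinks, the printed values of $|R_{1}|$ fall much faster than those of $|R_{0}|$ , in line with the comparison of $|x|^{2}/2$ and $|x|$ above.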

We won’t prove Lemma 5.68 right now. Instead, we’ll discuss and prove a more general statement (Taylor’s theorem) which includes Lemma 5.68 as a special case.

Warning 5.70.

It is important to note that $P_{1}^{\sin,0}(x)$ is only a better approximation to $\sin(x)$ for values of $x$ close to $0$ . For instance, if we take $x=\pi$ (which we think of as being far from $0$ ), then

\sin(\pi)=0,\qquad P_{0}^{\sin,0}(\pi)=0\qquad\text{and}\qquad P_{1}^{\sin,0}(\pi)=\pi.

In particular, $R_{0}^{\sin,0}(\pi)=0$ and $R_{1}^{\sin,0}(\pi)=-\pi$ , so the constant function $P_{0}^{\sin,0}$ is a better approximation than the linear polynomial $P_{1}^{\sin,0}$ at $x=\pi$ .

In light of Warning 5.70, we typically consider the problem of approximating a given function $f$ locally around a point $a$ . That is, in practice we work with values of $x$ close to $a$ .

Approximation by polynomials

The mean value theorem is about approximating $f$ by a constant and Lemma 5.68 is about approximating $f$ by a linear function, around some fixed point $a\in I$ . What happens if we try to approximate $f$ by a more general polynomial function?

For example, suppose we want to approximate $f$ by a quadratic polynomial,

P_{2}^{f,a}(x):=c_{0}+c_{1}(x-a)+c_{2}(x-a)^{2}

for some coefficients $c_{0}$ , $c_{1}$ , $c_{2}\in\mathbb{R}$ . Notice that we have chosen to express the polynomial in a particular form so that it is centred around $a$ ; this is natural, since we are trying to approximate $f$ around $a$ . The first question is: how do we choose the coefficients $c_{0}$ , $c_{1}$ and $c_{2}$ ?

The constant polynomial $P_{0}^{f,a}$ was chosen so that $P_{0}^{f,a}(a)=f(a)$ . The linear polynomial $P_{1}^{f,a}$ was chosen so that $P_{1}^{f,a}(a)=f(a)$ and $(P_{1}^{f,a})^{\prime}(a)=f^{\prime}(a)$ . It therefore makes sense to continue this pattern, and choose our coefficients so that

P_{2}^{f,a}(a)=f(a),\qquad(P_{2}^{f,a})^{\prime}(a)=f^{\prime}(a)\qquad\text{and}\qquad(P_{2}^{f,a})^{\prime\prime}(a)=f^{\prime\prime}(a).

Since we have $(P_{2}^{f,a})^{\prime}(x)=c_{1}+2c_{2}(x-a)$ and $(P_{2}^{f,a})^{\prime\prime}(x)=2c_{2}$ for all $x\in\mathbb{R}$ , this forces us to take $c_{0}:=f(a)$ , $c_{1}:=f^{\prime}(a)$ and $c_{2}:=\frac{f^{\prime\prime}(a)}{2}$ , giving

(5.29)

P_{2}^{f,a}(x):=f(a)+f^{\prime}(a)(x-a)+\frac{f^{\prime\prime}(a)}{2}(x-a)^{2}.

This suggests that, if we wish to approximate $f$ by a quadratic polynomial around $a$ , then (5.29) is our best bet for the approximant. We can push this idea further and consider higher degrees.

Definition 5.71.

Let $I\subseteq\mathbb{R}$ be an open interval, $n\in\mathbb{N}_{0}$ and $f\colon I\to\mathbb{R}$ be $n$ -times differentiable and $a\in I$ . The polynomial (where, when $n=0$ , we interpret the sum as equal to $0$ , so that $P_{0}^{f,a}(x):=f(a)$ as above)

P_{n}^{f,a}(x):=f(a)+\sum_{k=1}^{n}\frac{f^{(k)}(a)}{k!}(x-a)^{k}

is called the Taylor polynomial of degree $n$ at $a$ .
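The definition translates directly into a short computation. The sketch below is illustrative only: the function name `taylor_polynomial` and the choice of test function $f(x)=x^{3}$ at $a=1$ are our own assumptions, and the caller is expected to supply the derivative values $f(a),f^{\prime}(a),\dots,f^{(n)}(a)$ by hand.

```python
from math import factorial

def taylor_polynomial(derivs_at_a, a, x):
    """Evaluate P_n^{f,a}(x), given derivs_at_a = [f(a), f'(a), ..., f^(n)(a)]."""
    # Python evaluates (x - a) ** 0 as 1, so the k = 0 term is simply f(a).
    return sum(d / factorial(k) * (x - a) ** k
               for k, d in enumerate(derivs_at_a))

# Assumed example: f(x) = x^3 and a = 1, so f(1) = 1, f'(1) = 3, f''(1) = 6, f'''(1) = 6.
derivs = [1.0, 3.0, 6.0, 6.0]
for n in range(4):
    value = taylor_polynomial(derivs[: n + 1], 1.0, 2.0)
    print(f"P_{n}(2) = {value}   (f(2) = 8)")
```

For this cubic, the degree $3$ Taylor polynomial reproduces $f$ exactly, while the lower degrees give successively closer values at $x=2$ .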

Remark 5.72.

It is often convenient to write

P_{n}^{f,a}(x)=\sum_{k=0}^{n}\frac{f^{(k)}(a)}{k!}(x-a)^{k}

where we adopt the conventions $f^{(0)}:=f$ , $0!:=1$ and $(x-a)^{0}:=1$ (with the latter convention holding even if $x=a$ ).

Exercise 5.73.

Let $I\subseteq\mathbb{R}$ be an open interval, $n\in\mathbb{N}_{0}$ , $f\colon I\to\mathbb{R}$ be $n$ -times differentiable and $a\in I$ .

(i)

For $j$ , $k\in\mathbb{N}_{0}$ with $j\leq k$ , show that $\frac{\mathrm{d}^{j}}{\mathrm{d}x^{j}}(x-a)^{k}=\frac{k!}{(k-j)!}(x-a)^{k-j}$ .
(ii)

For $j$ , $k\in\mathbb{N}_{0}$ with $j>k$ , show that $\frac{\mathrm{d}^{j}}{\mathrm{d}x^{j}}(x-a)^{k}=0$ .
(iii)

Use the above to conclude that the derivatives of $P_{n}^{f,a}$ satisfy

$(P_{n}^{f,a})^{(j)}(a)=f^{(j)}(a)\qquad\text{for $0\leq j\leq n$.}$

We now turn to Taylor’s theorem. As in Lemma 5.68, the idea is that the polynomial $P_{n}^{f,a}$ should give a good approximation to the function $f$ near the point $a$ . Taylor’s theorem tells us how close we can expect the approximation to be to the true value of $f$ .

Theorem 5.74 (Taylor’s theorem).

Let $I\subseteq\mathbb{R}$ be an open interval, $n\in\mathbb{N}_{0}$ and $f\colon I\to\mathbb{R}$ be $(n+1)$ -times differentiable. For all $a\in I$ and $x\in I\setminus\{a\}$ there exists a number $c_{x}$ lying between $x$ and $a$ , which depends on $n$ , $x$ and $a$ , such that

f(x)=P_{n}^{f,a}(x)+R_{n}^{f,a}(x)\qquad\text{where}\qquad R_{n}^{f,a}(x):=\frac{f^{(n+1)}(c_{x})}{(n+1)!}(x-a)^{n+1}.

Here $R_{n}^{f,a}(x)$ is the remainder when we approximate the value $f(x)$ by $P_{n}^{f,a}(x)$ .

We will return to prove Taylor’s theorem in the next section. For now we note some applications and consequences of the result.

Example 5.75.

Continuing with the setup from Examples 5.67 and 5.69, fixing $n\in\mathbb{N}$ odd, we now wish to approximate $\sin$ by the degree $n$ polynomial

(5.30)

P_{n}^{\sin,0}(x):=\sum_{k=0}^{n}\frac{\sin^{(k)}(0)}{k!}x^{k}=\sum_{k=0}^{\ell}\frac{(-1)^{k}}{(2k+1)!}x^{2k+1}\qquad\text{where $n=2\ell+1$ for $\ell\in\mathbb{N}_{0}$.}

We could also consider even degrees $n=2\ell$ , in which case the above formula is slightly different (we sum up to the index $\ell-1$ rather than the index $\ell$ ). The error in this approximation is given by $R_{n}^{\sin,0}(x)$ . Since $|\sin^{(n+1)}(c)|\leq 1$ for all $c\in\mathbb{R}$ , we have

(5.31)

\left|R_{n}^{\sin,0}(x)\right|\leq\frac{|x|^{n+1}}{(n+1)!}\qquad\text{for all $x\in\mathbb{R}$.}

This generalises the bounds (5.27) and (5.28), which correspond to the $n=0$ and $n=1$ cases, respectively. As $n$ gets larger, we are approximating $\sin$ using a higher and higher degree polynomial $P_{n}^{\sin,0}$ . Correspondingly, the right-hand side of (5.31) gets smaller and smaller, at least for $|x|\leq 1$ (if $|x|$ is large, then the right-hand side of (5.31) may increase in $n$ for the first few values of $n$ , but will later decrease towards $0$ ). Thus, for large $n$ the polynomial $P_{n}^{\sin,0}$ gives an extremely good approximation to $\sin$ around $0$ , and the approximation gets better as $n$ increases. We illustrate this in Figure 5.17.
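To see the bound (5.31) in action, here is a small Python sketch. It is an illustration under our own naming (`sin_taylor` and `remainder_bound` are hypothetical helpers), and it uses `math.sin` as the reference value for $\sin$ .

```python
import math

def sin_taylor(n, x):
    """Evaluate P_n^{sin,0}(x): the sum of (-1)^k x^(2k+1) / (2k+1)! over all k with 2k+1 <= n."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range((n + 1) // 2))

def remainder_bound(n, x):
    """Right-hand side of (5.31): |x|^(n+1) / (n+1)!."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

x = 1.0
for n in [1, 3, 5, 7]:
    error = abs(math.sin(x) - sin_taylor(n, x))
    print(f"n={n}: error={error:.3e}, bound={remainder_bound(n, x):.3e}")
```

The upper limit `(n + 1) // 2` in the sum covers both parities at once: for $n=2\ell+1$ it sums up to the index $\ell$ , and for $n=2\ell$ it sums up to the index $\ell-1$ .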

Note on (5.30): the second equality relies on the fact that

\sin^{(k)}(0)=\begin{cases}(-1)^{\ell}&\text{if $k=2\ell+1$ for some $\ell\in\mathbb{N}_{0}$,}\\ 0&\text{if $k$ is even.}\end{cases}

Can you see why this is true?

Figure 5.17: Successive Taylor polynomials $P_{n}^{\sin,0}$ for $n=1$ , $3$ , $5$ , $7$ . As the degree increases, near $0$ the polynomial $P_{n}^{\sin,0}$ provides a more and more accurate approximation for $\sin$ .

We can apply Taylor’s theorem to answer Question 5.66.

Example 5.76.

We can use Example 5.75 to compute the first few decimal digits of $\sin(1/2)$ .

To illustrate the approach, we approximate $\sin$ by the Taylor polynomial $P_{6}^{\sin,0}$ , which can be computed using basic arithmetic. Observe that

(5.32)

P_{6}^{\sin,0}(1/2)=\frac{1}{2}-\frac{1}{2^{3}}\cdot\frac{1}{3!}+\frac{1}{2^{5}}\cdot\frac{1}{5!}=\frac{1}{2}-\frac{1}{48}+\frac{1}{3840}=\frac{1841}{3840}=0.47942708333\dots.

The question is: how far is $P_{6}^{\sin,0}(1/2)$ from the true value of $\sin(1/2)$ ? From the bound (5.31) derived from Taylor’s theorem, the remainder satisfies

(5.33)

\left|\sin(1/2)-P_{6}^{\sin,0}(1/2)\right|=\left|R_{6}^{\sin,0}(1/2)\right|\leq\frac{1}{2^{7}}\cdot\frac{1}{7!}=\frac{1}{645120}<0.\underline{00000}16.

The fact that there are 5 zeros after the decimal point here indicates that our approximation is correct in its first 5 decimal digits: $0.47942$ . Note that our earlier calculation (5.32) shows that

\underline{0.47942}7\leq P_{6}^{\sin,0}(1/2)\leq\underline{0.47942}8.

Now, suppose $\sin(1/2)\leq 0.47942$ . In this case,

|\sin(1/2)-P_{6}^{\sin,0}(1/2)|=P_{6}^{\sin,0}(1/2)-\sin(1/2)\geq 0.479427-0.47942=0.000007,

which contradicts (5.33). Similarly, suppose $\sin(1/2)\geq 0.47943$ . Then

|\sin(1/2)-P_{6}^{\sin,0}(1/2)|=\sin(1/2)-P_{6}^{\sin,0}(1/2)\geq 0.47943-0.479428=0.000002,

which again contradicts (5.33). Thus, we must have $0.47942<\sin(1/2)<0.47943$ which tells us that the first $5$ decimal digits of $\sin(1/2)$ are $0.47942$ .
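The arithmetic in this example can be reproduced exactly using rational numbers. The following sketch uses Python’s standard `fractions` module; the variable names are our own, and the code only re-derives (5.32) and (5.33) rather than adding anything new.

```python
from fractions import Fraction
from math import factorial

x = Fraction(1, 2)

# P_6^{sin,0}(1/2) = x - x^3/3! + x^5/5!  (the degree-6 term vanishes).
p6 = x - x**3 / factorial(3) + x**5 / factorial(5)

# Remainder bound from (5.31) with n = 6: |x|^7 / 7!.
bound = x**7 / factorial(7)

# By (5.33), sin(1/2) lies in the closed interval [p6 - bound, p6 + bound].
lower, upper = p6 - bound, p6 + bound
print(p6, bound)                   # exact fractions
print(float(lower), float(upper))  # decimal endpoints of the interval
```

Both endpoints of the interval lie strictly between $0.47942$ and $0.47943$ , which is another way of seeing that the first $5$ decimal digits of $\sin(1/2)$ are $0.47942$ .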

Taylor series expansion

If $f$ is an infinitely differentiable function, then we can define the Taylor polynomials $P_{n}^{f,a}$ for all degrees $n\in\mathbb{N}$ . This gives a sequence of polynomials which we expect to provide increasingly accurate approximations to the function $f$ near $a$ . It is therefore natural to ask whether $P_{n}^{f,a}(x)\to f(x)$ as $n\to\infty$ . This question leads us to consider Taylor series. We illustrate this concept by considering the familiar example of the $\sin$ function.

Theorem 5.77 (Taylor series expansion for $\sin$ ).

For all $x\in\mathbb{R}$ , we have

(5.34)

\sin x=\sum_{k=0}^{\infty}\frac{(-1)^{k}x^{2k+1}}{(2k+1)!}=x-\frac{x^{3}}{3!}+\frac{x^{5}}{5!}-\frac{x^{7}}{7!}+\cdots.

Proof.

For $x\in\mathbb{R}$ and $\ell\in\mathbb{N}_{0}$ , let $s_{\ell}(x)$ denote the partial sum of the series on the right-hand side of (5.34) up to and including the index $k=\ell$ . From (5.30), we see that $s_{\ell}(x)=P_{2\ell+1}^{\sin,0}(x)$ . As observed in Example 5.75, Taylor’s theorem ensures that

|\sin x-s_{\ell}(x)|=\left|\sin x-P_{2\ell+1}^{\sin,0}(x)\right|=\left|R_{2\ell+1}^{\sin,0}(x)\right|\leq\frac{|x|^{2\ell+2}}{(2\ell+2)!}.

We have $\displaystyle\lim_{\ell\to\infty}\frac{|x|^{2\ell+2}}{(2\ell+2)!}=0$ ; see Exercise 5.78. Thus, the sequence of partial sums $(s_{\ell}(x))_{\ell\in\mathbb{N}_{0}}$ satisfies $s_{\ell}(x)\to\sin x$ as $\ell\to\infty$ , which is precisely the identity (5.34). ∎
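The limit used in the proof is a statement about every fixed $x$ , including large ones. As a numerical illustration only (it proves nothing), the sketch below tabulates $|x|^{n+1}/(n+1)!$ for the assumed sample value $x=10$ ; the helper name `bound` is our own.

```python
from math import factorial

def bound(n, x):
    """|x|^(n+1) / (n+1)!, the remainder bound for sin from (5.31)."""
    return abs(x) ** (n + 1) / factorial(n + 1)

x = 10.0
values = [bound(n, x) for n in range(60)]

# The bound grows at first, because each step multiplies it by |x| / (n + 2),
# and only starts to shrink once n + 2 exceeds |x|.
for n in [0, 5, 9, 20, 40, 59]:
    print(f"n={n}: {values[n]:.3e}")
```

For $x=10$ the bound initially grows into the thousands before the factorial in the denominator takes over and drives it towards $0$ .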

Exercise 5.78.

By considering the series $\displaystyle\sum_{n=1}^{\infty}\frac{|x|^{n+1}}{(n+1)!}$ , show that $\displaystyle\lim_{n\to\infty}\frac{|x|^{n+1}}{(n+1)!}=0$ holds for all $x\in\mathbb{R}$ .

The series on the right-hand side of (5.34) is called the Taylor series of $\sin x$ centred at $0$ . More generally, we have the following definition.

Definition 5.79.

Given an open interval $I\subseteq\mathbb{R}$ , an infinitely differentiable function $f\colon I\to\mathbb{R}$ and $a\in I$ , we define the Taylor series of $f$ centred at $a$ to be the formal series

(5.35)

\sum_{k=0}^{\infty}\frac{f^{(k)}(a)}{k!}(x-a)^{k}.

We write formal here to indicate the fact that we do not know, in general, whether (5.35) converges. However, for certain familiar functions $f$ , the Taylor series of $f$ does converge back to $f$ .

Remark 5.80.

The case $a=0$ of a Taylor series is sometimes called a Maclaurin series, at least within the UK. Maclaurin was a professor of mathematics in Edinburgh in the eighteenth century and his grave can be found in Greyfriars Kirkyard.

Exercise 5.81.

Show that $\displaystyle\cos x=\sum_{k=0}^{\infty}\frac{(-1)^{k}x^{2k}}{(2k)!}$ holds for all $x\in\mathbb{R}$ .

Exercise 5.82.

Compute the Taylor series of $\exp$ centred at $0$ . What do you notice about the series that you obtain?

Warning 5.83.

Only very special functions can be expressed in terms of a Taylor series and, in general, many things can go wrong:

•

The function $f$ may not be infinitely differentiable, so that its Taylor series is not defined;
•

Even if $f$ is infinitely differentiable, so we can define the Taylor series, there is no guarantee that the Taylor series will converge for all values of $x$ in the domain of $f$ ;
•

Even if the Taylor series does converge at a given point $x$ in the domain of $f$ , there is no guarantee that the limit is equal to $f(x)$ .

Some of these subtleties are illustrated in Example 5.84 below. Understanding convergence of Taylor series is a subtle problem, which is explored in year 3 analysis courses.

Example 5.84.

One can show that the Taylor series of $(1+x^{2})^{-1}$ centred at $0$ is given by

\sum_{k=0}^{\infty}(-1)^{k}x^{2k}.

Computing this Taylor series directly from the definition is a little tricky (and not recommended), but it becomes a lot easier using some more advanced theory. You will learn more about this if you take the year 3 analysis courses. On the other hand, using what we already know about geometric series,

\frac{1}{1+x^{2}}=\sum_{k=0}^{\infty}(-1)^{k}x^{2k}\qquad\text{for all $x\in(-1,1)$.}

However, we can also see that the series diverges for all $x\in\mathbb{R}$ with $|x|\geq 1$ (for instance, by the $k$ th term test: the sequence of terms $((-1)^{k}x^{2k})_{k\in\mathbb{N}_{0}}$ does not converge to $0$ when $|x|\geq 1$ , so the series must diverge).

Thus, $x\mapsto(1+x^{2})^{-1}$ is an example of a function which is infinitely differentiable on the whole of $\mathbb{R}$ , but the Taylor series only converges on the interval $(-1,1)$ . The reason for this behaviour becomes clear once we move to the complex plane: to understand it we need ideas from complex analysis.
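The contrast between the two regimes in Example 5.84 is easy to observe numerically. In the sketch below, the function name `partial_sum` and the sample points $x=0.5$ and $x=1.5$ are our own choices for illustration.

```python
def partial_sum(x, n):
    """Sum of (-1)^k x^(2k) for k = 0, ..., n."""
    return sum((-1) ** k * x ** (2 * k) for k in range(n + 1))

for x in [0.5, 1.5]:
    target = 1 / (1 + x * x)
    sums = [partial_sum(x, n) for n in (5, 10, 20)]
    print(f"x={x}: 1/(1+x^2)={target:.6f}, partial sums={sums}")
```

At $x=0.5$ the partial sums settle down to $1/(1+x^{2})=0.8$ , whereas at $x=1.5$ they oscillate with rapidly growing magnitude, even though $1/(1+x^{2})$ itself is perfectly well defined there.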