5.4 The differential inverse function theorem

Suppose $f\colon I\to J$ is bijective, so that we may define the inverse $f^{-1}\colon J\to I$. The graph of $f^{-1}$ is given by the reflection of the graph of $f$ along the diagonal. If $f$ is differentiable at $a\in I$, then from the picture in Figure 5.9 we intuitively expect that $f^{-1}$ should be differentiable at $b:=f(a)\in J$. Indeed, the tangent line to $f^{-1}$ at $b$ should correspond to the reflection of the tangent line to $f$ at $a$. Moreover, reflecting across the diagonal swaps the ‘horizontal change’ and ‘vertical change’ of the lines. Thus, we expect $(f^{-1})^{\prime}(b)=1/f^{\prime}(a)$, which can also be written as

\[
(f^{-1})^{\prime}(b)=\frac{1}{f^{\prime}(f^{-1}(b))}. \tag{5.16}
\]

Our goal is to make this intuitive argument precise.

Figure 5.9: The inverse function theorem. For $b=f(a)$, the tangent line to the graph of $f^{-1}$ at $(b,f^{-1}(b))$ is the reflection across the diagonal of the tangent line to $f$ at $(a,f(a))$.

If we know $f^{-1}$ is differentiable at $b$, then we can give a precise proof of (5.16) using the chain rule. Indeed, we can use the chain rule to differentiate both sides of the equation $f\circ f^{-1}(y)=y$ at the point $y=b$ to deduce that $(f^{-1})^{\prime}(b)f^{\prime}(f^{-1}(b))=1$. This then rearranges to give (5.16).
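
The chain rule identity can be checked numerically. The following is a minimal sketch, not part of the argument: the function $f(x)=x^{3}+x$, the bisection-based inverse and the helper names are all assumptions chosen for illustration, and a finite-difference check is of course no substitute for a proof.

```python
def f(x):
    # Hypothetical example: a strictly increasing bijection from R to R.
    return x**3 + x

def f_prime(x):
    # Derivative of the example function, computed by hand.
    return 3 * x**2 + 1

def f_inverse(y, lo=-1e3, hi=1e3, iterations=200):
    # Invert f by bisection; this works because f is strictly increasing.
    # The bracket [lo, hi] is an assumption suitable for moderate values of y.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def central_difference(g, y, k=1e-5):
    # Symmetric difference quotient approximating g'(y).
    return (g(y + k) - g(y - k)) / (2 * k)

a = 1.0
b = f(a)
inverse_slope = central_difference(f_inverse, b)  # approximates (f^{-1})'(b)
product = inverse_slope * f_prime(f_inverse(b))   # should be close to 1
print(inverse_slope, 1 / f_prime(a), product)
```

Here the approximate slope of $f^{-1}$ at $b$ should agree with $1/f^{\prime}(a)$ up to the discretisation error of the difference quotient.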

The problem with the chain rule argument is that it requires us to know $f^{-1}$ is differentiable at $b$ in the first place. The differential inverse function theorem addresses this shortcoming.

Theorem 5.30 (Differential inverse function theorem).

Let $I$, $J\subseteq\mathbb{R}$ be open intervals and $f\colon I\to J$ be bijective. If $b\in J$ and $f$ is differentiable at $f^{-1}(b)$ with $f^{\prime}(f^{-1}(b))\neq 0$, then $f^{-1}$ is differentiable at $b$ and

\[
(f^{-1})^{\prime}(b)=\frac{1}{f^{\prime}(f^{-1}(b))}.
\]
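
The hypothesis $f^{\prime}(f^{-1}(b))\neq 0$ cannot simply be dropped. As a hedged illustration (the example $f(x)=x^{3}$ and the helper names below are assumptions, not taken from the text), consider the bijection $f(x)=x^{3}$ on $\mathbb{R}$, for which $f^{\prime}(0)=0$. The sketch below evaluates the difference quotients of $f^{-1}(y)=y^{1/3}$ at $b=0$ and shows that they grow as $k\to 0$, consistent with $f^{-1}$ failing to be differentiable there.

```python
import math

def cube_root(y):
    # Inverse of the bijection f(x) = x**3 on the real line.
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def inverse_difference_quotient(k):
    # Difference quotient of f^{-1} at b = 0, where f^{-1}(0) = 0 and f'(0) = 0.
    return (cube_root(0.0 + k) - cube_root(0.0)) / k

# The quotient equals k**(-2/3), so it grows without bound as k shrinks.
quotients = [inverse_difference_quotient(10.0 ** (-n)) for n in (3, 6, 9)]
print(quotients)
```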

The inverse function theorem is a very helpful theoretical and computational tool. It can be used to show that many functions are differentiable and to compute their derivatives. Before giving the proof of Theorem 5.30, let’s see some examples of it in action.

Example 5.31.

Recall from our earlier discussion in Chapter 4 that the exponential function $\exp\colon\mathbb{R}\to(0,\infty)$ is a bijection. (To be precise, so far we have only rigorously proved that $\exp$ is a bijection when considered as a mapping from $[0,\infty)$ to $[1,\infty)$: see Example 4.97. However, we shall extend this in Section 5.7 below.) Thus, there exists an inverse function $\log\colon(0,\infty)\to\mathbb{R}$, which satisfies $\log(\exp(x))=x$ for all $x\in\mathbb{R}$ and $\exp(\log(y))=y$ for all $y>0$. Furthermore, we know from Lemma 5.12 that $\exp$ is differentiable with $\exp^{\prime}(x)=\exp(x)>0$ for all $x\in\mathbb{R}$. It therefore follows from the inverse function theorem that $\log$ is also differentiable and

\[
\log^{\prime}(y)=\frac{1}{\exp^{\prime}(\log(y))}=\frac{1}{\exp(\log y)}=\frac{1}{y}\qquad\text{for all $y>0$.}
\]

Exercise 5.32.

Let $n\in\mathbb{N}$ and $f\colon(0,\infty)\to(0,\infty)$ be the function $f(x):=x^{1/n}$. Show that $f$ is differentiable and

\[
f^{\prime}(x)=\frac{x^{-1+1/n}}{n}\qquad\text{for all $x>0$.}
\]
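
Before attempting the exercise, it can be reassuring to test the claimed formula numerically. The sketch below is only a sanity check under assumed sample values of $n$ and $x$ (the helper names are hypothetical); it compares a symmetric difference quotient of $x^{1/n}$ with the stated derivative and does not replace the requested proof.

```python
def nth_root(x, n):
    # f(x) = x^(1/n) for x > 0.
    return x ** (1.0 / n)

def claimed_derivative(x, n):
    # Formula stated in the exercise: x^(-1 + 1/n) / n.
    return x ** (-1.0 + 1.0 / n) / n

def central_difference(g, x, h=1e-6):
    # Symmetric difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

# Largest discrepancy over a few assumed sample values of n and x.
max_error = max(
    abs(central_difference(lambda t: nth_root(t, n), x) - claimed_derivative(x, n))
    for n in (1, 2, 3, 5)
    for x in (0.5, 1.0, 4.0)
)
print(max_error)
```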

We now turn to the proof of the inverse function theorem.

Proof (of Theorem 5.30).

This proof is nonexaminable.

Fix $b\in J$ so that we can write $b=f(a)$ for some $a\in I$. To establish whether $f^{-1}$ is differentiable at $b$, we need to consider the difference quotients

\[
\frac{f^{-1}(b+k)-f^{-1}(b)}{k}=\frac{f^{-1}(b+k)-a}{k}\qquad\text{for $k\neq 0$ with $b+k\in J$.}
\]

Given $k\neq 0$ such that $b+k\in J=f(I)$, we may write $b+k=f(a+h(k))$ for some $h(k)\neq 0$ such that $a+h(k)\in I$. Thus, the difference quotient can be rewritten as

\[
\frac{f^{-1}(b+k)-a}{k}=\frac{a+h(k)-a}{f(a+h(k))-b}=\frac{h(k)}{f(a+h(k))-f(a)}. \tag{5.17}
\]

On the other hand, by applying $f^{-1}$ to both sides of $b+k=f(a+h(k))$, we see that

\[
f^{-1}(b+k)=a+h(k)\qquad\iff\qquad h(k)=f^{-1}(b+k)-f^{-1}(b).
\]

By Theorem 4.112, we know that the inverse function $f^{-1}$ is continuous, and so

\[
\lim_{k\to 0}h(k)=\lim_{k\to 0}\bigl(f^{-1}(b+k)-f^{-1}(b)\bigr)=f^{-1}(b)-f^{-1}(b)=0.
\]

Combining this observation with (5.17) and using the composition law for limits from Theorem 4.55, we have

\[
(f^{-1})^{\prime}(b)=\lim_{k\to 0}\frac{f^{-1}(b+k)-a}{k}=\lim_{h\to 0}\frac{h}{f(a+h)-f(a)}=\frac{1}{f^{\prime}(a)},
\]

where the last step uses the quotient rule for limits of functions from Theorem 4.49, together with the hypothesis $f^{\prime}(a)\neq 0$. Since $a=f^{-1}(b)$, this completes the proof. ∎
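
The quantities appearing in the proof can be tabulated for a concrete case. The following sketch assumes $f=\exp$, $a=0$ and $b=f(a)=1$, so that $f^{-1}=\log$; these choices and the variable names are illustrative assumptions only. It prints $k$, $h(k)$, the difference quotient of $f^{-1}$ at $b$, and the right-hand side of (5.17), which should agree with each other and approach $1/f^{\prime}(a)=1$ as $k\to 0$.

```python
import math

a = 0.0
b = math.exp(a)  # b = f(a) with f = exp (an assumed example)

def h(k):
    # h(k) = f^{-1}(b + k) - f^{-1}(b), with f^{-1} = log.
    return math.log(b + k) - math.log(b)

rows = []
for n in range(1, 7):
    k = 10.0 ** (-n)
    hk = h(k)
    lhs = (math.log(b + k) - a) / k              # difference quotient of f^{-1} at b
    rhs = hk / (math.exp(a + hk) - math.exp(a))  # right-hand side of (5.17)
    rows.append((k, hk, lhs, rhs))
    print(k, hk, lhs, rhs)
```

The second column illustrates the role of continuity of $f^{-1}$: it is what forces $h(k)\to 0$, which in turn allows the limit in $k$ to be replaced by a limit in $h$.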