5.4 The differential inverse function theorem

Suppose $f\colon I\to J$ is bijective, so that we may define the inverse $f^{-1}\colon J\to I$. The graph of $f^{-1}$ is given by the reflection of the graph of $f$ across the diagonal. If $f$ is differentiable at $a\in I$, then from the picture in Figure 5.9 we intuitively expect that $f^{-1}$ should be differentiable at $b:=f(a)\in J$. Indeed, the tangent line to $f^{-1}$ at $b$ should correspond to the reflection of the tangent line to $f$ at $a$. Moreover, reflecting across the diagonal swaps the ‘horizontal change’ and ‘vertical change’ of the lines, so the slope of the reflected line is the reciprocal of the original slope. Thus, we expect $(f^{-1})^{\prime}(b)=1/f^{\prime}(a)$, which can also be written as

(5.16)

(f^{-1})^{\prime}(b)=\frac{1}{f^{\prime}(f^{-1}(b))}.

Our goal is to make this intuitive argument precise.

Figure 5.9: The inverse function theorem. For $b=f(a)$, the tangent line to the graph of $f^{-1}$ at $(b,f^{-1}(b))$ is the reflection across the diagonal of the tangent line to the graph of $f$ at $(a,f(a))$.

If we know $f^{-1}$ is differentiable at $b$ , then we can give a precise proof of (5.16) using the chain rule. Indeed, we can use the chain rule to differentiate both sides of the equation $f\circ f^{-1}(y)=y$ at the point $y=b$ to deduce that $(f^{-1})^{\prime}(b)f^{\prime}(f^{-1}(b))=1$ . This then rearranges to give (5.16).

The problem with the chain rule argument is that it requires us to know $f^{-1}$ is differentiable at $b$ in the first place. The differential inverse function theorem addresses this shortcoming.

Theorem 5.30 (Differential inverse function theorem).

Let $I$ , $J\subseteq\mathbb{R}$ be open intervals and $f\colon I\to J$ be bijective. If $b\in J$ and $f$ is differentiable at $f^{-1}(b)$ with $f^{\prime}(f^{-1}(b))\neq 0$ , then $f^{-1}$ is differentiable at $b$ and

(f^{-1})^{\prime}(b)=\frac{1}{f^{\prime}(f^{-1}(b))}.

The inverse function theorem is a very helpful theoretical and computational tool. It can be used to show that many functions are differentiable and compute their derivatives. Before giving the proof of Theorem 5.30, let’s see some examples of it in action.
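The formula in Theorem 5.30 can be sanity-checked numerically. The following Python sketch is an illustration only and not part of the development in these notes: the function $f(x)=x^{3}+x$, the bisection-based helper `f_inverse`, the bracket $[-10,10]$ and the step size `k` are all choices made for this example. It compares a difference quotient of $f^{-1}$ at $b=f(1)=2$ with the predicted value $1/f^{\prime}(1)=1/4$, and then looks at $g(x)=x^{3}$ at $b=0$, where $g^{\prime}(0)=0$ and so the hypothesis of the theorem fails.

```python
def f(x):
    # Illustrative choice: a strictly increasing bijection from R to R.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Approximate f^{-1}(y) by bisection, assuming f(lo) <= y <= f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


a = 1.0
b = f(a)
k = 1e-6

# Symmetric difference quotient of f^{-1} at b.
difference_quotient = (f_inverse(b + k) - f_inverse(b - k)) / (2 * k)
predicted = 1 / f_prime(a)

print(difference_quotient, predicted)

# Failing hypothesis: g(x) = x^3 has g'(0) = 0, and its inverse is the cube root.
# For k > 0 the difference quotient of the cube root at 0 equals k^(-2/3).
cube_root_quotients = [(step ** (1 / 3)) / step for step in (1e-2, 1e-4, 1e-6)]
print(cube_root_quotients)
```

The first two printed numbers should agree to several decimal places, whereas the cube-root quotients grow as the step shrinks, consistent with the cube root failing to be differentiable at $0$.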

Example 5.31.

Recall from our earlier discussion in Chapter 4 that the exponential function $\exp\colon\mathbb{R}\to(0,\infty)$ is a bijection. (To be precise, so far we have only rigorously proved that $\exp$ is a bijection when considered as a mapping from $[0,\infty)$ to $[1,\infty)$: see Example 4.97. However, we shall extend this in Section 5.7 below.) Thus, there exists an inverse function $\log\colon(0,\infty)\to\mathbb{R}$, which satisfies $\log(\exp(x))=x$ for all $x\in\mathbb{R}$ and $\exp(\log(y))=y$ for all $y>0$. Furthermore, we know from Lemma 5.12 that $\exp$ is differentiable with $\exp^{\prime}(x)=\exp(x)>0$ for all $x\in\mathbb{R}$. It therefore follows from the inverse function theorem that $\log$ is also differentiable and

\log^{\prime}(y)=\frac{1}{\exp^{\prime}(\log(y))}=\frac{1}{\exp(\log(y))}=\frac{1}{y}\qquad\text{for all $y>0$.}
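As a numerical illustration of this formula, the following Python sketch compares symmetric difference quotients of the standard library's `math.log` with $1/y$. The sample points and the step size `k` are arbitrary choices for this example, and the check is no substitute for the argument above.

```python
import math


def log_difference_quotient(y, k=1e-6):
    """Symmetric difference quotient of log at a point y > k."""
    return (math.log(y + k) - math.log(y - k)) / (2 * k)


sample_points = (0.5, 1.0, 3.0)
errors = [abs(log_difference_quotient(y) - 1 / y) for y in sample_points]

for y, err in zip(sample_points, errors):
    print(f"y = {y}: |quotient - 1/y| = {err:.2e}")
```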

Exercise 5.32.

Let $n\in\mathbb{N}$ and $f\colon(0,\infty)\to(0,\infty)$ be the function $f(x):=x^{1/n}$ . Show that $f$ is differentiable and

f^{\prime}(x)=\frac{x^{-1+1/n}}{n}\qquad\text{for all $x>0$. }

We now turn to the proof of the inverse function theorem.

Proof (of Theorem 5.30).

This proof is nonexaminable.

Fix $b\in J$ so that we can write $b=f(a)$ for some $a\in I$ . To establish whether $f^{-1}$ is differentiable at $b$ , we need to consider the difference quotients

\frac{f^{-1}(b+k)-f^{-1}(b)}{k}=\frac{f^{-1}(b+k)-a}{k}\qquad\text{for $k\neq 0$ with $b+k\in J$.}

Given $k\neq 0$ such that $b+k\in J=f(I)$, we may write $b+k=f(a+h(k))$ for some $h(k)$ such that $a+h(k)\in I$. Moreover, $h(k)\neq 0$, since $f(a+h(k))=b+k\neq b=f(a)$. Thus, the difference quotient can be rewritten as

(5.17)

\frac{f^{-1}(b+k)-a}{k}=\frac{a+h(k)-a}{f(a+h(k))-b}=\frac{h(k)}{f(a+h(k))-f(a)}.

On the other hand, by applying $f^{-1}$ to both sides of $b+k=f(a+h(k))$ , we see that

f^{-1}(b+k)=a+h(k)\qquad\iff\qquad h(k)=f^{-1}(b+k)-f^{-1}(b).

By Theorem 4.112, we know that the inverse function $f^{-1}$ is continuous, and so

\lim_{k\to 0}h(k)=\lim_{k\to 0}\bigl(f^{-1}(b+k)-f^{-1}(b)\bigr)=f^{-1}(b)-f^{-1}(b)=0.

Combining this observation with (5.17) and using the composition law for limits from Theorem 4.55, we have

(f^{-1})^{\prime}(b)=\lim_{k\to 0}\frac{f^{-1}(b+k)-a}{k}=\lim_{h\to 0}\frac{h}{f(a+h)-f(a)}=\frac{1}{f^{\prime}(a)},

where the last step uses the quotient rule for limits of functions from Theorem 4.49, which applies because $f^{\prime}(a)\neq 0$. Since $a=f^{-1}(b)$, this completes the proof. ∎
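The change of variables at the heart of the proof can be illustrated numerically. In the following Python sketch, the choice $f=\exp$, $a=0$, $b=1$ is made purely for illustration, and `h_of_k` is a hypothetical helper name for the quantity $h(k)=f^{-1}(b+k)-a$. The sketch prints $h(k)$ together with both sides of (5.17) for shrinking $k$.

```python
import math

f = math.exp          # illustrative choice of f
f_inverse = math.log  # its inverse on (0, infinity)
a = 0.0
b = f(a)              # b = 1


def h_of_k(k):
    """The increment h(k) = f^{-1}(b + k) - a used in the proof."""
    return f_inverse(b + k) - a


rows = []
for k in (1e-1, 1e-2, 1e-3, -1e-3):
    h = h_of_k(k)
    left = (f_inverse(b + k) - a) / k  # difference quotient of f^{-1} at b
    right = h / (f(a + h) - f(a))      # reciprocal difference quotient of f at a
    rows.append((k, h, left, right))
    print(f"k = {k:+.0e}  h(k) = {h:+.6f}  left = {left:.6f}  right = {right:.6f}")
```

In each row the two quotients agree up to rounding, $h(k)$ shrinks with $k$, and both quotients approach $1/f^{\prime}(0)=1$.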