Nathan Wakefield, Christine Kelley, Marla Williams, Michelle Haver, Lawrence Seminario-Romero, Robert Huben, Aurora Marks, Stephanie Prahl, Based upon Active Calculus by Matthew Boelkins

Section 2.5 The Chain Rule

Motivating Questions

What is a composite function and how do we recognize its structure algebraically?

Given a composite function \(C(x) = f(g(x))\) that is built from differentiable functions \(f\) and \(g\text{,}\) how do we compute \(C'(x)\) in terms of \(f\text{,}\) \(g\text{,}\) \(f'\text{,}\) and \(g'\text{?}\) What is the statement of the Chain Rule?

In addition to learning how to differentiate a variety of basic functions, we have also been developing our ability to use rules to differentiate certain algebraic combinations of them.

Example 2.56

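State the rule(s) used to find the derivative of each of the following combinations of \(f(x) = \sin(x)\) and \(g(x) = x^2\text{:}\)

\begin{equation*}
s(x) = 3x^2 - 5\sin(x), \quad p(x) = x^2\sin(x), \quad q(x) = \frac{\sin(x)}{x^2}\text{.}
\end{equation*}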

Finding \(s'\) uses the sum and constant multiple rules, because \(s(x) = 3g(x) - 5f(x)\text{.}\) Determining \(p'\) requires the product rule, because \(p(x) = g(x) \cdot f(x)\text{.}\) To calculate \(q'\) we use the quotient rule, because \(q(x) =\frac{f(x)}{g(x)}\text{.}\)

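There is one more natural way to combine basic functions algebraically: composing them. For instance, consider the function

\begin{equation*}
C(x) = \sin(x^2)\text{,}
\end{equation*}

and observe that any input \(x\) passes through a chain of functions. In the process that defines the function \(C(x)\text{,}\) \(x\) is first squared, and then the sine of the result is taken. We can represent this using an arrow diagram as follows: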

\begin{equation*}
x \longrightarrow x^2 \longrightarrow \sin(x^2)\text{.}
\end{equation*}

It turns out we can express \(C\) in terms of the elementary functions \(f\) and \(g\) that were used above in Example 2.56. Observe that \(x\) is the input for the function \(g\text{,}\) and the result is then used as the input for \(f\text{.}\) We write

\begin{equation*}
C(x) = f(g(x)) = \sin(x^2)
\end{equation*}

and say that \(C\) is the composition of \(f\) and \(g\text{.}\) We will refer to \(g\text{,}\) the function that is first applied to \(x\text{,}\) as the inner function, and to \(f\text{,}\) the function that is applied to the result, as the outer function.
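The inner/outer structure can also be sketched in a few lines of Python. The helper name `compose` is an assumption made for illustration; the point is that the order of composition matters, so \(f(g(x))\) and \(g(f(x))\) are generally different functions.

```python
import math

def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f = math.sin              # outer function f(x) = sin(x)
g = lambda x: x ** 2      # inner function g(x) = x^2

C = compose(f, g)         # C(x) = f(g(x)) = sin(x^2)
S = compose(g, f)         # reversed order: g(f(x)) = (sin(x))^2

x = 1.5
print(C(x), math.sin(x ** 2))    # both values agree: x is squared, then sine is taken
print(S(x), math.sin(x) ** 2)    # a different function from C
```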

Given a composite function \(C(x) = f(g(x))\) that is built from differentiable functions \(f\) and \(g\text{,}\) how do we compute \(C'(x)\) in terms of \(f\text{,}\) \(g\text{,}\) \(f'\text{,}\) and \(g'\text{?}\) In the same way that the rate of change of a product of two functions, \(p(x) = f(x) \cdot g(x)\text{,}\) depends on the behavior of both \(f\) and \(g\text{,}\) it makes sense intuitively that the rate of change of a composite function \(C(x) = f(g(x))\) will also depend on some combination of \(f\) and \(g\) and their derivatives. The rule that describes how to compute \(C'\) in terms of \(f\) and \(g\) and their derivatives is called the chain rule.

But before we can learn what the chain rule says and why it works, we first need to be comfortable decomposing composite functions so that we can correctly identify the inner and outer functions, as we did in the example above with \(C(x) = \sin(x^2)\text{.}\)

Example 2.57

For each function given below, identify its fundamental algebraic structure. In particular, is the given function a sum, product, quotient, or composition of basic functions? If the function is a composition of basic functions, state a formula for the inner function \(g\) and the outer function \(f\) so that the overall composite function can be written in the form \(f(g(x))\text{.}\) If the function is a sum, product, or quotient of basic functions, use the appropriate rule to determine its derivative.

\(\tan(2^x)\) is the composition of \(\tan(x)\) and \(2^x\text{.}\) Specifically, with \(f(x)=\tan(x)\text{,}\) \(g(x)=2^x\text{,}\) and \(h(x)=\tan(2^x)\text{,}\) we can write \(h(x)=f(g(x))\text{.}\)

\(2^x\tan(x)\) is the product of \(2^x\) and \(\tan(x)\text{.}\) Using the product rule to differentiate \(p(x)=2^x\tan(x)\text{,}\) we end up with

\begin{equation*}
p'(x) = 2^x\ln(2)\tan(x) + 2^x\sec^2(x)\text{.}
\end{equation*}

\((\tan(x))^2\) is the composition of \(x^2\) and \(\tan(x)\text{.}\) In particular, with \(f(x)=x^2\text{,}\) \(g(x)=\tan(x)\text{,}\) and \(r(x)=(\tan(x))^2\text{,}\) we can write \(r(x)=f(g(x))\text{.}\)

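Alternatively, we can recognize \((\tan(x))^2\) as the product of \(\tan(x)\) with itself. Using the product rule to differentiate \(r(x)=(\tan(x))^2\text{,}\) we find

\begin{equation*}
r'(x) = \tan(x)\sec^2(x) + \sec^2(x)\tan(x) = 2\tan(x)\sec^2(x)\text{.}
\end{equation*}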

\(e^{\tan(x)}\) is the composition of \(e^x\) and \(\tan(x)\text{.}\) Specifically, with \(f(x)=e^x\text{,}\) \(g(x)=\tan(x)\text{,}\) and \(m(x)=e^{\tan(x)}\text{,}\) we can write \(m(x)=f(g(x))\text{.}\)

\(\sqrt{x}+\tan(x)\) is the sum of \(\sqrt{x}=x^{\frac{1}{2}}\) and \(\tan(x)\text{.}\) Using the sum rule to find the derivative of \(w(x)=\sqrt{x}+\tan(x)\text{,}\) we find

\begin{equation*}
w'(x) = \frac{1}{2}x^{-\frac{1}{2}} + \sec^2(x)\text{.}
\end{equation*}

\(\sqrt{\tan(x)}\) is the composition of \(\sqrt{x}\) and \(\tan(x)\text{.}\) In particular, with \(f(x)=\sqrt{x}\text{,}\) \(g(x)=\tan(x)\text{,}\) and \(z(x)=\sqrt{\tan(x)}\text{,}\) we can write \(z(x)=f(g(x))\text{.}\)

Subsection The Chain Rule

Often a composite function cannot be written in an alternate algebraic form. For instance, the function \(C(x) = \sin(x^2)\) cannot be expanded or otherwise rewritten, so it presents no alternate approaches to taking the derivative. But some composite functions can be expanded or simplified, and these provide a way to explore how the chain rule works. One example of this was the function \(r(x)=(\tan(x))^2\) in Example 2.57; another example is investigated below in Example 2.58.

Example 2.58

Let \(f(x) = -4x + 7\) and \(g(x) = 3x - 5\text{.}\) Determine a formula for \(C(x) = f(g(x))\) and compute \(C'(x)\text{.}\) How is \(C'\) related to \(f\) and \(g\) and their derivatives?

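By the rules given for \(f\) and \(g\text{,}\)

\begin{equation*}
C(x) = f(g(x)) = f(3x - 5) = -4(3x - 5) + 7 = -12x + 27\text{.}
\end{equation*}

Thus, \(C'(x) = -12\text{.}\) Noting that \(f'(x) = -4\) and \(g'(x) = 3\text{,}\) we observe that \(C'\) appears to be the product of \(f'\) and \(g'\text{.}\)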

It may seem that Example 2.58 is too elementary to illustrate how to differentiate a composite function. Linear functions are the simplest of all functions, and composing linear functions yields another linear function. While this example does not illustrate the full complexity of a composition of nonlinear functions, at the same time we remember that any differentiable function is locally linear, and thus any function with a derivative behaves like a line when viewed up close. The fact that the derivatives of the linear functions \(f\) and \(g\) are multiplied to find the derivative of their composition turns out to be a key insight.
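The slope observation for the linear functions \(f(x) = -4x + 7\) and \(g(x) = 3x - 5\) can be checked with a short Python sketch. The helper `slope` is a name introduced here for illustration; since all three functions are linear, the slope between any two points equals the derivative.

```python
def f(x):
    return -4 * x + 7

def g(x):
    return 3 * x - 5

def C(x):
    # Composite function C(x) = f(g(x))
    return f(g(x))

def slope(func, x0, x1):
    """Slope of the line through (x0, func(x0)) and (x1, func(x1))."""
    return (func(x1) - func(x0)) / (x1 - x0)

slope_f = slope(f, 0, 1)
slope_g = slope(g, 0, 1)
slope_C = slope(C, 0, 1)

# The slope of the composition equals the product of the two slopes.
print(slope_f, slope_g, slope_C, slope_f * slope_g)
```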

We now consider a composition involving a nonlinear function.

Example 2.59

Let \(C(x) = \sin(2x)\text{.}\) Use the double angle identity to rewrite \(C\) as a product of basic functions, and use the product rule to find \(C'\text{.}\) Rewrite \(C'\) in the simplest form possible.
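One way to carry out this computation, assuming the double angle identities \(\sin(2x) = 2\sin(x)\cos(x)\) and \(\cos(2x) = \cos^2(x) - \sin^2(x)\text{,}\) is sketched below.

\begin{align*}
C(x) &= 2\sin(x)\cos(x)\text{,}\\
C'(x) &= 2\sin(x)\cdot(-\sin(x)) + \cos(x)\cdot 2\cos(x)\\
&= 2\left(\cos^2(x) - \sin^2(x)\right)\\
&= 2\cos(2x)\text{.}
\end{align*}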

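In Example 2.59, the product rule and the double angle identities lead to \(C'(x) = 2\cos(2x)\text{.}\) If we let \(g(x) = 2x\) and \(f(x) = \sin(x)\text{,}\) we observe that \(C(x) = f(g(x))\text{.}\) Note that \(g'(x) = 2\) and \(f'(x) = \cos(x)\text{,}\) so we can view the structure of \(C'(x)\) as

\begin{equation*}
C'(x) = 2\cos(2x) = g'(x)f'(g(x))\text{.}
\end{equation*}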

In this example, as in the example involving linear functions, we see that the derivative of the composite function \(C(x) = f(g(x))\) is found by multiplying the derivatives of \(f\) and \(g\text{,}\) but with \(f'\) evaluated at \(g(x)\text{.}\)

Intuitively, it makes sense that these two quantities are involved in the rate of change of a composite function: if we ask how fast \(C\) is changing at a given \(x\) value, it clearly matters how fast \(g\) is changing at \(x\text{,}\) as well as how fast \(f\) is changing at the value of \(g(x)\text{.}\) It turns out that this structure holds for all differentiable functions, as is stated in the chain rule. (It is important to recognize that we have not proved the chain rule; instead, we have given a reason you might believe the chain rule to be true. A key component of mathematics is verifying one's intuition through formal proof. We will omit the proof of the chain rule, but just like the other differentiation rules, the chain rule can be proved formally using the limit definition of the derivative.)

Chain Rule

If \(g\) is differentiable at \(x\) and \(f\) is differentiable at \(g(x)\text{,}\) then the composite function \(C\) defined by \(C(x) = f(g(x))\) is differentiable at \(x\) and

\begin{equation*}
C'(x) = f'(g(x))g'(x)\text{.}
\end{equation*}

As with the product and quotient rules, it is often helpful to think verbally about what the chain rule says: If \(C\) is a composite function defined by an outer function \(f\) and an inner function \(g\text{,}\) then \(C'\) is given by the derivative of the outer function evaluated at the inner function, times the derivative of the inner function.

It is helpful to clearly identify the inner function \(g\) and outer function \(f\text{,}\) compute their derivatives individually, and then put all of the pieces together by the chain rule.
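As a rough numerical sanity check (not a proof), the following Python sketch compares a central-difference estimate of \(C'(x)\) for \(C(x) = \sin(x^2)\) with the value the chain rule predicts, \(\cos(x^2) \cdot 2x\text{.}\) The helper name `central_difference` and the step size are choices made here for illustration.

```python
import math

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

def C(x):
    # Composite function with inner g(x) = x^2 and outer f(x) = sin(x)
    return math.sin(x ** 2)

def chain_rule_derivative(x):
    # f'(g(x)) * g'(x): the outer derivative cos evaluated at x^2, times 2x
    return math.cos(x ** 2) * 2 * x

for x in (0.5, 1.0, 2.0):
    print(x, central_difference(C, x), chain_rule_derivative(x))
```

The two columns should agree to several decimal places; a small discrepancy is expected because the difference quotient is only an approximation.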

Example 2.60

Use the chain rule to determine the derivative of the function

\begin{equation*}
r(x) = (\tan(x))^2\text{.}
\end{equation*}

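The function \(r\) is composite, with inner function \(g(x) = \tan(x)\) and outer function \(f(x) = x^2\text{.}\) Organizing the key information involving \(f\text{,}\) \(g\text{,}\) and their derivatives, we have

\(f(x) = x^2\text{,}\)

\(g(x) = \tan(x)\text{,}\)

\(f'(x) = 2x\text{,}\)

\(g'(x) = \sec^2(x)\text{,}\)

\(f'(g(x)) = 2\tan(x)\text{.}\)

Applying the chain rule, we find that

\begin{equation*}
r'(x) = f'(g(x))g'(x) = 2\tan(x)\sec^2(x)\text{.}
\end{equation*}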

As a side note, we remark that \(r(x)\) is usually written as \(\tan^2(x)\text{.}\) This is common notation for powers of trigonometric functions: e.g. \(\cos^4(x)\text{,}\) \(\sin^5(x)\text{,}\) and \(\sec^2(x)\) are all composite functions, with the outer function a power function and the inner function a trigonometric one.

Example 2.61

For each function given below, identify an inner function \(g\) and outer function \(f\) to write the function in the form \(f(g(x))\text{.}\) Determine \(f'(x)\text{,}\) \(g'(x)\text{,}\) and \(f'(g(x))\text{,}\) and then apply the chain rule to determine the derivative of the given function.

The chain rule now joins the sum, constant multiple, product, and quotient rules in our collection of techniques for finding the derivative of a function through understanding its algebraic structure and the basic functions that constitute it. It takes practice to get comfortable applying multiple rules to differentiate a single function, but using proper notation and taking a few extra steps will help.

Example 2.62

Find a formula for the derivative of \(h(t) = 3^{t^2 + 2t}\sec^4(t)\text{.}\)

We first observe that \(h\) is the product of two functions: \(h(t) = a(t) \cdot b(t)\text{,}\) where \(a(t) = 3^{t^2 + 2t}\) and \(b(t) = \sec^4(t)\text{.}\) We will need to use the product rule to differentiate \(h\text{.}\) And because \(a\) and \(b\) are composite functions, we will also need the chain rule. We therefore begin by computing \(a'(t)\) and \(b'(t)\text{.}\)

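Writing \(a(t) = f(g(t)) = 3^{t^2 + 2t}\) and finding the derivatives of \(f\) and \(g\) with respect to \(t\text{,}\) we have

\(f(t) = 3^t\text{,}\)

\(g(t) = t^2 + 2t\text{,}\)

\(f'(t) = 3^t\ln(3)\text{,}\)

\(g'(t) = 2t + 2\text{,}\)

\(f'(g(t)) = 3^{t^2 + 2t}\ln(3)\text{.}\)

Thus, by the chain rule,

\begin{equation*}
a'(t) = f'(g(t))g'(t) = 3^{t^2 + 2t}\ln(3)(2t + 2)\text{.}
\end{equation*}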

Turning next to the function \(b\text{,}\) we write \(b(t) = r(s(t)) = \sec^4(t)\) and find the derivatives of \(r\) and \(s\) with respect to \(t\text{.}\)

\(r(t) = t^4\text{,}\)

\(s(t) = \sec(t)\text{,}\)

\(r'(t) = 4t^3\text{,}\)

\(s'(t) = \sec(t)\tan(t)\text{,}\)

\(r'(s(t)) = 4\sec^3(t)\text{.}\)

By the chain rule,

\begin{equation*}
b'(t) = r'(s(t))s'(t) = 4\sec^3(t)\sec(t)\tan(t) = 4\sec^4(t)\tan(t)\text{.}
\end{equation*}

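Now we are finally ready to compute the derivative of the function \(h\text{.}\) Recalling that \(h(t) = 3^{t^2 + 2t}\sec^4(t)\text{,}\) by the product rule we have

\begin{align*}
h'(t) &= 3^{t^2 + 2t}\frac{d}{dt}\left[\sec^4(t)\right] + \sec^4(t)\frac{d}{dt}\left[3^{t^2 + 2t}\right]\\
&= 3^{t^2 + 2t}\cdot 4\sec^4(t)\tan(t) + \sec^4(t)\cdot 3^{t^2 + 2t}\ln(3)(2t + 2)\text{.}
\end{align*}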

The above calculation may seem tedious. However, by breaking the function down into small parts and calculating derivatives of those parts separately, we are able to accurately calculate the derivative of the entire function.
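The same "small parts" strategy can be mirrored in code. The sketch below builds \(h'(t)\) for \(h(t) = 3^{t^2 + 2t}\sec^4(t)\) from the pieces \(a\text{,}\) \(b\text{,}\) \(a'\text{,}\) and \(b'\text{,}\) and compares the result with a central-difference estimate. The helper names and the sample values of \(t\) are assumptions chosen for illustration, and the comparison is a numerical check rather than a proof.

```python
import math

def h(t):
    # h(t) = 3^(t^2 + 2t) * sec^4(t), writing sec(t) as 1 / cos(t)
    return 3 ** (t ** 2 + 2 * t) / math.cos(t) ** 4

def h_prime(t):
    a = 3 ** (t ** 2 + 2 * t)                 # a(t) = 3^(t^2 + 2t)
    b = 1 / math.cos(t) ** 4                  # b(t) = sec^4(t)
    a_prime = a * math.log(3) * (2 * t + 2)   # chain rule applied to a
    b_prime = 4 * b * math.tan(t)             # chain rule applied to b: 4 sec^4(t) tan(t)
    return a * b_prime + b * a_prime          # product rule

def central_difference(func, t, step=1e-6):
    """Approximate func'(t) with a symmetric difference quotient."""
    return (func(t + step) - func(t - step)) / (2 * step)

for t in (-0.4, 0.3, 1.0):
    print(t, h_prime(t), central_difference(h, t))
```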

Example 2.63

Differentiate each of the following functions. State the rule(s) you use, label relevant derivatives appropriately, and be sure to clearly identify your overall answer.

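By the constant multiple rule, \(p'(r) = 4\frac{d}{dr}\left[\sqrt{r^6 + 2e^r}\right]\text{.}\) Using the chain rule to complete the remaining derivative, we see that

\begin{equation*}
p'(r) = 4 \cdot \frac{1}{2}\left(r^6 + 2e^r\right)^{-\frac{1}{2}}\left(6r^5 + 2e^r\right) = \frac{2(6r^5 + 2e^r)}{\sqrt{r^6 + 2e^r}}\text{.}
\end{equation*}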

Here we have the composition of three functions, rather than just two. If we first apply the chain rule to the outermost function (the sine function), we find that

The chain rule now adds substantially to our ability to compute derivatives. Whether we are finding the equation of the tangent line to a curve, the instantaneous velocity of a moving particle, or the instantaneous rate of change of a certain quantity, the chain rule is indispensable if the function under consideration is a composition.

Example 2.64

Use known derivative rules (including the chain rule) as needed to answer each of the following questions.

Find an equation for the tangent line to the curve \(y= \sqrt{e^x + 3}\) at the point where \(x=0\text{.}\)

If \(\displaystyle s(t) = \frac{1}{(t^2+1)^3}\) represents the position function of a particle moving horizontally along an axis at time \(t\) (where \(s\) is measured in inches and \(t\) in seconds), find the particle's instantaneous velocity at \(t=1\text{.}\) Is the particle moving to the left or right at that instant? (You may assume that this axis is like a number line, with left being the negative direction and right being the positive direction.)

Suppose that \(f(x)\) and \(g(x)\) are differentiable functions and that the following information about them is known:

\begin{equation*}
\begin{array}{c|cccc}
x & f(x) & f'(x) & g(x) & g'(x) \\ \hline
-1 & 2 & -5 & -3 & 4 \\
2 & -3 & 4 & -1 & 2
\end{array}
\end{equation*}

Table 2.65 Data for functions \(f\) and \(g\text{.}\)

If \(C(x)\) is a function given by the formula \(f(g(x))\text{,}\) determine \(C'(2)\text{.}\) In addition, if \(D(x)\) is the function \(f(f(x))\text{,}\) find \(D'(-1)\text{.}\)

Let \(f(x) = \sqrt{e^x + 3}\text{.}\) By the chain rule, \(f'(x) = \frac{e^x}{2\sqrt{e^x + 3}}\text{,}\) and thus \(f'(0) = \frac{1}{4}\text{.}\) Note further that \(f(0) = \sqrt{1 + 3} = 2\text{.}\) The tangent line is therefore the line through \((0,2)\) with slope \(\frac{1}{4}\text{,}\) which is

\begin{equation*}
y - 2 = \frac{1}{4}(x-0)\text{.}
\end{equation*}

Observe that \(s(t) = (t^2 + 1)^{-3}\text{,}\) and thus by the chain rule, \(s'(t) = -3(t^2 + 1)^{-4}(2t)\text{.}\) We therefore see that \(s'(1) = -\frac{6}{16} = -\frac{3}{8}\) inches per second, so the particle is moving left at the instant \(t = 1\text{.}\)

Since \(C(x) = f(g(x))\text{,}\) it follows \(C'(x) = f'(g(x))g'(x)\text{.}\) Therefore, \(C'(2) = f'(g(2))g'(2)\text{.}\) From the given table, \(g(2) = -1\text{,}\) so applying this result and using the additional given information,

\begin{equation*}
C'(2) = f'(-1)g'(2) = (-5)(2) = -10\text{.}
\end{equation*}

For \(D(x) = f(f(x))\text{,}\) the chain rule tells us that \(D'(x) = f'(f(x))f'(x)\text{,}\) so \(D'(-1) = f'(f(-1))f'(-1)\text{.}\) Using the given table, it follows that

\begin{equation*}
D'(-1) = f'(2)f'(-1) = (4)(-5) = -20\text{.}
\end{equation*}
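The table-based computations can be sketched as dictionary lookups in Python. The dictionary layout and the key names (`"f"`, `"df"`, `"g"`, `"dg"`) are assumptions made for illustration; the data are the values given for \(f\text{,}\) \(f'\text{,}\) \(g\text{,}\) and \(g'\) at \(x = -1\) and \(x = 2\text{.}\)

```python
# Known values of f, f', g, g' at x = -1 and x = 2
table = {
    -1: {"f": 2, "df": -5, "g": -3, "dg": 4},
    2: {"f": -3, "df": 4, "g": -1, "dg": 2},
}

def C_prime(x):
    """C(x) = f(g(x)), so C'(x) = f'(g(x)) * g'(x)."""
    inner_value = table[x]["g"]              # evaluate the inner function first
    return table[inner_value]["df"] * table[x]["dg"]

def D_prime(x):
    """D(x) = f(f(x)), so D'(x) = f'(f(x)) * f'(x)."""
    inner_value = table[x]["f"]
    return table[inner_value]["df"] * table[x]["df"]

print(C_prime(2))    # f'(-1) * g'(2)
print(D_prime(-1))   # f'(2) * f'(-1)
```

Note that the lookup only works when the inner value is itself a row of the table, which is why the problem asks for \(C'(2)\) and \(D'(-1)\) specifically.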

Subsection The Composite Version of Basic Function Rules

As we gain more experience with differentiation, we will become more comfortable in simply writing down the derivative without taking multiple steps. This is particularly simple when the inner function is linear, since the derivative of a linear function is a constant.

Example 2.66

Use the chain rule to differentiate each of the following composite functions whose inside function is linear:

More generally, an excellent exercise for getting comfortable with the derivative rules is as follows. First write down a list of all the basic functions whose derivatives we know, and list the derivatives. Then write a composite function with the inner function being an unknown function \(u(x)\) and the outer function being a basic function. Finally, write the chain rule for the composite function. The following example illustrates this for two different functions.

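To determine

\begin{equation*}
\frac{d}{dx}\left[\sin(u(x))\right]\text{,}
\end{equation*}

where \(u\) is a differentiable function of \(x\text{,}\) we use the chain rule with the sine function as the outer function. Applying the chain rule, we find that

\begin{equation*}
\frac{d}{dx}\left[\sin(u(x))\right] = \cos(u(x)) \cdot u'(x)\text{.}
\end{equation*}

This rule is analogous to the basic derivative rule that \(\frac{d}{dx}[\sin(x)] = \cos(x)\text{.}\) Similarly, with an exponential function as the outer function, the chain rule gives

\begin{equation*}
\frac{d}{dx}\left[a^{u(x)}\right] = a^{u(x)}\ln(a) \cdot u'(x)\text{.}
\end{equation*}

This rule is analogous to the basic derivative rule that \(\frac{d}{dx}[a^{x}] = a^{x} \ln(a)\text{.}\)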
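The composite versions of the sine and exponential rules can be sketched as small Python helpers that take an inner function \(u\) and its derivative \(u'\) and return the derivative of the composite. The helper names and the sample inner functions (\(u(x) = x^3\) and \(u(x) = \sin(x)\)) are assumptions chosen for illustration, and the comparison against a difference quotient is only a numerical check.

```python
import math

def d_sin_of(u, du):
    """Derivative of sin(u(x)): cos(u(x)) * u'(x)."""
    return lambda x: math.cos(u(x)) * du(x)

def d_power_of(a, u, du):
    """Derivative of a**u(x): a**u(x) * ln(a) * u'(x)."""
    return lambda x: a ** u(x) * math.log(a) * du(x)

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

# k(x) = sin(x^3), with inner function u(x) = x^3
k = lambda x: math.sin(x ** 3)
k_prime = d_sin_of(lambda x: x ** 3, lambda x: 3 * x ** 2)

# h(x) = 2^(sin(x)), with inner function u(x) = sin(x)
h = lambda x: 2 ** math.sin(x)
h_prime = d_power_of(2, math.sin, math.cos)

for x in (0.7, 1.2):
    print(x, k_prime(x), central_difference(k, x))
    print(x, h_prime(x), central_difference(h, x))
```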

Subsection Summary

A composite function is one where the input variable \(x\) first passes through one function, and then the resulting output passes through another. For example, the function \(h(x) = 2^{\sin(x)}\) is composite since \(x \longrightarrow \sin(x) \longrightarrow 2^{\sin(x)}\text{.}\)

Given a composite function \(C(x) = f(g(x))\) where \(f\) and \(g\) are differentiable functions, the chain rule tells us that

\begin{equation*}
C'(x) = f'(g(x))g'(x)\text{.}
\end{equation*}

Let \(u(x)\) be a differentiable function. For each of the following functions, determine the derivative. Each response will involve \(u\) and/or \(u'\text{.}\)

Let functions \(p\) and \(q\) be the piecewise linear functions given by their respective graphs in Figure 2.68. Use the graphs to answer the following questions.

Let \(C(x) = p(q(x))\text{.}\) Determine \(C'(0)\) and \(C'(3)\text{.}\)

Find a value of \(x\) for which \(C'(x)\) does not exist. Explain your thinking.

Let \(Y(x) = q(q(x))\) and \(Z(x) = q(p(x))\text{.}\) Determine \(Y'(-2)\) and \(Z'(0)\text{.}\)

If a spherical tank of radius 4 feet has \(h\) feet of water present in the tank, then the volume of water in the tank is given by the formula

\begin{equation*}
V = \frac{\pi}{3} h^2(12-h)\text{.}
\end{equation*}

(a) At what instantaneous rate is the volume of water in the tank changing with respect to the height of the water at the instant \(h = 1\text{?}\) What are the units on this quantity?

(b) Now suppose that the height of water in the tank is being regulated by an inflow and outflow (e.g., a faucet and a drain) so that the height of the water at time \(t\) is given by the rule \(h(t) = \sin(\pi t) + 1\text{,}\) where \(t\) is measured in hours (and \(h\) is still measured in feet). At what rate is the height of the water changing with respect to time at the instant \(t = 2\text{?}\)

(c) Continuing under the assumptions in (b), at what instantaneous rate is the volume of water in the tank changing with respect to time at the instant \(t = 2\text{?}\)

(d) What are the main differences between the rates found in (a) and (c)? Include a discussion of the relevant units.