\(\newcommand{\R}{\mathbb{R}} \newcommand{\va}{\mathbf{a}} \newcommand{\vb}{\mathbf{b}} \newcommand{\vc}{\mathbf{c}} \newcommand{\vC}{\mathbf{C}} \newcommand{\vd}{\mathbf{d}} \newcommand{\ve}{\mathbf{e}} \newcommand{\vi}{\mathbf{i}} \newcommand{\vj}{\mathbf{j}} \newcommand{\vk}{\mathbf{k}} \newcommand{\vn}{\mathbf{n}} \newcommand{\vm}{\mathbf{m}} \newcommand{\vr}{\mathbf{r}} \newcommand{\vs}{\mathbf{s}} \newcommand{\vu}{\mathbf{u}} \newcommand{\vv}{\mathbf{v}} \newcommand{\vw}{\mathbf{w}} \newcommand{\vx}{\mathbf{x}} \newcommand{\vy}{\mathbf{y}} \newcommand{\vz}{\mathbf{z}} \newcommand{\vzero}{\mathbf{0}} \newcommand{\vF}{\mathbf{F}} \newcommand{\vG}{\mathbf{G}} \newcommand{\vH}{\mathbf{H}} \newcommand{\vR}{\mathbf{R}} \newcommand{\vT}{\mathbf{T}} \newcommand{\vN}{\mathbf{N}} \newcommand{\vL}{\mathbf{L}} \newcommand{\vB}{\mathbf{B}} \newcommand{\vS}{\mathbf{S}} \newcommand{\proj}{\text{proj}} \newcommand{\comp}{\text{comp}} \newcommand{\nin}{} \newcommand{\vecmag}[1]{|#1|} \newcommand{\grad}{\nabla} \DeclareMathOperator{\curl}{curl} \DeclareMathOperator{\divg}{div} \newcommand{\lt}{<} \newcommand{\gt}{>} \newcommand{\amp}{&} \)

Section 2.7 Constrained Optimization: Lagrange Multipliers

Objectives
  • What geometric condition enables us to optimize a function \(f=f(x,y)\) subject to a constraint given by \(g(x,y) = k\text{,}\) where \(k\) is a constant?

  • How can we exploit this geometric condition to find the extreme values of a function subject to a constraint?

We previously considered how to find the extreme values of functions on both unrestricted domains and on closed, bounded domains. Other types of optimization problems involve maximizing or minimizing a quantity subject to an external constraint. In these cases the extreme values frequently won't occur at the points where the gradient is zero, but rather at other points that satisfy an important geometric condition. These problems are often called constrained optimization problems and can be solved with the method of Lagrange Multipliers, which we study in this section.

Exploration 2.7.1

According to U.S. postal regulations, the girth plus the length of a parcel sent by mail may not exceed 108 inches, where by girth we mean the perimeter of the smallest end. Our goal is to find the largest possible volume of a rectangular parcel with a square end that can be sent by mail. (We solved this applied optimization problem in single variable Active Calculus, so it may look familiar. We take a different approach in this section, and this approach allows us to view most applied optimization problems from single variable calculus as constrained optimization problems, as well as providing tools to solve a greater variety of optimization problems.) If we let \(x\) be the length of the side of one square end of the package and \(y\) the length of the package, then we want to maximize the volume \(f(x,y) = x^2y\) of the box subject to the constraint that the girth (\(4x\)) plus the length (\(y\)) is as large as possible, or \(4x+y = 108\text{.}\) The equation \(4x + y = 108\) is thus an external constraint on the variables.

  1. The constraint equation involves the function \(g\) that is given by

    \begin{equation*} g(x,y) = 4x+y. \end{equation*}

    Explain why the constraint is a contour of \(g\text{,}\) and is therefore a two-dimensional curve.

    Figure 2.7.1 Contours of \(f\) and the constraint equation \(g(x,y) = 108\text{.}\)

  2. Figure 2.7.1 shows the graph of the constraint equation \(g(x,y) = 108\) along with a few contours of the volume function \(f\text{.}\) Since our goal is to find the maximum value of \(f\) subject to the constraint \(g(x,y) = 108\text{,}\) we want to find the point on the constraint curve that lies on the contour of \(f\) with the largest possible value.

    1. Points \(A\) and \(B\) in Figure 2.7.1 lie on a contour of \(f\) and on the constraint curve \(g(x,y) = 108\text{.}\) Explain why neither \(A\) nor \(B\) provides a maximum value of \(f\) that satisfies the constraint.

    2. Points \(C\) and \(D\) in Figure 2.7.1 lie on a contour of \(f\) and on the constraint curve \(g(x,y) = 108\text{.}\) Explain why neither \(C\) nor \(D\) provides a maximum value of \(f\) that satisfies the constraint.

    3. Based on your responses to parts i. and ii., draw the contour of \(f\) on which you believe \(f\) will achieve a maximum value subject to the constraint \(g(x,y) = 108\text{.}\) Explain why you drew the contour you did.

  3. Recall that \(g(x,y) = 108\) is a contour of the function \(g\text{,}\) and that the gradient of a function is always orthogonal to its contours. With this in mind, how should \(\nabla f\) and \(\nabla g\) be related at the optimal point? Explain.

Subsection 2.7.1 Constrained Optimization and Lagrange Multipliers

In Preview Activity 2.7.1, we considered an optimization problem where there is an external constraint on the variables, namely that the girth plus the length of the package cannot exceed 108 inches. We saw that we can create a function \(g\) from the constraint, specifically \(g(x,y) = 4x+y\text{.}\) The constraint equation is then just a contour of \(g\text{,}\) \(g(x, y) = c\text{,}\) where \(c\) is a constant (in our case 108). Figure 2.7.2 illustrates that the volume function \(f\) is maximized, subject to the constraint \(g(x, y) = c\text{,}\) when the graph of \(g(x, y) = c\) is tangent to a contour of \(f\text{.}\) Moreover, the value of \(f\) on this contour is the sought maximum value.

Figure 2.7.2 Contours of \(f\) and the constraint contour.

To find this point where the graph of the constraint is tangent to a contour of \(f\text{,}\) recall that \(\nabla f\) is perpendicular to the contours of \(f\) and \(\nabla g\) is perpendicular to the contour of \(g\text{.}\) At such a point, the vectors \(\nabla g\) and \(\nabla f\) are parallel, and thus we need to determine the points where this occurs. Recall that two vectors are parallel if one is a nonzero scalar multiple of the other, so we therefore look for values of a parameter \(\lambda\) that make

\begin{equation} \nabla f = \lambda \nabla g.\label{eq_10_8_Lagrange_ex1}\tag{2.7.1} \end{equation}

The constant \(\lambda\) is called a Lagrange multiplier.

To find the values of \(\lambda\) that satisfy (2.7.1) for the volume function in Preview Activity 2.7.1, we calculate both \(\nabla f\) and \(\nabla g\text{.}\) Observe that

\begin{equation*} \nabla f = 2xy \vi + x^2 \vj \ \ \ \ \text{ and } \ \ \ \ \nabla g = 4\vi + \vj, \end{equation*}

and thus we need a value of \(\lambda\) so that

\begin{equation*} 2xy \vi + x^2 \vj = \lambda(4\vi + \vj). \end{equation*}

Equating components in the most recent equation and incorporating the original constraint, we have three equations

\begin{align} 2xy \amp = \lambda (4) \label{eq_10_8_lag_ex1}\tag{2.7.2}\\ x^2 \amp = \lambda (1) \label{eq_10_8_lag_ex2}\tag{2.7.3}\\ 4x+y \amp = 108 \label{eq_10_8_lag_ex3}\tag{2.7.4} \end{align}

in the three unknowns \(x\text{,}\) \(y\text{,}\) and \(\lambda\text{.}\) First, note that if \(\lambda = 0\text{,}\) then Equation (2.7.3) shows that \(x=0\text{.}\) From this, Equation (2.7.4) tells us that \(y = 108\text{.}\) So the point \((0,108)\) is a point we need to consider. Next, provided that \(\lambda \neq 0\) (from which it follows that \(x \neq 0\) by Equation (2.7.3)), we may divide both sides of Equation (2.7.2) by the corresponding sides of Equation (2.7.3) to eliminate \(\lambda\text{,}\) and thus find that

\begin{align*} \frac{2y}{x} \amp = 4, \ \mbox{so}\\ y \amp = 2x. \end{align*}

Substituting \(y = 2x\) into Equation (2.7.4) gives us

\begin{equation*} 4x+2x = 108 \end{equation*}

or

\begin{equation*} x = 18. \end{equation*}

Thus we have \(y = 2x = 36\) and \(\lambda = x^2 = 324\) as another point to consider. So the points at which the gradients of \(f\) and \(g\) are parallel, and thus at which \(f\) may have a maximum or minimum subject to the constraint, are \((0,108)\) and \((18,36)\text{.}\) By evaluating the function \(f\) at these points, we see that we maximize the volume when the length of the square end of the box is 18 inches and the length is 36 inches, for a maximum volume of \(f(18,36) = 11664\) cubic inches. Since \(f(0,108) = 0\text{,}\) we obtain a minimum value at this point.
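As a numerical sanity check on this result, the following minimal Python sketch scans the constraint line \(4x + y = 108\) for the largest volume and then tests whether the gradients are parallel at \((18,36)\text{.}\) The helper names (`volume`, `grad_f`, `best_on_constraint`) and the grid resolution are our own illustrative choices, not part of the text; a grid scan only approximates the maximizer and does not replace the algebra above.

```python
# Numerical check of the parcel example (plain Python, no libraries).
# All names here are hypothetical helpers chosen for illustration.

def volume(x, y):
    """Volume f(x, y) = x^2 * y of a box with square end x and length y."""
    return x**2 * y

def grad_f(x, y):
    """Gradient of f: (2xy, x^2)."""
    return (2 * x * y, x**2)

GRAD_G = (4.0, 1.0)  # gradient of g(x, y) = 4x + y, constant everywhere

def best_on_constraint(steps=10800):
    """Scan x over [0, 27] with y = 108 - 4x; return the best (x, y, volume)."""
    best = (0.0, 108.0, 0.0)
    for i in range(steps + 1):
        x = 27.0 * i / steps
        y = 108.0 - 4.0 * x
        v = volume(x, y)
        if v > best[2]:
            best = (x, y, v)
    return best

x_best, y_best, v_best = best_on_constraint()

# Two plane vectors are parallel exactly when this 2x2 determinant is zero.
fx, fy = grad_f(18.0, 36.0)
det = fx * GRAD_G[1] - fy * GRAD_G[0]

# If the gradients are parallel, lambda is the common ratio of components.
lam = fy / GRAD_G[1]

print("scan maximizer:", x_best, y_best, "volume:", v_best)
print("determinant at (18, 36):", det, "lambda:", lam)
```

The scan should land at or very near \((18,36)\text{,}\) and the vanishing determinant confirms that \(\nabla f\) and \(\nabla g\) point along the same line there.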

We summarize the process of Lagrange multipliers as follows.

The method of Lagrange multipliers

The general technique for optimizing a function \(f = f(x,y)\) subject to a constraint \(g(x,y)=c\) is to solve the system \(\nabla f = \lambda \nabla g\) and \(g(x,y)=c\) for \(x\text{,}\) \(y\text{,}\) and \(\lambda\text{.}\) We then evaluate the function \(f\) at each point \((x,y)\) that results from a solution to the system in order to find the optimum values of \(f\) subject to the constraint.

Activity 2.7.2

A cylindrical soda can holds about 355 cc of liquid. In this activity, we want to find the dimensions of such a can that will minimize the surface area. For the sake of simplicity, assume the can is a perfect cylinder.

  1. What are the variables in this problem? Based on the context, what restriction(s), if any, are there on these variables?

  2. What quantity do we want to optimize in this problem? What equation describes the constraint? (You need to decide which of these functions plays the role of \(f\) and which plays the role of \(g\) in our discussion of Lagrange multipliers.)

  3. Find \(\lambda\) and the values of your variables that satisfy Equation (2.7.1) in the context of this problem.

  4. Determine the dimensions of the soda can that give the desired solution to this constrained optimization problem.

Example 2.7.3

Find the extreme values of the function \(f(x,y) = xy\) on the ellipse \(\frac{x^2}{8} + \frac{y^2}{2} = 1\text{.}\)

(Note: we could turn this into a Calculus I problem by parametrizing the ellipse, but this will be good practice for more complicated problems where that won't be possible.)

Solution

The constraint equation is

\begin{equation*} g(x,y) = \frac{x^2}{8} + \frac{y^2}{2} = 1, \end{equation*}

so we begin by computing \(\nabla f\) and \(\nabla g\text{.}\)

\begin{align*} \nabla f \amp = \langle y,x \rangle\\ \nabla g \amp = \left\langle \frac{x}{4},y \right\rangle \end{align*}

The equation \(\nabla f = \lambda \nabla g\) becomes \(\langle y,x \rangle = \lambda \langle x/4,y \rangle\text{,}\) which we turn into two equations by looking at the \(x\)- and \(y\)-components separately:

\begin{align*} y \amp = \lambda \frac{x}{4}\\ x \amp = \lambda y \end{align*}

Plugging \(\lambda y\) (from the second equation) in for \(x\) in the first gives

\begin{equation*} y = \lambda^2 \frac{y}{4}, \end{equation*}

which we rearrange to get \(y(1 - \frac{\lambda^2}{4}) = 0\text{.}\) So either \(y = 0\) or \(\lambda = \pm 2\text{.}\)

Case 1: if \(y = 0\) then the second equation gives \(x = \lambda y = 0\text{,}\) but \((0,0)\) is not a point on the ellipse. So we get no solutions from this case.

Case 2: if \(\lambda = \pm 2\) then \(x = \pm 2y\text{,}\) so we can plug that in for \(x\) in the constraint equation (the one that defines the ellipse) to get

\begin{equation*} \frac{(\pm 2y)^2}{8} + \frac{y^2}{2} = 1, \end{equation*}

which we can solve to get \(y = \pm 1\text{.}\) In summary, there are four possible points to consider: \((\pm 2,1)\) and \((\pm 2,-1)\text{.}\) By evaluating \(f\) at each of these four points, we deduce that \((2,1)\) and \((-2,-1)\) maximize the function on the ellipse, and \((-2,1)\) and \((2,-1)\) minimize the function on the ellipse.
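The conclusion of this example can be cross-checked numerically. The sketch below evaluates \(f\) at the four candidate points and, as an independent check, samples \(f\) along the parametrization \(x = \sqrt{8}\cos t\text{,}\) \(y = \sqrt{2}\sin t\) of the ellipse mentioned in the note above. The function names and the sample count are assumptions made for illustration only.

```python
import math

def f(x, y):
    """Objective f(x, y) = xy."""
    return x * y

def on_ellipse(x, y, tol=1e-12):
    """True when (x, y) satisfies x^2/8 + y^2/2 = 1 up to tol."""
    return abs(x**2 / 8 + y**2 / 2 - 1) < tol

# The four candidate points produced by the Lagrange system.
candidates = [(2, 1), (-2, -1), (-2, 1), (2, -1)]
candidate_values = {p: f(*p) for p in candidates}

# Independent check: sample f along a parametrization of the ellipse.
samples = 100000
values = []
for i in range(samples):
    t = 2 * math.pi * i / samples
    values.append(f(math.sqrt(8) * math.cos(t), math.sqrt(2) * math.sin(t)))
scan_max, scan_min = max(values), min(values)

print(candidate_values)
print("sampled max:", scan_max, "sampled min:", scan_min)
```

If the sampled extremes agree with the largest and smallest candidate values, that supports the classification of the four points as maximizers and minimizers.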

The method of Lagrange multipliers also works for functions of more than two variables.

Example 2.7.4

Use the method of Lagrange multipliers to find the dimensions of the least expensive packing crate with a volume of 45 cubic feet when the material for the top costs $2 per square foot, the bottom is $3 per square foot and the sides are $1.50 per square foot.

Solution

There are three variables: the length \(l\text{,}\) width \(w\text{,}\) and height \(h\) of the crate. The optimization is constrained by the requirement that the volume be 45 cubic feet, so we set

\begin{equation} g(l,w,h) = lwh = 45.\label{eq-lagrange1-constraint}\tag{2.7.5} \end{equation}

We are trying to minimize cost, which is a weighted sum of the areas of the six sides of the crate:

\begin{equation*} f(l,w,h) = 2lw + 3lw + 2 \cdot 1.5 \cdot wh + 2 \cdot 1.5 \cdot lh. \end{equation*}

(The first term \(2lw\) measures the cost of the top, the term \(3lw\) measures the cost of the bottom, and the remaining two terms measure the cost of the two \(w \times h\) sides and the two \(l \times h\) sides.)

Next we take gradients:

\begin{align*} \nabla f \amp = \langle f_l, f_w, f_h \rangle = \langle 5w + 3h, 5l + 3h, 3w + 3l \rangle\\ \nabla g \amp = \langle g_l, g_w, g_h \rangle = \langle wh, lh, lw \rangle \end{align*}

Simplifying the equation \(\nabla f = \lambda \nabla g\) gives the following three equations.

\begin{align*} 5w + 3h \amp = \lambda wh \\ 5l + 3h \amp = \lambda lh\\ 3w + 3l \amp = \lambda lw \end{align*}

Next solve this system for \(w\text{,}\) \(l\text{,}\) and \(h\) in terms of \(\lambda\text{.}\) After some tedious algebra, we arrive at the following.

\begin{equation*} w = l = \frac{6}{\lambda}, \quad h = \frac{10}{\lambda} \end{equation*}
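The algebra behind this step can be sketched as follows. This is our own reconstruction of the omitted steps, and it assumes that \(l\text{,}\) \(w\text{,}\) and \(h\) are all positive, which the volume constraint requires of a physical crate. Subtracting the second equation from the first gives

\begin{equation*} 5(w - l) = \lambda h (w - l), \quad \text{so} \quad (w-l)(5 - \lambda h) = 0. \end{equation*}

If \(\lambda h = 5\text{,}\) the first equation reads \(5w + 3h = 5w\text{,}\) which forces \(h = 0\) and contradicts \(lwh = 45\text{.}\) Hence \(w = l\text{,}\) and the third and first equations become

\begin{align*} 6l \amp = \lambda l^2, \ \mbox{so} \ l = \frac{6}{\lambda},\\ 5l + 3h \amp = \lambda l h = 6h, \ \mbox{so} \ h = \frac{5l}{3} = \frac{10}{\lambda}. \end{align*}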

Now we can plug these results into (2.7.5) to obtain an equation that we can solve for \(\lambda\text{.}\)

\begin{equation*} \frac{6}{\lambda} \cdot \frac{6}{\lambda} \cdot \frac{10}{\lambda} = 45 \end{equation*}

Solving for \(\lambda\) gives \(\lambda = \sqrt[3]{360/45} = \sqrt[3]{8} = 2\text{.}\) Eliminating \(\lambda\text{,}\) we finally obtain our optimal dimensions:

\begin{equation*} w = l = 3, \quad h = 5. \end{equation*}
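A short numerical check of these dimensions may be reassuring. The sketch below verifies that \((l,w,h) = (3,3,5)\) with \(\lambda = 2\) satisfies \(\nabla f = \lambda \nabla g\) and the volume constraint, and then compares its cost against a handful of nearby crates with the same volume. The function names and the particular perturbations tried are illustrative assumptions; a finite comparison like this does not prove optimality.

```python
def cost(l, w, h):
    """Cost: top $2/sq ft, bottom $3/sq ft, four sides $1.50/sq ft."""
    return 2 * l * w + 3 * l * w + 2 * 1.5 * w * h + 2 * 1.5 * l * h

def grad_cost(l, w, h):
    """Gradient of the cost function, as computed in the example."""
    return (5 * w + 3 * h, 5 * l + 3 * h, 3 * w + 3 * l)

def grad_volume(l, w, h):
    """Gradient of g(l, w, h) = lwh."""
    return (w * h, l * h, l * w)

l0, w0, h0 = 3.0, 3.0, 5.0
lam = 2.0

# Largest component of grad f - lambda * grad g at the candidate point.
residual = max(
    abs(a - lam * b)
    for a, b in zip(grad_cost(l0, w0, h0), grad_volume(l0, w0, h0))
)
base_cost = cost(l0, w0, h0)

# Perturb l and w, then pick h so the volume stays at 45 cubic feet.
cheaper = []
for dl in (-0.5, -0.1, 0.1, 0.5):
    for dw in (-0.5, -0.1, 0.1, 0.5):
        l, w = l0 + dl, w0 + dw
        h = 45.0 / (l * w)
        if cost(l, w, h) < base_cost:
            cheaper.append((l, w, h))

print("residual:", residual, "volume:", l0 * w0 * h0, "cost:", base_cost)
print("cheaper nearby crates found:", cheaper)
```

An empty list of cheaper crates is consistent with \((3,3,5)\) being the minimizer, with a total cost of \(f(3,3,5) = 135\) dollars.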

The method of Lagrange multipliers also works for functions of three variables. That is, if we have a function \(f = f(x,y,z)\) that we want to optimize subject to a constraint \(g(x,y,z) = k\text{,}\) the optimal point \((x,y,z)\) lies on the level surface \(S\) defined by the constraint \(g(x,y,z) = k\text{.}\) As we did in Preview Activity 2.7.1, we can argue that the optimal value occurs at the level surface \(f(x,y,z) = c\) that is tangent to \(S\text{.}\) Thus, the gradients of \(f\) and \(g\) are parallel at this optimal point. So, just as in the two variable case, we can optimize \(f = f(x,y,z)\) subject to the constraint \(g(x,y,z) = k\) by finding all points \((x,y,z)\) that satisfy \(\nabla f = \lambda \nabla g\) and \(g(x,y,z) = k\text{.}\)

Subsection 2.7.2 Summary

  • The extrema of a function \(f=f(x,y)\) subject to a constraint \(g(x,y) = c\) occur at points for which the contour of \(f\) is tangent to the curve that represents the constraint equation. This occurs when

    \begin{equation*} \nabla f = \lambda \nabla g. \end{equation*}
  • We use the condition \(\nabla f = \lambda \nabla g\) to generate a system of equations, together with the constraint \(g(x,y) = c\text{,}\) that may be solved for \(x\text{,}\) \(y\text{,}\) and \(\lambda\text{.}\) Once we have all the solutions, we evaluate \(f\) at each of the \((x,y)\) points to determine the extrema.

Subsection 2.7.3 Exercises

The Cobb-Douglas production function is used in economics to model production levels based on labor and equipment. Suppose we have a specific Cobb-Douglas function of the form

\begin{equation*} f(x, y) = 50 x^{0.4}y^{0.6}, \end{equation*}

where \(x\) is the dollar amount spent on labor and \(y\) the dollar amount spent on equipment. Use the method of Lagrange multipliers to determine how much should be spent on labor and how much on equipment to maximize productivity if we have a total of 1.5 million dollars to invest in labor and equipment.

Use the method of Lagrange multipliers to find the point on the line \(x-2y=5\) that is closest to the point \((1,3)\text{.}\) To do so, respond to the following prompts.

  1. Write the function \(f=f(x,y)\) that measures the square of the distance from \((x,y)\) to \((1,3)\text{.}\) (The extrema of this function are the same as the extrema of the distance function, but \(f(x,y)\) is simpler to work with.)

  2. What is the constraint \(g(x,y) = c\text{?}\)

  3. Write the equations resulting from \(\nabla f = \lambda \nabla g\) and the constraint. Find all the points \((x,y)\) satisfying these equations.

  4. Test all the points you found to determine the extrema.

Apply the Method of Lagrange Multipliers to solve each of the following constrained optimization problems.

  1. Determine the absolute maximum and absolute minimum values of \(f(x,y) = (x-1)^2 + (y-2)^2\) subject to the constraint that \(x^2 + y^2 = 16\text{.}\)

  2. Determine the points on the sphere \(x^2 + y^2 + z^2 = 4\) that are closest to and farthest from the point \((3,1,-1)\text{.}\) (As in the preceding exercise, you may find it simpler to work with the square of the distance formula, rather than the distance formula itself.)

  3. Find the absolute maximum and minimum of \(f(x,y,z) = x^2 + y^2 + z^2\) subject to the constraint that \((x-3)^2 + (y+2)^2 + (z-5)^2 \le 16\text{.}\) (Hint: here the constraint is a closed, bounded region. Use the boundary of that region for applying Lagrange Multipliers, but don't forget to also test any critical values of the function that lie in the interior of the region.)

In this exercise we consider how to apply the Method of Lagrange Multipliers to optimize functions of three variables subject to two constraints. Suppose we want to optimize \(f = f(x,y,z)\) subject to the constraints \(g(x,y,z) = c\) and \(h(x,y,z) = k\text{.}\) Also suppose that the two level surfaces \(g(x,y,z) = c\) and \(h(x,y,z) = k\) intersect in a curve \(C\text{.}\) The optimum point \(P = (x_0,y_0,z_0)\) will then lie on \(C\text{.}\)

  1. Assume that \(C\) can be represented parametrically by a vector-valued function \(\vr = \vr(t)\text{.}\) Let \(\overrightarrow{OP} = \vr(t_0)\text{.}\) Use the Chain Rule applied to \(f(\vr(t))\text{,}\) \(g(\vr(t))\text{,}\) and \(h(\vr(t))\text{,}\) to explain why

    \begin{align*} \nabla f(x_0,y_0,z_0) \cdot \vr'(t_0) \amp = 0, \\ \nabla g(x_0,y_0,z_0) \cdot \vr'(t_0) \amp = 0, \text{ and } \\ \nabla h(x_0,y_0,z_0) \cdot \vr'(t_0) \amp = 0. \end{align*}

    Explain how this shows that \(\nabla f(x_0,y_0,z_0)\text{,}\) \(\nabla g(x_0,y_0,z_0)\text{,}\) and \(\nabla h(x_0,y_0,z_0)\) are all orthogonal to \(C\) at \(P\text{.}\) This shows that \(\nabla f(x_0,y_0,z_0)\text{,}\) \(\nabla g(x_0,y_0,z_0)\text{,}\) and \(\nabla h(x_0,y_0,z_0)\) all lie in the same plane.

  2. Assuming that \(\nabla g(x_0,y_0,z_0)\) and \(\nabla h(x_0,y_0,z_0)\) are nonzero and not parallel, explain why every point in the plane determined by \(\nabla g(x_0,y_0,z_0)\) and \(\nabla h(x_0,y_0,z_0)\) has the form \(s\nabla g(x_0,y_0,z_0)+t\nabla h(x_0,y_0,z_0)\) for some scalars \(s\) and \(t\text{.}\)

  3. Parts (a.) and (b.) show that there must exist scalars \(\lambda\) and \(\mu\) such that

    \begin{equation*} \nabla f(x_0,y_0,z_0) = \lambda \nabla g(x_0,y_0,z_0)+ \mu \nabla h(x_0,y_0,z_0). \end{equation*}

    So to optimize \(f = f(x,y,z)\) subject to the constraints \(g(x,y,z) = c\) and \(h(x,y,z) = k\) we must solve the system of equations

    \begin{align*} \nabla f(x,y,z) \amp = \lambda \nabla g(x,y,z)+ \mu \nabla h(x,y,z), \\ g(x,y,z) \amp = c, \text{ and } \\ h(x,y,z) \amp = k. \end{align*}

    for \(x\text{,}\) \(y\text{,}\) \(z\text{,}\) \(\lambda\text{,}\) and \(\mu\text{.}\)

    Use this idea to find the maximum and minimum values of \(f(x,y,z) = x+2y\) subject to the constraints \(y^2+z^2=8\) and \(x+y+z = 10\text{.}\)

There is a useful interpretation of the Lagrange multiplier \(\lambda\text{.}\) Assume that we want to optimize a function \(f\) with constraint \(g(x,y)=c\text{.}\) Recall that an optimal solution occurs at a point \((x_0, y_0)\) where \(\nabla f = \lambda \nabla g\text{.}\) As the constraint changes, so does the point at which the optimal solution occurs. So we can think of the optimal point as a function of the parameter \(c\text{,}\) that is \(x_0 = x_0(c)\) and \(y_0=y_0(c)\text{.}\) The optimal value of \(f\) subject to the constraint can then be considered as a function of \(c\) defined by \(f(x_0(c), y_0(c))\text{.}\) The Chain Rule shows that

\begin{equation*} \frac{df}{dc} = \frac{\partial f}{\partial x_0} \frac{dx_0}{dc} + \frac{\partial f}{\partial y_0} \frac{dy_0}{dc}. \end{equation*}
  1. Use the fact that \(\nabla f = \lambda \nabla g\) at \((x_0,y_0)\) to explain why

    \begin{equation*} \frac{df}{dc} = \lambda \frac{dg}{dc}. \end{equation*}
  2. Use the fact that \(g(x,y) = c\) to show that

    \begin{equation*} \frac{df}{dc} = \lambda. \end{equation*}

    Conclude that \(\lambda\) tells us the rate of change of the function \(f\) as the parameter \(c\) increases (or by approximately how much the optimal value of the function \(f\) will change if we increase the value of \(c\) by 1 unit).

  3. Suppose that \(\lambda = 324\) at the point where the package described in Preview Activity 2.7.1 has its maximum volume. Explain in context what the value \(324\) tells us about the package.

  4. Suppose that the maximum value of a function \(f = f(x,y)\) subject to a constraint \(g(x,y) = 100\) is \(236\text{.}\) When using the method of Lagrange multipliers and solving \(\nabla f = \lambda \nabla g\text{,}\) we obtain a value of \(\lambda = 15\) at this maximum. Find an approximation to the maximum value of \(f\) subject to the constraint \(g(x,y) = 98\text{.}\)