LagrangeMultipliers
May16,2020
Abstract
WeconsideraspecialcaseofLagrangeMultipliersforconstrainedopti-mization.Theclassquicklysketchedthe“geometric”intuitionforLa-grangemultipliers,andthisnoteconsidersashortalgebraicderivation.
Inordertominimizeormaximizeafunctionwithlinearconstraints,weconsider?ndingthecriticalpoints(whichmaybelocalmaxima,localminima,orsaddlepoints)of
f(x)subjecttoAx=bHeref:Rd→Risaconvex(orconcave)function,x∈Rd,A∈Rn×d,andb∈Rn.To?ndthecriticalpoints,wecannotjustsetthederivativeoftheobjectiveequalto0.1ThetechniqueweconsideristoturntheproblemfromaconstrainedproblemintoanunconstrainedproblemusingtheLagrangian,
L(x,μ)=f(x)+μT(Ax?b)inwhichμ∈Rn
We’llshowthatthecriticalpointsoftheconstrainedfunctionfarecriticalpointsofL(x,μ).
FindingtheSpaceofSolutionsAssumetheconstraintsaresatis?able,thenletx0besuchthatAx0=b.Letrank(A)=r,thenlet{u1,...,uk}beanorthonormalbasisforthenullspaceofAinwhichk=d?r.Noteifk=0,thenx0isuniquelyde?ned.Soweconsiderk>0.Wewritethisbasisasamatrix:
U=[u1,...,uk]∈Rd×k
SinceUisabasis,anysolutionforf(x)canbewrittenasx=x0+Uy.Thiscapturesallthefreeparametersofthesolution.Thus,weconsiderthefunction:
g(y)=f(x0+Uy)inwhichg:Rk→R
Thecriticalpointsofgarecriticalpointsoff.Noticethatgisunconstrained,sowecanusestandardcalculusto?nditscriticalpoints.
?yg(y)=0equivalentlyUT?f(x0+Uy)=0.
1See
theexampleattheendofthisdocument.
1
Tomakesurethetypesareclear:?yg(y)∈Rk,?f(z)∈RdandU∈Rd×k.Inbothcases,0isthe0vectorinRk.
Theaboveconditionsaysthatifyisacriticalpointforg,then?f(x)mustbeorthogonaltoU.However,UformsabasisforthenullspaceofAandtherowspaceisorthogonaltoit.Inparticular,anyelementoftherowspacecanbewrittenz=ATμ∈Rd.Weverifythatzandu=Uyareorthogonalsince:
zTu=μTAu=μT0=0
SincewecandecomposeRdasadirectsumofnull(A)andtherowspaceofA,weknowthatanyvectororthogonaltoUmustbeintherowspace.Wecanrewritethisorthogonalityconditionasfollows:thereissomeμ∈Rn(dependingonx)suchthat
?f(x)+ATμ=0foracertainxsuchthatAx=A(x0+Uy)=Ax0=b.
TheCleverLagrangianWenowobservethatthecriticalpointsoftheLa-grangianare(bydi?erentiatingandsettingto0)
?xL(x,μ)=?f(x)+ATμ=0and?μL(x,μ)=Ax?b=0
The?rstconditionisexactlytheconditionthatxbeacriticalpointinthewaywederiveditabove,andthesecondconditionsaysthattheconstraintbesatis?ed.Thus,ifxisacriticalpoint,thereexistssomeμasabove,and(x,μ)isacriticalpointforL.
GeneralizingtoNonlinearEqualityConstraintsLagrangemultipliersareamuchmoregeneraltechnique.Ifyouwanttohandlenon-linearequalityconstraints,thenyouwillneedalittleextramachinery:theimplicitfunctiontheorem.However,thekeyideaisthatyou?ndthespaceofsolutionsandyouoptimize.Inthatcase,?ndingthecriticalpointsof
f(x)s.t.g(x)=cleadstoL(x,μ)=f(x)+μT(g(x)?c).
Thegradientconditionhereis?f(x)+JTμ=0,whereJistheJacobianmatrixofg.Forthecasewherewehaveasingleconstraint,thegradientconditionreducesto?f(x)=?μ1?g1(x),whichwecanviewassaying,“atacriticalpoint,thegradientofthesurfacemustbeparalleltothegradientofthefunction.”Thisconnectsusbacktothepicturethatwedrewduringlecture.
Example:NeedforconstrainedoptimizationWegiveasimpleexampletoshowthatyoucannotjustsetthederivativesto0.Considerf(x1,x2)=x1
2
andg(x1,x2)=x21+x2andso:
maxf(x)subjecttog(x)=1.
x
2
Thisisjustalinearfunctionaloverthecircle,anditiscompact,sothefunc-tionmustachieveamaximumvalue.Intuitively,wecanseethat(1,0)isthemaximumpossiblevalue(andhenceacriticalpoint).Here,wehave:
????????1x1
?f(x)=and?g(x)=2
0x2Noticethat?f(x)isnotzeroanywhereonthecircle–it’sconstant!Forx∈
{(1,0),(?1,0)},?f(x)=λ?g(x)(takeλ∈{1/2,?1/2},respectively).Ontheotherhand,foranyotherpointonthecirclex2=0,andsothegradientoffandgarenotparallel.Thus,suchpointsarenotcriticalpoints.ExtraResources
Ifyou?ndresourcesyoulike,postthemonPiazza!
3