Augmented Lagrangian and Method of Multipliers (ALM)
Dual ascent requires fairly strong conditions to ensure convergence. This limitation is addressed by the augmented Lagrangian method, also called the method of multipliers.
This method was developed to make dual ascent more robust, and in particular to yield:
- Convergence without assumptions like strict convexity or finiteness of $f$.
Consider the augmented Lagrangian function,

$$L_{\rho}(x, y) = f(x) + y^{T}(Ax - b) + \frac{\rho}{2}\left\| Ax - b \right\|^{2}_{2},$$

where $\rho > 0$ is the penalty parameter.
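In code, the augmented Lagrangian is straightforward to evaluate. The following is a minimal sketch assuming NumPy data and a generic objective `f`; the function name and arguments are illustrative, not from the text:

```python
import numpy as np

# L_rho(x, y) = f(x) + y^T (Ax - b) + (rho/2) ||Ax - b||^2
# f, A, b, rho, x, y are assumed to be given; names are illustrative only.
def augmented_lagrangian(f, A, b, rho, x, y):
    r = A @ x - b                          # primal residual Ax - b
    return f(x) + y @ r + 0.5 * rho * (r @ r)
```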
The augmented Lagrangian can be regarded as the unaugmented Lagrangian associated with the problem

$$\begin{aligned} \text{minimize} \quad & f(x) + \frac{\rho}{2}\left\| Ax - b \right\|^{2}_{2} \\ \text{subject to} \quad & Ax = b. \end{aligned}$$

This problem is clearly equivalent to the original one: for any feasible $x$, the added term $\frac{\rho}{2}\left\| Ax - b \right\|^{2}_{2}$ is zero, so the two objectives coincide on the feasible set.
The associated dual function is

$$g_{\rho}(y) = \inf_{x} L_{\rho}(x, y).$$

Adding the term $\frac{\rho}{2}\left\| Ax - b \right\|^{2}_{2}$ to $f(x)$ makes $g_{\rho}(y)$ differentiable under considerably milder conditions than those required for the original problem.
Applying dual ascent to this modified problem yields the method of multipliers:

$$x^{k+1} = \underset{x}{\operatorname{argmin}} \; L_{\rho}(x, y^{k}),$$
$$y^{k+1} = y^{k} + \rho\left( Ax^{k+1} - b \right).$$

The updates are the same as in dual ascent, except that the $x$-minimization uses the augmented Lagrangian and the penalty parameter $\rho$ is used as the step size.
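As a concrete illustration, here is a minimal sketch of these updates in Python for an equality-constrained quadratic program. The problem data (`P`, `q`, `A`, `b`), the helper name, and the closed-form $x$-update for a quadratic objective are assumptions made for this example, not part of the text:

```python
import numpy as np

def method_of_multipliers(P, q, A, b, rho=1.0, iters=50):
    """Method of multipliers for: minimize (1/2) x^T P x + q^T x  s.t.  Ax = b."""
    x = np.zeros(P.shape[0])
    y = np.zeros(A.shape[0])
    for _ in range(iters):
        # x-update: argmin_x L_rho(x, y); for a quadratic f this is the linear system
        #   (P + rho A^T A) x = -(q + A^T y) + rho A^T b
        x = np.linalg.solve(P + rho * A.T @ A, -(q + A.T @ y) + rho * A.T @ b)
        # y-update: dual ascent step with the penalty parameter rho as the step size
        y = y + rho * (A @ x - b)
    return x, y

# Tiny illustrative instance: minimize ||x||^2 subject to x1 + x2 = 1.
P, q = 2.0 * np.eye(2), np.zeros(2)
A, b = np.array([[1.0, 1.0]]), np.array([1.0])
x_star, y_star = method_of_multipliers(P, q, A, b)
print(x_star)          # approximately [0.5, 0.5]
print(A @ x_star - b)  # primal residual approximately 0
```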
Now here’s a question. Why does ALM have better convergence properties than dual ascent?
Since $x^{k+1}$ minimizes $L_{\rho}(x, y^{k})$ over $x$, we have

$$0 = \nabla_{x} L_{\rho}(x^{k+1}, y^{k}) = \nabla f(x^{k+1}) + A^{T}\left( y^{k} + \rho\left( Ax^{k+1} - b \right) \right) = \nabla f(x^{k+1}) + A^{T} y^{k+1}.$$

This is the stationarity condition for the original primal problem (assuming, for simplicity, that $f$ is differentiable): by using $\rho$ as the step size in the dual update, each iterate $(x^{k+1}, y^{k+1})$ is dual feasible. As the method proceeds, the primal residual $Ax^{k+1} - b$ converges to zero, so the Karush–Kuhn–Tucker (KKT) conditions are satisfied in the limit and $x^{k}$, $y^{k}$ converge to solutions.
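Continuing the same assumed toy problem from the sketch above, one can check numerically that after every $y$-update the stationarity residual $\nabla f(x) + A^{T}y = Px + q + A^{T}y$ is zero up to round-off, while the primal residual $Ax - b$ shrinks toward zero:

```python
import numpy as np

# Same assumed toy problem as above: minimize ||x||^2 subject to x1 + x2 = 1.
P, q = 2.0 * np.eye(2), np.zeros(2)
A, b = np.array([[1.0, 1.0]]), np.array([1.0])
rho = 1.0

x, y = np.zeros(2), np.zeros(1)
for k in range(5):
    x = np.linalg.solve(P + rho * A.T @ A, -(q + A.T @ y) + rho * A.T @ b)
    y = y + rho * (A @ x - b)
    dual_res = np.linalg.norm(P @ x + q + A.T @ y)  # stationarity residual: ~0 at every k
    primal_res = np.linalg.norm(A @ x - b)          # primal residual: decays toward 0
    print(k, dual_res, primal_res)
```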
The advantage of ALM is its much better convergence properties. The disadvantage, however, is the loss of decomposability (illustrated below). There’s always a trade-off.
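To see why decomposability is lost, suppose $f$ is separable, $f(x) = f_{1}(x_{1}) + f_{2}(x_{2})$, and partition $A = \begin{bmatrix} A_{1} & A_{2} \end{bmatrix}$ accordingly. The penalty term

$$\frac{\rho}{2}\left\| A_{1}x_{1} + A_{2}x_{2} - b \right\|^{2}_{2}$$

contains the cross term $\rho\, x_{1}^{T} A_{1}^{T} A_{2}\, x_{2}$, which couples $x_{1}$ and $x_{2}$. So even when $f$ is separable, the augmented Lagrangian is not, and the $x$-minimization in ALM cannot be carried out in parallel for each block the way it can in dual decomposition.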