Non-Linear Programming: A Primer


Deprecation Warning: This post was migrated from a previous GitHub blog, so there may be some formatting issues from that earlier post that I have not addressed.

Summary: Primer on non-linear programming methods

Non-Linear Programming

Disclaimer: This blog post is based on lecture notes from NUS on NLP; it is a good module, so please consider taking it. I am using this blog to condense, link, and elaborate on my own understanding of the topic.

A non-linear program has nothing to do with computer programming and everything to do with optimisation. In a basic non-linear program (NLP) we are trying to solve either

  • A constrained optimisation problem

    Minimise/Maximise $f(x)$

    subject to $g_i(x) = 0$ for $i = 1, 2, 3, \ldots$

    $h_j(x) \le 0$ for $j = 1, 2, 3, \ldots$

  • An unconstrained optimisation problem

    Minimise/Maximise $f(x)$

    subject to $x \in X$

A point $x$ that satisfies the constraints of the optimisation problem is called a feasible point. The set of $x$ that satisfy the constraints of the optimisation problem is called the feasible set. For a minimisation problem, a feasible solution $x^*$ such that $f(x^*) \le f(x)$ for every other feasible solution $x$ is called the optimal solution. Another way to write this is to say

$$x^* = \arg\min_{x \in S} f(x) \tag{1}$$

where $S$ is the feasible set. Likewise for maximisation

$$x^* = \arg\max_{x \in S} f(x) \tag{2}$$

Take note: there are some NLPs where for every $x \in S$ there exists an $x' \in S$ with $f(x') < f(x)$; this is an example of an unbounded minimisation problem, and a similar thing can be said for maximisation. For example, the minimisation of the function $\log(x)$ on the domain $x > 0$ is an unbounded minimisation problem.

Examples of Non-Linear Programs

  • Portfolio Optimisation

    Consider an investor who has a certain amount of money to be invested in a number of different securities (stocks, bonds, etc.) with random returns. For each security $i = 1, \ldots, n$, estimates of its expected return $\mu_i$ and variance $\sigma_i^2$ are given. Furthermore, for any two securities $i$ and $j$, their correlation coefficient $\rho_{ij}$ is also assumed to be known. Let the proportion of the total funds invested in security $i$ be $x_i$. The vector $X = [x_1; \ldots; x_n]$ is called a portfolio vector.

    $$\text{Expected return of } X = E[X] = \sum_i x_i \mu_i = \mu^T X \tag{3}$$

    $$\text{Variance of } X = \mathrm{Var}[X] = \sum_{i,j} \rho_{ij}\sigma_i\sigma_j x_i x_j = X^T Q X \tag{4}$$

    The goal then is to

    Minimise $X^T Q X$

    subject to $E[X] \ge r$, i.e. a rate of return of at least $r$

    $\sum_i x_i = 1$, i.e. we only have a fixed budget to allocate across the different investments

    $x_i \ge 0$, i.e. no investment has a negative allocation

    A side note: the non-negativity criterion $x_i \ge 0$ pops up in a lot of NLP problems, especially ones involving allocation and economics; it is easy to forget that constraints like this exist.

    Remark: The form $x^T Q x$ is called a quadratic form and appears in several places in NLP, not just in portfolio optimisation. One interesting fact about the quadratic form is its clear relation with eigenvalues. Consider

    minimise $x^T Q x$

    subject to $x^T x = 1$

    There is a simple proof for the case where $Q$ is symmetric (which is true for portfolio optimisation, since a covariance matrix is symmetric) using Lagrange multipliers: $L(x, \lambda) = x^T Q x - \lambda(x^T x - 1)$. Hence, $\nabla L(x, \lambda) = \nabla(x^T Q x - \lambda x^T x) = 2Qx - 2\lambda x = 0$. Therefore, $Qx = \lambda x$.

    This shows that the optimisation of the quadratic form can be rewritten as an eigenvalue problem, which begs the next question: in the opposite direction, are eigenvalue problems just optimisation problems in disguise?
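We can sketch this relationship numerically. The matrix `Q` below is a randomly generated symmetric positive semi-definite matrix standing in for a covariance matrix, not data from any real portfolio:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Q = A @ A.T  # symmetric positive semi-definite by construction

# eigh returns eigenvalues in ascending order for symmetric matrices
eigvals, eigvecs = np.linalg.eigh(Q)
v = eigvecs[:, 0]  # unit eigenvector of the smallest eigenvalue

# The minimiser of x^T Q x subject to x^T x = 1 is this eigenvector,
# and the minimum value is the smallest eigenvalue.
min_value = v @ Q @ v

# No random unit vector does better.
for _ in range(1000):
    x = rng.standard_normal(4)
    x /= np.linalg.norm(x)
    assert x @ Q @ x >= min_value - 1e-9
```

The eigenvector of the smallest eigenvalue achieves the minimum of the constrained quadratic form, exactly as the Lagrangian argument predicts.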

A Simple Way of Solving NLP Problems (Graphical Solution)

minimise $f(x) = (x_1 - 4)^2 + (x_2 - 6)^2$

subject to $x_1 \le 4$

$x_2 \le 6$

$3x_1 + 2x_2 \le 18$

$x_1, x_2 \ge 0$

The solution lies on the line $3x_1 + 2x_2 = 18$.

This isn't always the case! For example,

minimise $f(x) = (x_1 - 2)^2 + (x_2 - 2)^2$

subject to $x_1 \le 4$

$x_2 \le 6$

$3x_1 + 2x_2 \le 18$

$x_1, x_2 \ge 0$

Here the unconstrained minimiser $(2, 2)$ already satisfies every constraint, so the solution lies in the interior of the feasible set rather than on the boundary.
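Both examples can be checked with a crude grid search over the feasible region. This is a throwaway sketch for intuition, not how one would solve an NLP in practice:

```python
import numpy as np

def grid_minimise(objective, step=0.01):
    """Brute-force search over the region 0<=x1<=4, 0<=x2<=6, 3x1+2x2<=18."""
    x1, x2 = np.meshgrid(np.arange(0, 4 + step, step),
                         np.arange(0, 6 + step, step))
    values = objective(x1, x2)
    values[3 * x1 + 2 * x2 > 18] = np.inf  # mask infeasible points
    i = np.unravel_index(np.argmin(values), values.shape)
    return x1[i], x2[i]

# first problem: the optimum sits on the line 3x1 + 2x2 = 18
a1, a2 = grid_minimise(lambda x1, x2: (x1 - 4) ** 2 + (x2 - 6) ** 2)

# second problem: the optimum is the interior point (2, 2)
b1, b2 = grid_minimise(lambda x1, x2: (x1 - 2) ** 2 + (x2 - 2) ** 2)
```

For the first objective the best grid point lands (up to grid resolution) on the active constraint line; for the second it sits at $(2, 2)$, strictly inside the feasible set.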

Not all feasible points are equal

We previously discussed optimal solutions; these are examples of global minimizers. Sometimes it is very difficult to find these global minimizers because the search space is too big, and thus simple methods like finding the graphical solution do not work. Hence, we are also interested in local minimizers.

Let $S$ be a subset of $\mathbb{R}^n$. Define $B_\varepsilon(y) = \{x \in \mathbb{R}^n : \|x - y\| < \varepsilon\}$ to be the open ball with center $y$ and radius $\varepsilon$; we call this the neighbourhood of $y$.

  1. A point $x^* \in S$ is said to be a local minimizer of $f$ if there exists an $\varepsilon > 0$ such that $f(x^*) \le f(x)$ for all feasible points $x$ in the neighbourhood $B_\varepsilon(x^*)$.
  2. If $f(x^*) \le f(x)$ for all $x \in S$ then $x^*$ is a global minimizer.

Remark: For a (strict) local or global maximizer, replace the inequality by $f(x^*) \ge f(x)$ or $f(x^*) > f(x)$ as appropriate.

An example of a local minimizer: for the function $f(x) = x^4 - x^3 - 3x^2 + 3$, the point $x \approx -0.906$ is a local minimiser.
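We can sanity-check this with first- and second-order derivative tests. The stationary points of $f(x) = x^4 - x^3 - 3x^2 + 3$ solve $f'(x) = 4x^3 - 3x^2 - 6x = x(4x^2 - 3x - 6) = 0$, and the negative root of the quadratic factor is $(3 - \sqrt{105})/8 \approx -0.906$:

```python
# negative root of 4x^2 - 3x - 6 = 0, a stationary point of f
root = (3 - 105 ** 0.5) / 8

fprime = 4 * root ** 3 - 3 * root ** 2 - 6 * root   # f'(root)
fsecond = 12 * root ** 2 - 6 * root - 6             # f''(root)

# f'(root) vanishes and f''(root) > 0, so root is a local minimiser
```

The second derivative being positive confirms a local (though not global) minimiser at this point.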

Important theorems in NLP

  1. Weierstrass Theorem (Sufficient conditions for the existence of a global optimum of a function)

A continuous function on a nonempty closed and bounded set $S \subseteq \mathbb{R}^n$ (i.e. a compact set) has a global maximum point and a global minimum point in $S$.

Remark: This can be seen as a generalisation of the extreme value theorem of the real number system.

This gives us sufficient conditions to guarantee the existence of a global optimum. Note that nothing we have stated here shows the uniqueness of this global optimum; we will discuss this in greater detail when we talk about convex functions. Also note that these sufficient conditions are about the constraint set, not the function itself!

Let $S \subseteq \mathbb{R}^n$ be a nonempty set. The set $S$ is said to be bounded if there is a positive number $M$ such that $\|x\| \le M$ for all $x \in S$. In one dimension this means the set has an upper and a lower bound, and $M$ can be taken as the maximum of the absolute values of the two bounds. We can hence see that a ball and an interval are both examples of bounded sets, as each has some upper and lower bound.

Another interesting thing to see is that the union of two bounded sets is another bounded set. A similar remark holds for the intersection of bounded sets. Consider the one-dimensional case of intervals to see the intuition.

Let $S \subseteq \mathbb{R}^n$ be a nonempty set. The set $S$ is said to be closed if for every convergent sequence $\{x_n\} \subseteq S$, the limit $\lim_{n \to \infty} x_n \in S$.

Interestingly enough, there is also such a thing as an open set, and the complement of an open set is a closed set. A nonempty subset $S \subseteq \mathbb{R}^n$ is open if for every $x \in S$ there exists $\varepsilon > 0$ such that the open ball $B_\varepsilon(x) \subseteq S$, i.e. for every point in $S$, its neighbourhood must also be part of $S$. This automatically excludes boundary points of the set $S$: imagine a ball centered at the boundary; one half will always be outside the set $S$.
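Tying this back to the earlier $\log(x)$ example: on the non-compact set $x > 0$ the infimum is not attained, while on a compact interval Weierstrass guarantees a minimum. A minimal numerical illustration:

```python
import math

# On the domain x > 0 (not compact), log(x) has no minimum: sampling
# closer and closer to 0 keeps decreasing without bound.
samples = [math.log(10.0 ** -k) for k in range(1, 8)]
assert all(a > b for a, b in zip(samples, samples[1:]))  # strictly decreasing

# On the compact set [0.1, 1], a minimum exists; a grid search locates it
# at the left endpoint x = 0.1.
grid = [0.1 + i * 0.001 for i in range(901)]
best = min(grid, key=math.log)
```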

  2. Taylor's Theorem (Sufficient condition for a global optimum of a function)

Suppose $f : S \to \mathbb{R}$ has continuous second partial derivatives in $S$. Suppose the line segment $[x, y]$ joining $x$ and $y$ is contained in the interior of $S$. Then there exists $w \in [x, y]$ such that

$$f(y) = f(x) + \nabla f(x)^T (y - x) + \frac{1}{2}(y - x)^T H(w)(y - x)$$

Take note: This is not to be confused with a truncated Taylor expansion; this is an exact equality, not an approximation.

It turns out that if the Hessian $H(w)$ is positive semi-definite for every $w$, then we see very clearly that $f(y) \ge f(x) + \nabla f(x)^T (y - x)$, and if $x$ is a stationary point (i.e. $\nabla f(x) = 0$) we immediately see that $f(y) \ge f(x)$ for every $y \in S$.

In other words, a stationary point $x$ of a function whose Hessian is positive semi-definite everywhere is a global minimum.
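As a minimal check of this fact, take $f(x) = x^T Q x$ with $Q$ positive semi-definite: the Hessian $2Q$ is PSD everywhere and $x^* = 0$ is a stationary point, so $0$ should be a global minimiser. A sketch with a randomly generated $Q$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
Q = A @ A.T  # PSD, so the Hessian of f, namely 2Q, is PSD everywhere

f = lambda x: x @ Q @ x
x_star = np.zeros(3)

# the gradient 2Qx vanishes at x* = 0, so x* is a stationary point
grad_at_star = 2 * Q @ x_star

# sampling far and wide never beats f(x*) = 0
best_sampled = min(f(rng.standard_normal(3) * 10) for _ in range(1000))
```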

A word about positive semi-definite matrices

Suppose $v$ is an eigenvector of the positive semi-definite matrix $Q$ with a negative eigenvalue $\lambda < 0$. Then $v^T Q v = \lambda v^T v < 0$. This contradicts the premise that $Q$ is positive semi-definite, and thus $Q$ must have only non-negative eigenvalues. This gives us a necessary condition for positive semi-definite matrices which we can use as a test for them.

If a matrix is positive semi-definite, it must have non-negative eigenvalues.

However, we can go further. Suppose $Q$ is a symmetric matrix; then we can write $Q = PDP^T$ via orthogonal diagonalisation, so for any $x \in \mathbb{R}^n$, $x^T Q x = x^T P D P^T x$. Let $\bar{x} = P^T x$; then $x^T Q x = \bar{x}^T D \bar{x} = \sum_i \lambda_i \bar{x}_i^2 \ge 0$ whenever all the eigenvalues $\lambda_i$ are non-negative. Thus the converse is also true for symmetric matrices.

If a matrix is symmetric and has non-negative eigenvalues, then it is positive semi-definite.

Therefore, for symmetric matrices:

A symmetric matrix $A$ is positive semi-definite if and only if (iff) all its eigenvalues are non-negative.

Corollary: The portfolio optimisation problem is a positive semi-definite programming problem, and some constrained positive semi-definite programs are eigenvalue problems.

The test above is known as the eigenvalue test and can be generalised to negative definite, positive definite, etc. matrices; the key idea is the same.
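A minimal NumPy sketch of the eigenvalue test (the function name `is_psd` and the tolerance are my own choices):

```python
import numpy as np

def is_psd(Q, tol=1e-10):
    """Eigenvalue test: a symmetric matrix is PSD iff no eigenvalue is negative."""
    assert np.allclose(Q, Q.T), "the test as stated applies to symmetric matrices"
    # eigvalsh is specialised for symmetric (Hermitian) matrices
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))
```

For example, $\begin{pmatrix}2&1\\1&2\end{pmatrix}$ has eigenvalues $1$ and $3$ and passes, while $\begin{pmatrix}0&1\\1&0\end{pmatrix}$ has eigenvalues $\pm 1$ and fails.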

Another test is the test of principal minors, where the $k$th leading principal minor is the determinant of the $k \times k$ upper-left submatrix of $A$.

For a symmetric matrix $A$:

  1. $A$ is positive definite iff all the leading principal minors are positive.

  2. $A$ is negative definite iff the leading principal minors alternate in sign, starting negative (i.e. $(-1)^k \Delta_k > 0$ for every $k$).
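The principal-minor test is mechanical enough to sketch directly (the helper names below are my own):

```python
import numpy as np

def leading_principal_minors(A):
    """Determinants of the k x k upper-left submatrices, for k = 1..n."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    # all leading principal minors strictly positive
    return all(m > 0 for m in leading_principal_minors(A))

def is_negative_definite(A):
    # signs alternate starting negative: (-1)^k * minor_k > 0 for every k
    return all((-1) ** k * m > 0
               for k, m in enumerate(leading_principal_minors(A), start=1))
```

For instance, $\mathrm{diag}(2, 3)$ has minors $2, 6$ (positive definite), while $\mathrm{diag}(-2, -3)$ has minors $-2, 6$ (negative definite).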

Techniques to take note of in this chapter

  • Formulating NLP problems
  • Graphical Solution of NLP problems
  • Finding definite matrices
  • Proving whether a set is bounded and closed (especially closedness, which I am still not confident about)
