Wednesday, July 27, 2011

Newton's Method

Newton-Raphson Method

(see also Maximum Likelihood Visualization via SAS and R)

"Newton's method, also called the Newton-Raphson method, is a root-finding algorithm that uses the first few terms of the Taylor series of a function f(x) in the vicinity of a suspected root."­ -

Given we want to find some root value x* that optimizes the function f(x) such that f(x*)=0, we start with an initial guess x­­n and expand around that point with a taylor series:

f(xn + Δx ) = f(xn) + f’(xn)Δx + …=0 where Δx represents the difference between the actual solution x* and the guess xn.

Retaining only the first order terms we can solve for Δx :

Δx = -[f(xn)/f’(xn)]

If xn is an initial guess, the next guess can be obtained by:

xn+1 = xn - [f(xn)/f’(xn)]
Note as we get closer to the true root value x*, f(xn) gets smaller and Δx gets smaller, such that the value xn+1 from the next iteration changes little from the previous. This is referred to as convergence to the root value x*, which optimizes f(x).

Numerical Estimation of Maximum Likelihood Estimators

Recall, when we undertake MLE we typically maximize the log of the likelihood function as follows:

Max Log(L(β)) or LL ‘log likelihood’ or solve:

∂Log(L(β))/∂β = 0    =  'score matrix'   = u( β) = 0

Generally, these equations aren’t solved directly, but solutions ( βhat 's) are derived from an iterative procedure like the Newton-Raphson algorithm.

Following the procedure outlined above, let

βt = initial guess or parameter value at iteration t

βt+1 = βt –[ ∂ u(βt) / ∂ βt]-1 [u(βt)]  

(note the product in the 2nd term is the same as –[f(xn)/f’(xn)]

given  [∂ u(β0) / ∂β] = hessian matrix ‘H’ and u( β) = ∂Log(L(β))/∂β = score matrix ‘S’

the update function can be rewritten as:

βt+1 = βt –H -1t) S (βt)

Step 1: Guess initial value βt 
Step 2: Evaluate update function to obtain βt+1
Step 3: Repeat until convergence (differences between estimates approach zero)

Fisher Scoring

Similar to Newton's method above, but replace  H-1 with its expected value E(H-1 ) =  I-1 and use

βhat=  β0 - I-10) *u(β0) in the update, repeating until convergence.

No comments:

Post a Comment