"Newton's method, also called the Newton-Raphson method, is a root-finding algorithm that uses the first few terms of the Taylor series of a function f(x) in the vicinity of a suspected root." - http://mathworld.wolfram.com/NewtonsMethod.html
Given that we want to find some root value x* of the function f(x), i.e. a value such that f(x*) = 0, we start with an initial guess x_n and expand around that point with a Taylor series:
f(x_n + Δx) = f(x_n) + f'(x_n)Δx + … = 0, where Δx represents the difference between the actual solution x* and the guess x_n.
Retaining only the first-order terms, we can solve for Δx:
Δx = -f(x_n)/f'(x_n)
If x_n is an initial guess, the next guess can be obtained by:
x_{n+1} = x_n - f(x_n)/f'(x_n)
Note that as we get closer to the true root x*, f(x_n) gets smaller and Δx gets smaller, so the value x_{n+1} from the next iteration changes little from the previous one. This is referred to as convergence to the root value x*, the value that solves f(x*) = 0.
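As a minimal sketch of this iteration, the short Python function below (the function name, tolerance, and the example f(x) = x^2 - 2 are illustrative choices, not from the original post) repeats the update x_{n+1} = x_n - f(x_n)/f'(x_n) until successive guesses barely change:

```python
def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Find a root of f with Newton's method, starting from the guess x0."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)   # x_{n+1} = x_n - f(x_n)/f'(x_n)
        if abs(x_new - x) < tol:       # converged: successive guesses barely change
            return x_new
        x = x_new
    return x

# Example: root of f(x) = x^2 - 2, i.e. sqrt(2) ~ 1.41421356
print(newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))
```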
Numerical Estimation of Maximum Likelihood Estimators
Recall that when we undertake MLE, we typically maximize the log of the likelihood function as follows:
Max Log(L(β)), the 'log likelihood' LL, or equivalently solve the score equations:
u(β) = ∂Log(L(β))/∂β = 0
Generally, these equations aren't solved directly; instead, solutions (the β̂'s) are derived from an iterative procedure like the Newton-Raphson algorithm.
Following the procedure outlined above, let
β_t = initial guess or parameter value at iteration t
β_{t+1} = β_t - [∂u(β_t)/∂β_t]^{-1} u(β_t)
(note that the product in the second term plays the same role as -f(x_n)/f'(x_n) above)
Given that ∂u(β)/∂β is the Hessian matrix 'H' and u(β) = ∂Log(L(β))/∂β is the score 'S',
the update function can be rewritten as:
β_{t+1} = β_t - H^{-1}(β_t) S(β_t)
Step 1: Guess an initial value β_t
Step 2: Evaluate the update function to obtain β_{t+1}
Step 3: Repeat until convergence (differences between successive estimates approach zero)
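To make these steps concrete, here is a minimal Python sketch of Newton-Raphson for a logistic regression MLE (the choice of model, function name, tolerance, and simulated data are illustrative assumptions, not from the original post). For this model the score is S(β) = X'(y - p) and the Hessian is H(β) = -X'WX with W = diag(p(1 - p)):

```python
import numpy as np

def newton_raphson_logistic(X, y, tol=1e-8, max_iter=25):
    """Newton-Raphson MLE for logistic regression coefficients.

    X : (n, p) design matrix (include a column of ones for an intercept)
    y : (n,) vector of 0/1 responses
    """
    beta = np.zeros(X.shape[1])                 # Step 1: initial guess
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # fitted probabilities
        score = X.T @ (y - p)                   # S(beta) = dLL/dbeta
        W = p * (1.0 - p)
        hessian = -(X.T * W) @ X                # H(beta) = -X' W X
        step = np.linalg.solve(hessian, score)  # H^{-1} S
        beta_new = beta - step                  # Step 2: beta_{t+1} = beta_t - H^{-1} S
        if np.max(np.abs(beta_new - beta)) < tol:   # Step 3: check convergence
            return beta_new
        beta = beta_new
    return beta

# Example on simulated data with true coefficients (-0.5, 1.2)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_beta = np.array([-0.5, 1.2])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
print(newton_raphson_logistic(X, y))
```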
Fisher Scoring
Similar to Newton's method above, but replace the Hessian H with its expected value E(H), i.e. use
β_{t+1} = β_t - [E(H(β_t))]^{-1} u(β_t)
in the update, repeating until convergence. Since the Fisher information matrix is I = -E(H), this is the same as β_{t+1} = β_t + I^{-1}(β_t) u(β_t).
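As one illustrative sketch of Fisher scoring (the Cauchy location model, function name, and simulated sample below are my own assumptions, not from the original post), consider estimating the location θ of a standard Cauchy sample: the observed Hessian varies with the data, but its expectation is the constant Fisher information I(θ) = n/2, so the update uses I^{-1} in place of the inverse Hessian:

```python
import numpy as np

def fisher_scoring_cauchy_location(x, theta0=None, tol=1e-10, max_iter=100):
    """Fisher scoring for the location parameter of a standard Cauchy sample.

    The observed Hessian is replaced by its expectation: the Fisher
    information for n standard-Cauchy observations is I(theta) = n / 2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    theta = np.median(x) if theta0 is None else theta0   # robust starting guess
    info = n / 2.0                                        # I(theta) = -E[H(theta)]
    for _ in range(max_iter):
        r = x - theta
        score = np.sum(2.0 * r / (1.0 + r**2))            # u(theta) = dLL/dtheta
        theta_new = theta + score / info                  # theta_{t+1} = theta_t + I^{-1} u
        if abs(theta_new - theta) < tol:
            return theta_new
        theta = theta_new
    return theta

# Example: location estimate for a simulated Cauchy sample centered at 3
rng = np.random.default_rng(1)
sample = 3.0 + rng.standard_cauchy(200)
print(fisher_scoring_cauchy_location(sample))
```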