Econometric Sense: Student's t, Normality, & the Slutsky Theorems

Saturday, March 5, 2011

Student's t, Normality, & the Slutsky Theorems

Often in an analysis we use s² in place of σ² (the true population variance is usually unknown), yet we still want to construct confidence intervals or, equivalently, conduct hypothesis tests. The t distribution is the distribution of the ratio of a standard normal variable to the square root of an independent chi-square variable divided by its degrees of freedom (DeGroot, 2002). If our data are exactly normally distributed, we can rely on the t-table for constructing confidence intervals. These are exact results: the t-ratio follows the t distribution exactly when the underlying data are exactly normal.

But what if we don’t know the distribution of the data we are working with, or don’t feel comfortable assuming normality? We still have to estimate σ² with s². Of course, if n is large enough, the t-table and the standard normal table give similar results. But Goldberger offers the following observation:

“There is no good reason to rely routinely on a t-table rather than a normal table unless Y itself is normally distributed” (Goldberger, 1991).

So, how do you justify this? In this case there are some powerful theorems regarding the asymptotic properties of sample statistics, known as the Slutsky Theorems, which are outlined in Goldberger (1991). The following sequence of steps using these theorems is based on Goldberger and my lecture notes from ECO 603 Research Methods for Economics, which was actually a mathematical statistics course taught by Dr. Christopher Bollinger at the University of Kentucky. Any errors or mistakes are completely my own.

Given that Θ^ is an estimator of the population parameter Θ, that →^p denotes convergence in probability, and that →^d denotes convergence in distribution:

S1: If Θ^ →^p Θ, then for any continuous function h, h(Θ^) →^p h(Θ).

S2: If (Θ₁^, Θ₂^) →^p (Θ₁, Θ₂), then for any continuous function h, h(Θ₁^, Θ₂^) →^p h(Θ₁, Θ₂).

S3: If Θ^ →^p Θ and Z_n →^d N(0, 1), then (Θ^ + Z_n) →^d N(Θ, 1).

S4: If Θ^ →^p Θ and Z_n →^d N(0, 1), then Θ^ Z_n →^d N(0, Θ²).

S5: If n^1/2 (Θ^ - Θ) ~^A N(0, Σ²), then for any function h that is continuously differentiable at Θ,

n^1/2 (h(Θ^) - h(Θ)) ~^A N(0, h’(Θ)² Σ²)

(Goldberger, 1991).

So now if we want to use s² to estimate σ² we form the statistic

Z^ = (Xbar - μ) / (s² / n)^1/2 = n^1/2 (Xbar - μ) / s (1)

This looks like the t-statistic, but if we can’t assume normality, the exact results for the t-distribution do not apply. In this case we rely on both the CLT and the Slutsky theorems.

Given the traditional standardized normal variable formulation:

Z = n^1/2 (Xbar - μ) / σ

Algebraic manipulation shows that

(σ/s) n^1/2 (Xbar - μ) / σ = n^1/2 (Xbar - μ) / s

so (σ/s) Z = Z^, where Z^ is defined in (1) above.
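The identity (σ/s) Z = Z^ is easy to check numerically. The snippet below is only a sketch under my own assumptions: the exponential population, the seed, and the sample size are illustrative choices, not part of the argument, and the statistic is taken as n^1/2 (Xbar - μ) / s.

```python
import math
import random
import statistics

random.seed(0)

# Illustrative (assumed) population: exponential with mean 2, so mu = 2 and sigma = 2.
mu, sigma = 2.0, 2.0
n = 50
x = [random.expovariate(1 / mu) for _ in range(n)]

xbar = statistics.fmean(x)
s = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)

z = math.sqrt(n) * (xbar - mu) / sigma   # standardized with the true sigma
z_hat = math.sqrt(n) * (xbar - mu) / s   # standardized with the estimate s

# The two printed values after z should agree up to floating-point rounding.
print(z, z_hat, (sigma / s) * z)
```

In practice σ is unknown, so only z_hat can be computed from data; z appears here purely to verify the algebra.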

By the CLT, Z →^d N(0, 1), provided the observations are a random sample from a population with finite variance σ².

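It can be shown (by the law of large numbers together with S2) that s² →^p σ².

If we define Θ^ as s², so that Θ = σ², then we can view (σ/s) as a function h(Θ^) = σ / (Θ^)^1/2, which is continuous for Θ^ > 0.

Then by S1, h(Θ^) →^p h(Θ), which implies that (σ/s) →^p (σ/σ) = 1.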
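Convergence in probability of (σ/s) to 1 can be illustrated by simulation: the share of samples in which σ/s lands more than a small distance from 1 should shrink as n grows. This is a rough sketch, not a proof. The exponential(1) population, the tolerance of 0.1, the number of replications, and the helper name share_far_from_one are all my own assumptions.

```python
import math
import random

random.seed(1)
sigma = 1.0  # an exponential(1) population has variance 1 (assumed example)

def share_far_from_one(n, reps=500, eps=0.1):
    """Fraction of simulated samples of size n with |sigma/s - 1| > eps."""
    far = 0
    for _ in range(reps):
        x = [random.expovariate(1.0) for _ in range(n)]
        xbar = sum(x) / n
        s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
        if abs(sigma / s - 1.0) > eps:
            far += 1
    return far / reps

shares = {n: share_far_from_one(n) for n in (10, 100, 2000)}
for n, share in shares.items():
    print(n, share)
```

The printed shares should fall toward zero as n increases, which is what s² →^p σ² combined with S1 leads us to expect.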

Now apply S4 with (σ/s) in the role of Θ^ (its probability limit is 1) and Z in the role of Z_n. S4 states that Θ^ Z_n →^d N(0, Θ²), which implies that
(σ/s) Z →^d N(0, (σ/σ)²) = N(0, 1).

And therefore Z^ →^d N(0, 1).

Therefore, by the Central Limit and Slutsky theorems (S1 and S4), one can use the asymptotic properties of the statistic Z^ = n^1/2 (Xbar - μ) / s to form confidence intervals based on the standard normal distribution, using s² to estimate σ² and without assuming that the sample data are normally distributed. All that is required is a random sample from a population with finite variance.
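As a concrete illustration, a large-sample interval of the form Xbar ± z s / n^1/2 might be computed as follows. This is a minimal sketch: the function name normal_ci and the simulated exponential data are hypothetical, chosen only to make the example run.

```python
import math
import random
import statistics
from statistics import NormalDist

def normal_ci(x, level=0.95):
    """Large-sample confidence interval for the mean: xbar +/- z * s / sqrt(n)."""
    n = len(x)
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)                    # sample sd, n - 1 divisor
    z = NormalDist().inv_cdf(0.5 + level / 2)  # standard normal critical value
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

random.seed(3)
# Made-up, clearly non-normal data: 200 draws from an exponential with mean 2.
data = [random.expovariate(0.5) for _ in range(200)]
lo, hi = normal_ci(data)
print(round(lo, 3), round(hi, 3))
```

The critical value comes from the standard normal distribution rather than a t-table, which is exactly what the asymptotic argument justifies when n is large.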

How large does n have to be before asymptotic properties apply? From Kennedy, A Guide to Econometrics 5th Edition:

"How large does the sample size have to be for estimators to display their asymptotic properties? The answer to this crucial question depends on the characteristics of the problem at hand. Goldfeld and Quandt (1972, p.277) report an example in which a sample size of 30 is sufficiently large and an example in which a sample of 200 is required."

An important point to remember: people often say that 'as n becomes large the normal distribution approximates the t-distribution', but in fact, as shown above, as n becomes large it is the statistic Z^ itself whose distribution approaches the normal distribution (again by the CLT and the Slutsky theorems).
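How quickly the normal approximation for Z^ becomes adequate can be explored with a small Monte Carlo experiment that estimates the coverage of the nominal 95% interval at several sample sizes. The two populations, the sample sizes, and the number of replications below are my own illustrative assumptions, and the results are simulation estimates rather than exact values.

```python
import math
import random
from statistics import NormalDist

random.seed(2)
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 95% standard normal critical value

def coverage(draw, mu, n, reps=2000):
    """Share of simulated samples in which |Z^| <= Z_CRIT, i.e. the interval covers mu."""
    hits = 0
    for _ in range(reps):
        x = [draw() for _ in range(n)]
        xbar = sum(x) / n
        s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
        if abs(math.sqrt(n) * (xbar - mu) / s) <= Z_CRIT:
            hits += 1
    return hits / reps

# Assumed example populations: one symmetric, one strongly skewed.
populations = {
    "uniform(0, 1)": (random.random, 0.5),
    "exponential(1)": (lambda: random.expovariate(1.0), 1.0),
}

results = {}
for name, (draw, mu) in populations.items():
    for n in (10, 30, 200):
        results[(name, n)] = coverage(draw, mu, n)
        print(name, n, results[(name, n)])
```

Coverage should move toward 0.95 as n grows, and typically more slowly for the skewed population, which is in the spirit of Kennedy's remark that the answer depends on the problem at hand.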

References:

Kennedy, P. (2003). A Guide to Econometrics, 5th Edition.
Goldberger, A. S. (1991). A Course in Econometrics.
Bollinger, C. (2002). Economics 603 Research Methods and Procedures in Economics, Course Notes. University of Kentucky.
