Tuesday, December 29, 2015

Nonparametric Approaches to Multiple Comparisons

I have recently started reading "Applied Nonparametric Econometrics", and was thinking, when was the last time I even worked with basic non-parametric statistics?

For instance, I don't cover this in the courses I teach, but some of the texts I reference cover basics like the Mann-Whitney-Wilcoxon (MWW) test (which can be thought of as a non-parametric equivalent to a two-sample independent t-test) or the Kruskal-Wallis test (a non-parametric analogue to analysis of variance). These tests are often useful in situations that involve highly skewed, non-normal, or ordered categorical or ranked data, or data from problematic or unknown distributions. I briefly reviewed some implementations in SAS, focusing on the Kruskal-Wallis test, which has the following general hypotheses:

Ho: All Populations Are Equal
Ha: Not All Populations Are Equal
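
For readers working in R rather than SAS, the omnibus test can be sketched with base R's kruskal.test. This is only an illustration, not the SAS code I reviewed. The data are the BMI values from the Elliott and Hynan example used later in this post, and the hand calculation assumes there are no tied values (true for these data).

```r
# BMI by group, taken from the example at www.alanelliott.com/kw
race <- factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3))
bmi  <- c(32,30.1,27.6,26.2,28.2,26.4,23.1,23.5,24.6,24.3,24.9,25.3,23.8,22.1,23.4)

# Omnibus Kruskal-Wallis test from base R (stats package)
kw <- kruskal.test(bmi ~ race)
print(kw)

# The same statistic by hand (valid as written only when there are no ties)
r  <- rank(bmi)                 # ranks over the pooled sample
N  <- length(bmi)
Ri <- tapply(r, race, sum)      # rank sum per group
ni <- tapply(r, race, length)   # group sizes
H  <- 12 / (N * (N + 1)) * sum(Ri^2 / ni) - 3 * (N + 1)
p  <- pchisq(H, df = nlevels(race) - 1, lower.tail = FALSE)
c(H = H, p = p)
```

With these data the statistic works out to 8.72 on 2 degrees of freedom (p of about 0.013), so Ho would be rejected at the 5% level.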

If we reject Ho, we might conclude that there is a difference among populations, with one population or another providing a larger proportion of larger or smaller values for the variable of interest. If we could assume that the populations were of similar shape and symmetry, this *might* be interpreted as a test of differences in medians, but in general it is a test of differences in distributions, and specifically in ranks, similar to the MWW test. But if we do reject Ho, what next? In an analysis of variance context, if we reject the overall F-test on multiple means, we can follow up with pairwise comparisons to determine which means differ. At least in older versions of SAS, however, there is no straightforward way to do this kind of analysis in the non-parametric context. SAS Note 22620 recommends one workaround: rank-transform the data and use the normal-theory methods in PROC GLM (Iman, 1982). See also Conover, W. J. & Iman, R. L. (1981), referenced below.
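
A rough R analogue of the rank-transform idea, for illustration only: rank the pooled response, fit an ordinary one-way ANOVA to the ranks, then apply a standard pairwise procedure. This is not the PROC GLM code from the SAS note. The choice of Tukey's HSD as the follow-up is an assumption made here for the sketch, and the data are the BMI values from the Elliott and Hynan example used later in this post.

```r
# BMI by group, taken from the example at www.alanelliott.com/kw
race <- factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3))
bmi  <- c(32,30.1,27.6,26.2,28.2,26.4,23.1,23.5,24.6,24.3,24.9,25.3,23.8,22.1,23.4)

r   <- rank(bmi)        # rank transform over the pooled sample
fit <- aov(r ~ race)    # normal-theory one-way ANOVA applied to the ranks
summary(fit)            # overall F-test on the ranks

# Pairwise comparisons of mean ranks (Tukey's HSD is an illustrative choice)
TukeyHSD(fit)
```

The pairwise output is read the same way as after an ordinary ANOVA, except that the differences are in mean ranks rather than in the original units.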

A good example of the application of GLM on ranked data can be found here: http://people.stat.sc.edu/Hitchcock/soil_KW_sasexample705.txt 

and a general overview of some non-parametric applications in SAS along these lines here.

You can also find a SAS macro with code and examples for post hoc tests here: http://www.alanelliott.com/kw/

At first I thought this was the macro by Juneau (in the references below and mentioned in the SAS note above), but it is something different; see the Elliott and Hynan reference below. From the abstract:

"The Kruskal-Wallis (KW) nonparametric analysis of variance is often used instead of a standard one-way ANOVA when data are from a suspected non-normal population. The KW omnibus procedure tests for some differences between groups, but provides no specific post hoc pair wise comparisons. This paper provides a SAS(®) macro implementation of a multiple comparison test based on significant Kruskal-Wallis results from the SAS NPAR1WAY procedure. The implementation is designed for up to 20 groups at a user-specified alpha significance level. A Monte-Carlo simulation compared this nonparametric procedure to commonly used parametric multiple comparison tests."

I found an application referencing this implementation here if interested.

According to the SAS note referenced above, SAS/STAT 12.1 will include some versions of some non-parametric post hoc tests. I'm also aware that there are several R packages that can do this as well, such as the dunn.test package.

I compared results from Elliott and Hynan's example code (example 1) and data to those from the ad hoc GLM on ranks following Hitchcock and got similar results. I also got similar results using dunn.test in R:

# use the same data as the example at www.alanelliott.com/kw

race <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
bmi <- c(32,30.1,27.6,26.2,28.2,26.4,23.1,23.5,24.6,24.3,24.9,25.3,23.8,22.1,23.4)

library(dunn.test) # load package

# Kruskal-Wallis test plus Dunn's pairwise comparisons, Bonferroni-adjusted for multiple comparisons
dunn.test(bmi, race, kw = TRUE, method = "bonferroni")
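
To see roughly what is being computed, Dunn's z statistics can be reproduced in base R. This is a sketch that assumes no tied values (true for these data), so the tie correction is omitted. The p-values below are two-sided with a Bonferroni adjustment via p.adjust; the convention used in the package's printed output may differ, so the z statistics are the safer thing to compare.

```r
# BMI by group, taken from the example at www.alanelliott.com/kw
race <- factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3))
bmi  <- c(32,30.1,27.6,26.2,28.2,26.4,23.1,23.5,24.6,24.3,24.9,25.3,23.8,22.1,23.4)

r    <- rank(bmi)                       # ranks over the pooled sample
N    <- length(bmi)
Rbar <- tapply(r, race, mean)           # mean rank per group
n    <- tapply(r, race, length)         # group sizes

pairs <- combn(levels(race), 2)         # all pairwise group combinations
z <- apply(pairs, 2, function(g) {
  diff_rank <- unname(Rbar[g[1]] - Rbar[g[2]])
  se <- sqrt(N * (N + 1) / 12 * unname(1 / n[g[1]] + 1 / n[g[2]]))
  diff_rank / se                        # Dunn's z, no tie correction
})
names(z) <- apply(pairs, 2, paste, collapse = " vs ")

p_raw <- 2 * pnorm(abs(z), lower.tail = FALSE)    # two-sided p-values
p_adj <- p.adjust(p_raw, method = "bonferroni")   # Bonferroni adjustment
round(cbind(z = z, p_adj = p_adj), 4)
```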
 
References:

Palomares-Rius JE, Castillo P, Montes-Borrego M, Navas-Cortés JA, Landa BB (2015) Soil Properties and Olive Cultivar Determine the Structure and Diversity of Plant-Parasitic Nematode Communities Infesting Olive Orchards Soils in Southern Spain. PLoS ONE 10(1): e0116890. doi:10.1371/journal.pone.0116890

Dunn, O. J. (1964). "Multiple comparisons using rank sums". Technometrics 6: 241-252.

Conover, W. J. & Iman, R. L. (1981). "Rank transformations as a bridge between parametric and nonparametric statistics". American Statistician 35 (3): 124–129. doi:10.2307/2683975

Elliott AC, Hynan LS. “A SAS Macro implementation of a Multiple Comparison post hoc test for a Kruskal-Wallis analysis,” Comp Meth Prog Bio, 102:75-80, 2011

Iman, R.L. (1982), "Some Aspects of the Rank Transform in Analysis of Variance Problems," Proceedings of the Seventh Annual SAS Users Group International Conference, 7, 676-680.

Juneau, P. (2004), "Simultaneous Nonparametric Inference in a One-Way Layout Using the SAS System," Proceedings of the PharmaSUG 2004 Annual Conference, Paper SP04.
