Reference:
Berry C, Hannenhalli S, Leipzig J, Bushman FD, 2006 Selection of Target Sites for Mobile DNA Integration in the Human Genome. PLoS Comput Biol 2(11): e157. doi:10.1371/journal.pcbi.0020157
Quote: "The data were analyzed using the R language and environment for statistical computing and graphics."
The R code for the plot was adapted from code provided via the Addicted to R graph gallery: http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78
# *------------------------------------------------------------------
# |
# | import scored logit data from SAS - code generated by SAS MACRO %EXPORT_TO_R
# |
# *------------------------------------------------------------------

# set R working directory
setwd("C:\\Documents and Settings\\wkuuser\\Desktop\\PROJECTS\\Stats Training")

# get data
dat.from.SAS <- read.csv("fromSAS_delete.CSV", header = TRUE)

# check data dimensions
dim(dat.from.SAS)
names(dat.from.SAS)

# *------------------------------------------------------------------
# |
# | scatter plot with marginal histograms
# |
# *------------------------------------------------------------------

# model predicts P(G) so we want these probabilities for each group

# get p(G) data set for the group that is actually green
green <- dat.from.SAS[dat.from.SAS$class == "G", ]
dim(green)

# get p(G) data set for the group that is actually red
red <- dat.from.SAS[dat.from.SAS$class == "R", ]
dim(red)

# just look at regular histograms for each group
hist(green$P_G, main = 'histogram for green')
hist(red$P_G, main = 'histogram for red')

# in order to do scatter plots n must be the same for each
# group, so randomly sample n = n(green) from red

# number of red observations needed to match green (24 in this data set)
N <- nrow(green)
print(N)

# randomly arrange ALL red rows, then keep the first N as the red sample
# (sample(1:N) would only shuffle the first N rows, not sample from red)
dat <- red[sample(nrow(red)), ]
red.rs <- dat[1:N, ]
dim(red.rs)

# does the distribution retain original properties? Yes
hist(red.rs$P_G, main = 'histogram for red sample')

plot(green$P_G, red.rs$P_G)

# *------------------------------------------------------------------
# |
# | create the marginal plots
# |
# *------------------------------------------------------------------

def.par <- par(no.readonly = TRUE) # save default, for resetting...

# define histograms
Ghist <- hist(green$P_G, plot = FALSE)
Rhist <- hist(red.rs$P_G, plot = FALSE)
top <- max(c(Ghist$counts, Rhist$counts))
Grange <- c(0, 1)
Rrange <- c(0, 1)

# 2 x 2 layout: scatter plot bottom-left, green histogram on top, red on the right
nf <- layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE), c(3, 1), c(1, 3), TRUE)
#layout.show(nf)

# scatter plot
par(mar = c(3, 3, 1, 1))
plot(green$P_G, red.rs$P_G, xlim = Grange, ylim = Rrange, xlab = "green", ylab = "red")

# marginal histogram for green (top)
par(mar = c(0, 3, 1, 1))
barplot(Ghist$counts, axes = FALSE, ylim = c(0, top), space = 0, main = 'green')

# marginal histogram for red (right, drawn horizontally)
par(mar = c(3, 0, 1, 1))
barplot(Rhist$counts, axes = FALSE, xlim = c(0, top), space = 0, horiz = TRUE, main = 'red')

par(def.par) # reset graphics parameters
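The code above depends on a CSV file exported from SAS, so it cannot be run as posted. The following is a minimal sketch of the same idea on simulated data. The data frame `sim`, the Beta-distributed probabilities, the group sizes, and the shared break points `brk` are all assumptions made for illustration and are not part of the original analysis.

```r
# Simulated stand-in for the SAS export (hypothetical data, not the original file)
set.seed(42)
sim <- data.frame(
  class = rep(c("G", "R"), times = c(24, 60)),
  P_G   = c(rbeta(24, 5, 2), rbeta(60, 2, 5))
)

green <- sim[sim$class == "G", ]
red   <- sim[sim$class == "R", ]

# Draw as many red rows as there are green rows, sampling from ALL red rows
n      <- nrow(green)
red.rs <- red[sample(nrow(red), n), ]

# Shared breaks (an assumption) so both histograms use the same 0-1 bins
brk   <- seq(0, 1, by = 0.1)
Ghist <- hist(green$P_G, breaks = brk, plot = FALSE)
Rhist <- hist(red.rs$P_G, breaks = brk, plot = FALSE)
top   <- max(c(Ghist$counts, Rhist$counts))

def.par <- par(no.readonly = TRUE)  # save settings so they can be restored
layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE),
       widths = c(3, 1), heights = c(1, 3), respect = TRUE)

# Panel 1: scatter plot (bottom-left)
par(mar = c(3, 3, 1, 1))
plot(green$P_G, red.rs$P_G, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "green", ylab = "red")

# Panel 2: histogram of the x variable (top)
par(mar = c(0, 3, 1, 1))
barplot(Ghist$counts, axes = FALSE, ylim = c(0, top), space = 0)

# Panel 3: histogram of the y variable (right, horizontal)
par(mar = c(3, 0, 1, 1))
barplot(Rhist$counts, axes = FALSE, xlim = c(0, top), space = 0, horiz = TRUE)

par(def.par)
```

Fixing the histogram breaks to the same ten bins on [0, 1] is a design choice in this sketch: it keeps the marginal bars aligned with the scatter plot axes, which the default break selection in `hist()` does not guarantee.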
Matt,
Did you know that there is also a graph gallery for the SG procedures in SAS?
http://support.sas.com/sassamples/graphgallery/index.html
The graph you want is in the PROC SGRENDER gallery (Sample 35172) and includes a link that takes you to http://support.sas.com/kb/35/172.html
If you cut and paste the two calls to PROC TEMPLATE, then the following statements give a scatter plot with marginal histograms for some fake data:
data a (drop=i); /* fake data */
do i = 1 to 20;
x=rannor(1); y = rannor(1); output;
end;
run;
ods graphics;
proc sgrender data=a template=scatterhist;
dynamic YVAR="X" XVAR="Y";
run;
Actually, I don't like that the SAS Sample uses transparency for the scatter plot.
Set datatransparency=0 on the SCATTERPLOT statement in order to get the usual scatter plot.
Cheers,
Rick
See also "How to create a scatter plot with marginal histograms in SAS" http://bit.ly/jQVow4