Re: [R] p-val issue for ranked two-group test

Joshua Wiley Thu, 20 Oct 2011 18:39:17 -0700

On Thu, Oct 20, 2011 at 3:42 PM, peter dalgaard <pda...@gmail.com> wrote:
>
> On Oct 20, 2011, at 23:48 , Joshua Wiley wrote:
>
>> Hi,
>>
>> It looks like you are trying to manually bootstrap.
>
> Nope. It's a manually performed approximate Wilcoxon test. Which is fair 
> enough if the object is to learn something. (Notice, however, that the 
> ExactRankTests package eats this sort of problem for breakfast.)


You are correct, Peter.  Apologies, Laurel, for a hastie read and
reply to your email.

>
> As for the actual code, the problem seems mainly to be unclear thinking. The 
> most glaring problem is that "exptdiff" is never actually calculated, but 
> even if it were, it wouldn't be  what the comment says that it is. If it was, 
> it would logically be zero since the "expected mean rank" is the same in both 
> groups (under the null, but what else could be meant?). More likely the 
> intention is to calculate the _observed_ difference in the mean rank, as is 
> done for pdiff (what does "p" stand for??) inside the loop, but based on 
> both.ranks rather than on x. With clarified thinking, I'd expect things to 
> fall into place rather easily.
>
>> Take a look at:
>>
>> require(boot)
>> ?boot
>>
>> as an added advantage of using boot instead of trying to do it
>> manually, you can easily parallelize.  In fact, if you are using one
>> of the pre-release versions of 2.14.0, the new parallel package is
>> included by default and you do not even have to go venturing out into
>> the wide world of CRAN to look.  That said, there are several aspects
>> of your code that could be readily vectorized.  More specific details
>> supplied if a less homework-like example is provided.
>>
>> Cheers,
>>
>> Josh
>>
>> On Thu, Oct 20, 2011 at 10:17 AM, Laurel Klein Serieys
>> <laurelkl...@ucla.edu> wrote:
>>> Hi-
>>> I'm wondering if anyone can help me with my code.  I'm coming up dry when I
>>> try to get a p-value from the following code.  If I make a histogram of my
>>> resampled distribution, I find the difference between by groups to be
>>> significant.  I've ranked the data since I have outliers in one of my
>>> groups.
>>>
>>> mange= c(35,  60,  81, 158, 89, 130,  90,  38, 119, 137,  52,  30,  27, 115,
>>> 123,  31, 124,  91)
>>>
>>> healthy= c(46, 50, 30, 58, 32, 42, 42, 33, 19, 42, 30, 26, 38, 23, 16, 28,
>>> 42, 42, 33, 35, 51, 31, 39, 40 , 42, 38, 36, 39, 38)
>>>
>>> l.mange<-length(mange)
>>> l.healthy<-length(healthy)
>>>
>>> exptdiff <- mean.mange - mean.healthy #the expected difference between
>>> between the mean of the ranked groups
>>>
>>>
>>> both.chemistry<-c(mange, healthy) #concatenate two vectors into one in
>>> preparation for resampling the data
>>>
>>>
>>> both.ranks<-rank(both.chemistry) #rank combined data in the case that there
>>> are outlying values in the data or the dataset is small
>>>
>>> reps=1000
>>>
>>>
>>> z<-rep(NA,reps) # z will the the simulated storage value for the resampling
>>> efforts
>>>
>>> for(i in 1:reps){ #create the loop
>>>
>>> x<- sample(both.ranks, length(both.ranks),replace=FALSE) #instructions for
>>> how to resample where sample the entire combined data without replacment
>>>
>>> p.mange<-mean(x[(1:l.mange)])  #create a simulate mean value for the
>>> resampled mange values
>>> p.healthy<-mean(x[(l.mange+1):(l.mange+l.healthy)])  #create a simulated
>>> mean value for the resampled healthy values
>>>
>>> pdiff<- p.mange-p.healthy #the simulated difference between groups
>>>
>>> z[i]<- pdiff  #the stored list of simulated differences
>>> }
>>> p=mean(z>=exptdiff)*2 #2-tailed test multiply by two
>>> p
>>>
>>> hist(z, xlab="Resample Values", main="Distribution for Two-Group BUN Test")
>>> confints=quantile(z, c(0.025,0.975))
>>> abline(v=confints, col="blue") #draw a line for each cutoffs
>>> abline(v=exptdiff, col="red")
>>>
>>> Thanks!
>>> L.Serieys
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> Programmer Analyst II, ATS Statistical Consulting Group
>> University of California, Los Angeles
>> https://joshuawiley.com/
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] p-val issue for ranked two-group test

Reply via email to