Thank you for your reply,
 
I used this code on my test data, but did not get the same p-values. 
 
I think I know were the difference lies; when the data is split in 4 parts I 
want to compare the two left groups (group 1 and 2) with each other and the two 
right groups (group 3 and 4) with each other. It seems that with this code 
group 1 and 3 are compared with each other and group 2 and 4, I did not yet 
succeeded in changing this. 
 
About the unequal data sizes, I thought I could 'correct' this by using round. 
For example, when my data consists of 17 data points I would use
 
m <- length(data)/2
x <- data[1:round(m)]
y <- data[(round(m)+1):length(data)]
 
x has size 9 and y has size 8. 
 
Sincerely,

Marina de Wolff
 



From: michael.weyla...@gmail.com
Date: Thu, 11 Aug 2011 11:54:11 -0400
Subject: Re: [R] Splitting data
To: marinadewo...@hotmail.com
CC: r-help@r-project.org

This sounds very much like a recursive problem: something like this seems to 
get the gist of what you want. 

DataSplits <- function(Data, alpha = 0.05) {
    DataSplitsCore <- function(Data, alpha, level) {
        tt <- t.test(Data[,1],Data[,2])
        print(tt)
        if (tt$p.value > alpha) {
            print(paste("Stopped at level", level))
            return(invisible(TRUE))
        } else {
            nr = floor(NROW(Data)/2)
            if (nr == 1) {print(paste("Reached Samples of Size 1")); stop}
            d1 = DataSplitsCore(Data[(1:nr),], alpha = alpha, level = level + 1)
            if (d1) return(invisible(TRUE))
            d2 = DataSplitsCore(Data[-(1:nr),], alpha = alpha, level = level +1)
            if (d2) return(invisible(TRUE))
            return(invisible(FALSE))
        }
    }
    DataSplitsCore(Data, alpha = alpha, level = 1)
}

Your description wasn't the clearest about what to do when the data sizes 
didn't match, but this should give you a start. Let me know if this doesn't do 
as desired and I can help tweak it. 

Hope this can be of help, 

Michael Weylandt 

PS -- You might as well use R's built in t.test function. 


On Thu, Aug 11, 2011 at 5:17 AM, Marina de Wolff <marinadewo...@hotmail.com> 
wrote:


I want to implement the following algorithm in R:

I want to split my data, use a t test to compare both means of the groups to 
see if they significantly differ from each other. If this is a yes (p < alpha) 
I want to split again (into 4 groups) and do the same procedure twice,  and 
stop otherwise (here the problem arises). As a final result I would have 
different groups of data.

I made some code where the data is splitted, until no splitting is possible. So 
for 16 datapoints, we can split 4 times with a final result of 16 groups (p is 
NA for the 4th split since sd cannot be calculated..).

The code calculated all p values, but I don't want this. I want it to stop when 
p > alpha. I tried while, but didn't succeed.

I hope someone can help me to acchieve my goal.

This is what I tried so far with test data:

a = rnorm(9,0,0.1)
b = rnorm(7,1,0.1)
data = c(a,b)
plot(data)

# Want to calculate max of groups/split for the data
d = seq(1,100,1)
n = 2^d
m <- which(n <=length(data))
n = n[m[1]:m[length(m)]]

# All groups
i=0
j=0
dx = 0
dy =
for (i in 1:length(n)){
split <- length(data)/(n[i])
for (j in 1:(n[i]/2)){
x = data[(1 + (j-1)*(2*split)):(round(split) + (j-1)*(2*split))]
dx = cbind(dx,x)
y = data[((round(split)+1) + (j-1)*(2*split)):(2*j*split)]
dy = cbind(dy,y)
}}

dx = dx[,2:dim(dx)[2]]
dy = dy[,2:dim(dy)[2]]

k=0
meanx=0
meany=0
sdx=0
sdy=0
nx=0
ny=0
for (k in 1:dim(dx)[2]) {
meanx[k] = mean(unique(dx[,k]))
meany[k] = mean(unique(dy[,k]))
sdx[k] = sd(unique(dx[,k]))
sdy[k] = sd(unique(dy[,k]))
nx[k] = length(unique(dx[,k]))
ny[k] = length(unique(dy[,k]))
}

t = (meanx-meany)/sqrt((sdx^2/nx) + (sdy^2/ny))
df = ((sdx^2/nx) + (sdy^2/ny))^2/((sdx^2/nx)^2/(nx-1) + (sdy^2/ny)^2/(ny-1))
p = 2*pt(-abs(t),df=df)
alpha = 0.05
       [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

                                          
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to