I want to implement the following algorithm in R: I want to split my data, use a t test to compare both means of the groups to see if they significantly differ from each other. If this is a yes (p < alpha) I want to split again (into 4 groups) and do the same procedure twice, and stop otherwise (here the problem arises). As a final result I would have different groups of data. I made some code where the data is splitted, until no splitting is possible. So for 16 datapoints, we can split 4 times with a final result of 16 groups (p is NA for the 4th split since sd cannot be calculated..). The code calculated all p values, but I don't want this. I want it to stop when p > alpha. I tried while, but didn't succeed. I hope someone can help me to acchieve my goal. This is what I tried so far with test data: a = rnorm(9,0,0.1) b = rnorm(7,1,0.1) data = c(a,b) plot(data) # Want to calculate max of groups/split for the data d = seq(1,100,1) n = 2^d m <- which(n <=length(data)) n = n[m[1]:m[length(m)]] # All groups i=0 j=0 dx = 0 dy = for (i in 1:length(n)){ split <- length(data)/(n[i]) for (j in 1:(n[i]/2)){ x = data[(1 + (j-1)*(2*split)):(round(split) + (j-1)*(2*split))] dx = cbind(dx,x) y = data[((round(split)+1) + (j-1)*(2*split)):(2*j*split)] dy = cbind(dy,y) }} dx = dx[,2:dim(dx)[2]] dy = dy[,2:dim(dy)[2]] k=0 meanx=0 meany=0 sdx=0 sdy=0 nx=0 ny=0 for (k in 1:dim(dx)[2]) { meanx[k] = mean(unique(dx[,k])) meany[k] = mean(unique(dy[,k])) sdx[k] = sd(unique(dx[,k])) sdy[k] = sd(unique(dy[,k])) nx[k] = length(unique(dx[,k])) ny[k] = length(unique(dy[,k])) } t = (meanx-meany)/sqrt((sdx^2/nx) + (sdy^2/ny)) df = ((sdx^2/nx) + (sdy^2/ny))^2/((sdx^2/nx)^2/(nx-1) + (sdy^2/ny)^2/(ny-1)) p = 2*pt(-abs(t),df=df) alpha = 0.05 [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.