I thought you want to compare between the rows of two columns even if their corresponding values fall in the same row.
fun3<- function(mat){ indmat<-combn(seq_len(ncol(mat)),2) lst1<- lapply(seq_len(ncol(indmat)),function(i) {mat[,indmat[,i]]}) names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE)) lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)}) lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y) y[,1]))) lst4<-lapply(lst3,function(x) unlist(x[which.min(apply(x,1,function(y) abs(diff(y)))),]) ) lst5<- lapply(lst4,function(x){ if(abs(diff(x))>(nrow(mat)/2)){ nrow(mat)-abs(diff(x)) } else(abs(diff(x))) }) lst6<- lapply(seq_along(lst5),function(i) { x2<-lst1[[i]] if(lst5[[i]]==0) { #indx1<- seq(length(x2[,2])) #sum(abs(x2[,1]-x2[indx1,2])) 0 ######################## set to zero } else{ lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2] indx1<-seq(length(x3)-(j-1)) indx2<-c(setdiff(seq_along(x3),indx1),indx1) sum(abs(x2[,1]-x2[indx2,2])) }) } }) names(lst6)<- names(lst1) lst7<-lapply(lst6,unlist) lst8<- lapply(lst7,function(x) { Seq1<-seq_along(x) if(length(Seq1)==1) x else if(length(Seq1)==2){ sum(abs(x[1]-x[2])) } else{ ind<-rep(Seq1,each=2)[-1] ind1<-ind[-length(ind)] Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) { abs(diff(x[i])) })) } } ) lst9<-do.call(rbind,lst8) lst9 } fun3(mm) # [,1] #1_2 2.5966051 #1_3 1.0267435 #1_4 0.0000000 #1_5 1.8489204 #1_6 0.0000000 #2_3 0.0000000 #2_4 1.9040790 #2_5 2.2874235 #2_6 5.1526016 #3_4 0.9726777 #3_5 2.1359229 #3_6 5.0221450 #4_5 0.9124638 #4_6 0.0000000 #5_6 14.0550864 xx # 1 8 9 23 87 89 #[1,] 5 4 4 5 6 12 #[2,] 12 NA NA 9 NA NA #[3,] NA NA NA 12 NA NA According to xx, 1&4, 2&3, 4&6 (also 0 because both have 12) A.K. ________________________________ From: eliza botto <eliza_bo...@hotmail.com> To: "smartpink...@yahoo.com" <smartpink...@yahoo.com> Sent: Saturday, May 25, 2013 9:17 PM Subject: RE: QA thanks arun, i dont think thANKyou is enough for wat u did. anyway, there is slight modification that i want to ask to understand the codes more efficiently. what if i want to consider the distance between the columns having atleast one peak in the same month equal to zero, instead of "initial value"?? more precisely The distance between column 2 and 3 should be zero instead of 4.2951411. similarly the distance between column 4 and 6 should be zero instead of 8.260419. Thats just for my own knowledge to understand the loop. i hope you wont mind. The loop works absolutely well. Elisa > Date: Sat, 25 May 2013 18:03:33 -0700 > From: smartpink...@yahoo.com > Subject: Re: QA > To: eliza_bo...@hotmail.com > CC: r-help@r-project.org > > Hi, > I hope this works for you. > fun1<- function(x){ > big<- x>0.8*max(x) > n<- length(big) > startRunOfBigs<- which(c(big[1],!big[-n] & big[-1])) > endRunOfBigs<- which(c(big[-n] & !big[-1], big[n])) > index<- vapply(seq_along(startRunOfBigs),function(i) > which.max(x[startRunOfBigs[i]:endRunOfBigs[i]])+startRunOfBigs[i]-1L,0L) > index<-ifelse(sum(is.na(match(index,c(1,12))))==0 & > x[index]!=max(x[index]), NA,index) > data.frame(Index=index[!is.na(index)],Value=x[index[!is.na(index)]]) > } > > ##mm: data > fun3<- function(mat){ > indmat<-combn(seq_len(ncol(mat)),2) > lst1<- lapply(seq_len(ncol(indmat)),function(i) {mat[,indmat[,i]]}) > > names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE)) > > lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)}) > > lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y) y[,1]))) > lst4<-lapply(lst3,function(x) unlist(x[which.min(apply(x,1,function(y) > abs(diff(y)))),]) ) > lst5<- lapply(lst4,function(x){ > if(abs(diff(x))>(nrow(mat)/2)){ > nrow(mat)-abs(diff(x)) > } > else(abs(diff(x))) > }) > > lst6<- lapply(seq_along(lst5),function(i) { > x2<-lst1[[i]] > if(lst5[[i]]==0) { > indx1<- seq(length(x2[,2])) > sum(abs(x2[,1]-x2[indx1,2])) > } > else{ > lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2] > indx1<-seq(length(x3)-(j-1)) > indx2<-c(setdiff(seq_along(x3),indx1),indx1) > sum(abs(x2[,1]-x2[indx2,2])) > }) > } > }) > > names(lst6)<- names(lst1) > lst7<-lapply(lst6,unlist) > lst8<- lapply(lst7,function(x) { > Seq1<-seq_along(x) > if(length(Seq1)==1) x > else if(length(Seq1)==2){ > sum(abs(x[1]-x[2])) > } > else{ > ind<-rep(Seq1,each=2)[-1] > ind1<-ind[-length(ind)] > > Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) { > abs(diff(x[i])) > })) > } > > } > ) > do.call(rbind,lst8) > } > > fun3(mm) #rownames represent the comparison between the particular columns > # [,1] > #1_2 2.5966051 > #1_3 1.0267435 > #1_4 3.7387830 > #1_5 1.8489204 > #1_6 6.5233654 > #2_3 4.2951411 > #2_4 1.9040790 > #2_5 2.2874235 > #2_6 5.1526016 > #3_4 0.9726777 > #3_5 2.1359229 > #3_6 5.0221450 > #4_5 0.9124638 > #4_6 8.2604187 > #5_6 14.0550864 > > > A.K. > > > > > > ________________________________ > From: eliza botto <eliza_bo...@hotmail.com> > To: "smartpink...@yahoo.com" <smartpink...@yahoo.com> > Sent: Saturday, May 25, 2013 2:14 PM > Subject: QA > > > > > Dear Arun, > [text file is attached] > After your help on preparing loop for identifying peaks, here is my latest > question which is linked with my first question. but this time i will try to > make it more clear. > > > dput(xx) > structure(c(5L, 12L, NA, 4L, NA, NA, 4L, NA, NA, 5L, 9L, 12L, > 6L, NA, NA, 12L, NA, NA), .Dim = c(3L, 6L), .Dimnames = list( > NULL, c("1", "8", "9", "23", "87", "89"))) > > dput(mm) > structure(c(0.706461987893674, 0.998391468394261, 0.72402995269242, > 1.70874688194537, 1.93906363083693, 0.89540353128442, 0.328327645695443, > 0.427434603701202, 0.591932250254601, 0.444627635494183, 1.44407704434405, > 1.79150336746345, 0.740380661614246, 1.39756784211974, 1.43602731683199, > 2.40482060634346, 1.61684982192949, 0.549848553223765, 0.245763715425745, > 0.315411788974968, 0.390626431538384, 0.369934560068472, 0.769100067815155, > 1.76366863411459, 0.480885978853889, 1.21441674507622, 2.50566408677391, > 3.27361599826255, 1.18508780425679, 0.465943778037697, 0.29380145690883, > 0.36356245877522, 0.373314458026047, 0.334849362386475, 0.882050057788756, > 0.626807814853613, 0.774295647517675, 0.853105130179133, 0.738085443815565, > 1.26063449947807, 1.57350832698427, 0.790095501697794, 0.510641105191147, > 0.874523657118082, 1.31257333325184, 0.882086374572265, 1.13881207205977, > 1.29163890813439, 0.0849732189580101, 0.070591276171845, 0.0926010253161898, > 0.362209761457517, 1.45769283057202, 3.16165004659667, 2.74903557756267, > 1.94633472878995, 1.19319875840883, 0.533232612926756, 0.225531074123974, > 0.122949089115578, 2.06195904001605, 1.41493262330451, 1.35748791897328, > 1.19490680241894, 0.702488756183322, 0.338258418490199, 0.123398398622741, > 0.138548982660226, 0.16170889185798, 0.414543218677095, 1.84629295875002, > 2.24547399004563), .Dim = c(12L, 6L)) > > > You can see that that there are two matrices. "mm" is the actual matrix and > "xx" is the matrix indentifying the peaks of "mm".For being a peak a value > has to either the maximum value or atleast 80% of the maximum value. you can > see that the maximum value of coulmn 1 is in row number 5 and thats what it > showed in matrix "xx" whereas, the 80% of the maximum value is in row number > 12 therefore it considered it the second peak and row number was shown in > "xx". i want to calculate the distance matrix of "mm" in the following way... > The column are continous or cyclic. > The subtraction should start from the peak and should end when the peaks of > two columns are in the same row. The peaks are to be moved towrds eachother > in the shortest possible way. > For suppose the peak of colum 2 is in 4th row and the peak of column 6 is in > 12th row. Now moving these two peak towwards eachother requires moving col 2 > in reverse direction or column 6 in forward direction. > > For example > > Initial: > > Col 2 > > 1 2 3 4(max) 5 6 7 8 9 10 11 12 > > Col 6 > > 1 2 3 4 5 6 7 8 9 10 11 12(max) > > a<-sum(abs(col2-col6)) > > step1: > > Col 2 > > 2 3 4(max) 5 6 7 8 9 10 11 12 1 > > Col 6 > > 1 2 3 4 5 6 7 8 9 10 11 12(max) > > b<-sum(abs(col2-col6)) > > step2: > > Col 2 > > 3 4(max) 5 6 7 8 9 10 11 12 1 2 > > Col 6 > > 1 2 3 4 5 6 7 8 9 10 11 12(max) > > c<-sum(abs(col2-col6)) > > step3: > > Col 2 > > 4(max) 5 6 7 8 9 10 11 12 1 2 3 > > Col 6 > > 1 2 3 4 5 6 7 8 9 10 11 12(max) > > d<-sum(abs(col2-col6)) > > step4: > > Col 2 > > 5 6 7 8 9 10 11 12 1 2 3 4(max) > > Col 6 > > 1 2 3 4 5 6 7 8 9 10 11 12(max) > > e<-sum(abs(col2-col6)) > > total difference= abs(a-b)+abs(b-c)+abs(c-d)+abs(d-e) > > > The dissimilarity is zero if the peaks are already in the same row. like for > column 2 and 3 the distance is zero as peaks are under eachother. For column > 1 and 4 the distance is onceagain zero. Although they have different nuber of > peaks but as atleast one of their peaks is under eachother therefore distance > is zero. > > For Column 5 and 6 peaks can be moved in either direction as number of steps > to be followed are same. > > for column 1 and 2 following is the procedure > > Col1 has two maximum values in row 5th and 12th and column two has only one > maximum value at 4 row. As peak in 5th row of column one is closer to the > peak of column 2 therefore we will move towards it and procedure should be > > > Initial: > > Col 1 > > 1 2 3 4 5(max) 6 7 8 9 10 11 12(max) > > Col 8 > > 1 2 3 4(max) 5 6 7 8 9 10 11 12 > > a<-sum(abs(col1-col8)) > > Step1: > > Col 1 > > 1 2 3 4 5(max) 6 7 8 9 10 11 12(max) > > Col 8 > > 12 1 2 3 4(max) 5 6 7 8 9 10 11 > > b<-sum(abs(col1-col8)) > > total difference=abs(a-b) > > For column 4 and 5 > > Initial: > > Col 4 > > 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max) > > Col 5 > > 1 2 3 4 5 6(max) 7 8 9 10 11 12 > > a<-sum(abs(col4-col5)) > > Step 1 > > Col 4 > > 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max) > > Col 5 > > 2 3 4 5 6(max) 7 8 9 10 11 12 1 > > b<-sum(abs(col4-col5)) > > Total Difference= abs(a-b) > > If there is any point which i couldnt discuss please tell me... > > > Elisa ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.