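Hi,

You could also try:

# dat1 here is the data frame as read in by read.table() in the quoted
# message below (before output2 is added); this drops its last column, "output".
dat2 <- dat1[-ncol(dat1)]

fun1 <- function(dat, value) {
  datNew <- dat
  n1 <- ncol(datNew)
  indx1 <- seq(1, n1, by = 2)  # positions of the x columns
  indx2 <- indx1 + 1           # positions of the paired y columns
  # Blank out every y whose paired x is not greater than value
  # (<= rather than <, so that x must be strictly greater, as requested).
  datNew[indx2][datNew[indx1] <= value] <- NA
  dat$output <- rowMeans(datNew[indx2], na.rm = TRUE)
  dat
}

fun1(dat2, 10)
#  x1  y1  x2  y2   x3 y3   output
#1  2 100 190  99 1430 79 89.00000
#2  2 100 192  63 1431 75 69.00000
#3  2 100 192  63 1444 51 57.00000
#4  3   0 195  99 1499 50 74.50000
#5  3   0 198  98 1500 80 89.00000
#6 30   0 198 100 1451 97 65.66667
#7 32 100 868 100 1451 97 99.00000
#8 33  82 870 100 1490 97 93.00000
#9 33   0 871  82 1494 85 55.66667

A.K.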
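For comparison, here is a minimal sketch of the same rule that pairs the columns by name rather than by position. It is not from the thread: the helper name masked_row_mean and its threshold argument are made up for illustration, and it assumes the columns really are named x1..xn and y1..yn as in the example data.

```r
# Hypothetical helper (not from the thread): row-wise mean of the y columns,
# counting a y value only where the paired x value is greater than `threshold`.
# Assumes columns are named x1..xn and y1..yn, as in the example data.
masked_row_mean <- function(dat, threshold = 10) {
  x_cols <- grep("^x[0-9]+$", names(dat), value = TRUE)
  y_cols <- sub("^x", "y", x_cols)               # pair each xN with yN by name
  stopifnot(all(y_cols %in% names(dat)))
  y <- as.matrix(dat[y_cols])
  y[!(as.matrix(dat[x_cols]) > threshold)] <- NA # mask y where x <= threshold
  rowMeans(y, na.rm = TRUE)                      # NaN if no x in the row qualifies
}

# Rows 1, 4 and 6 of the data posted in the question
dat <- read.table(text = "
x1 y1 x2 y2 x3 y3
2 100 190 99 1430 79
3 0 195 99 1499 50
30 0 198 100 1451 97
", header = TRUE)

dat$output <- masked_row_mean(dat)
dat
```

For the first of these rows only x2 and x3 exceed 10, so the result is (99 + 79) / 2 = 89, which agrees with the output column in the question. Matching by name should keep working if extra columns are present or the column order changes, at the cost of depending on the naming scheme.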
----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: Tom Oates <toate...@gmail.com>
Cc: R help <r-help@r-project.org>
Sent: Tuesday, June 11, 2013 5:23 PM
Subject: Re: [R] Add a column to a dataframe based on multiple other column values

Hi,

Maybe this helps:

dat1 <- read.table(text="
x1 y1 x2 y2 x3 y3 output
2 100 190 99 1430 79 89
2 100 192 63 1431 75 69
2 100 192 63 1444 51 57
3 0 195 99 1499 50 74.5
3 0 198 98 1500 80 89
30 0 198 100 1451 97 65.66666667
32 100 868 100 1451 97 99
33 82 870 100 1490 97 93
33 0 871 82 1494 85 55.66666667
",sep="",header=TRUE)

# Column 7 ("output") is dropped so each row holds only the x/y pairs.
dat1$output2 <- apply(dat1[,-7], 1, function(x) {
  indx <- ((seq(x)-1) %% 2 + 1)  # 1 for x positions, 2 for y positions
  indx1 <- indx == 1
  indx2 <- indx == 2
  mean(x[indx2][x[indx1] > 10])  # mean of the y values whose paired x > 10
})

dat1
#  x1  y1  x2  y2   x3 y3   output  output2
#1  2 100 190  99 1430 79 89.00000 89.00000
#2  2 100 192  63 1431 75 69.00000 69.00000
#3  2 100 192  63 1444 51 57.00000 57.00000
#4  3   0 195  99 1499 50 74.50000 74.50000
#5  3   0 198  98 1500 80 89.00000 89.00000
#6 30   0 198 100 1451 97 65.66667 65.66667
#7 32 100 868 100 1451 97 99.00000 99.00000
#8 33  82 870 100 1490 97 93.00000 93.00000
#9 33   0 871  82 1494 85 55.66667 55.66667

A.K.

----- Original Message -----
From: Tom Oates <toate...@gmail.com>
To: r-help@r-project.org
Cc:
Sent: Tuesday, June 11, 2013 12:07 PM
Subject: [R] Add a column to a dataframe based on multiple other column values

Hi

I have a dataframe as below:

x1 y1 x2 y2 x3 y3 output
2 100 190 99 1430 79 89
2 100 192 63 1431 75 69
2 100 192 63 1444 51 57
3 0 195 99 1499 50 74.5
3 0 198 98 1500 80 89
30 0 198 100 1451 97 65.66666667
32 100 868 100 1451 97 99
33 82 870 100 1490 97 93
33 0 871 82 1494 85 55.66666667

In reality the dataframe has many more pairs of x and y columns. As the column labelled output shows, I want to calculate, for each row, the mean of the yn columns, but include a yn value in that mean only if the corresponding xn value is greater than 10.
So for row 1, only y2 and y3 are included in calculating the output column, but for row 6, y1 to y3 are all included. Because the number of paired x and y columns is large, I am not sure of the best way to achieve this.

Thanks in advance
Tom

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.