Hi,

Regarding the second issue, the mean of 3.8 being "too high": could you explain why you think so?

# Using the same example:
dat1$V21[dat1$V2==1|dat1$V2==0]
#[1]  6  2  1 10  0
(6+2+1+10+0)/5
#[1] 3.8
mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8

About missing data:

set.seed(55)
dat2<- as.data.frame(matrix(sample(c(NA,0:4),26*10,replace=TRUE),ncol=26)) #### new example dataset
dat2$V2
# [1]  4 NA  0  0  1  3  2  4  2  1
dat2$V21
# [1] NA  3  0  0  2  0  4  0  3 NA
(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)
# [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE
dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)]
#[1]  0  0  2 NA
mean(dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)],na.rm=TRUE)
#[1] 0.6666667
(0+0+2)/3
#[1] 0.6666667

If this doesn't solve the problem, please provide a reproducible example using ?dput(), e.g.:
dput(head(dataset,20))

A.K.


When I enter that formula I get "NA" or "NaN" as an answer. I have some missing data, which was entered as NA, so I'm not sure if that is the problem. Originally I thought I would need to run the entire set of commands you posted, but that gave me 3.8 as a mean, which I know is too high to be the mean for this data set.

Thanks


----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: R help <r-help@r-project.org>
Cc:
Sent: Friday, July 12, 2013 8:21 AM
Subject: Re: Help with IF command strings

Hi,

Not sure I understand your question. Suppose `data1` is your real data. If its column names are different, replace "V21" and "V2" with the names used in the real data. Based on your initial post, the column names seemed to be the same.

mean(data1$V21[data1$V2==1|data1$V2==0])

A.K.


What values would I substitute with the real data? I did everything the way you posted, and I got 3.8 as well. So I'm curious what values I would change to get the mean for the actual data.


----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: R help <r-help@r-project.org>
Cc:
Sent: Thursday, July 11, 2013 9:21 PM
Subject: Re: Help with IF command strings

Hi,

Try this:

set.seed(485)
dat1<- as.data.frame(matrix(sample(0:10,26*10,replace=TRUE),ncol=26))
mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8
#or
with(dat1,mean(V21[V2==1|V2==0]))
#[1] 3.8

A.K.

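For the "NA" or "NaN" result reported in this thread, here is a minimal sketch of where each one may come from. The data frame `toy` and its values are made up for illustration; only the column names V2 and V21 follow the thread. Using `%in%` is one alternative to the `==` plus `!is.na()` filter, not the only way to do it.

```r
# Hypothetical toy data; only the column names V2 and V21 come from the thread.
toy <- data.frame(V2  = c(4, NA, 0, 0, 1),
                  V21 = c(5, 3, 0, 2, 4))

# Comparing NA with == gives NA, so the logical index contains an NA ...
idx_eq <- toy$V2 == 1 | toy$V2 == 0
# ... and indexing with an NA puts an NA element into the selection.
sel_eq <- toy$V21[idx_eq]
mean(sel_eq)                      # NA, because one selected element is NA

# %in% never returns NA: a missing V2 simply does not match.
idx_in <- toy$V2 %in% c(0, 1)
sel_in <- toy$V21[idx_in]
mean(sel_in, na.rm = TRUE)        # na.rm still guards against NAs in V21 itself

# NaN shows up when the selection is empty (no row has V2 equal to 9 here).
empty_mean <- mean(toy$V21[toy$V2 %in% 9])
```

If the real data gives NaN, it may be worth checking with `table(data1$V2, useNA = "ifany")` whether any rows actually have the condition values 0 or 1.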
I have data in 26 columns. I'm trying to get a mean for column 21 only for the participants that are either 0 or 1 in column 2. One of the commands I tried looked something like this:

mean(data1$V21, if(V2 = 1))

So basically I need to have the program run a mean (and later other forms of analysis) on participants based on their condition, either 0 or 1. Help is greatly appreciated.

Thanks

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
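Since the original question asks for a mean (and later other analyses) by condition, here is a minimal sketch of one way to get both the pooled mean and a separate mean per condition. The data frame `toy` and the names `kept`, `pooled` and `by_cond` are hypothetical; the real data would replace `toy`, with V2 and V21 renamed if its columns differ.

```r
# Hypothetical toy data standing in for the real 26-column data set.
toy <- data.frame(V2  = c(0, 1, 1, 0, 2, NA),
                  V21 = c(1, 5, NA, 3, 9, 1))

# Keep only participants whose condition is 0 or 1 (rows with NA in V2 drop out).
kept <- toy[toy$V2 %in% c(0, 1), ]

# Pooled mean of V21 over both conditions, ignoring missing V21 values.
pooled <- mean(kept$V21, na.rm = TRUE)

# Separate mean of V21 for each condition; na.rm is passed through to mean().
by_cond <- tapply(kept$V21, kept$V2, mean, na.rm = TRUE)
by_cond
```

Swapping `mean` for another summary function such as `sd` or `median` in the `tapply()` call gives the same per-condition breakdown for that statistic.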