Hi,
Regarding the second issue, the mean of 3.8 being "too high": could you explain what you mean?
#Using the same example:
 dat1$V21[dat1$V2==1|dat1$V2==0]
#[1]  6  2  1 10  0
 (6+2+1+10+0)/5
#[1] 3.8
 mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8

About missing data:
set.seed(55)
dat2<- as.data.frame(matrix(sample(c(NA,0:4),26*10,replace=TRUE),ncol=26))  
####new example dataset
 dat2$V2
 #[1]  4 NA  0  0  1  3  2  4  2  1
dat2$V21
 #[1] NA  3  0  0  2  0  4  0  3 NA
(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)
# [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE
 dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)]
#[1]  0  0  2 NA
mean(dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)],na.rm=TRUE)
#[1] 0.6666667
 (0+0+2)/3
#[1] 0.6666667
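
A shorter way to write the same filter is %in%, which returns FALSE for NA, so the separate is.na() check is not needed. The sketch below is only an illustration: the vectors V2 and V21 are typed in by hand from the dat2 printout above (so it does not depend on sample()), and the names keep and res_in are made up for the example. It also shows how an unguarded comparison can give NA, and how a filter that matches no rows gives NaN.

```r
# Values copied by hand from the dat2$V2 and dat2$V21 printouts above
V2  <- c(4, NA, 0, 0, 1, 3, 2, 4, 2, 1)
V21 <- c(NA, 3, 0, 0, 2, 0, 4, 0, 3, NA)

# NA %in% c(0, 1) is FALSE, so rows with a missing V2 are dropped
keep <- V2 %in% c(0, 1)
res_in <- mean(V21[keep], na.rm = TRUE)
res_in
#[1] 0.6666667

# Unguarded comparison: the NA in V2 becomes an NA index,
# which adds an extra NA element to the subset
unguarded <- V21[V2 == 1 | V2 == 0]
mean(unguarded)                 # NA, because na.rm defaults to FALSE

# If no row matches at all, the mean of an empty vector is NaN
mean(V21[V2 %in% c(99)])
```

So "NA" usually points to missing values in either column, and "NaN" to a condition that selects no rows.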


If this doesn't solve the problem, please provide a reproducible example using 
dput() (see ?dput), 
e.g.:
dput(head(dataset,20))
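
In case it helps to see what dput() is for, here is a rough sketch with a made-up three-row data frame (small, txt and rebuilt are hypothetical names, not from your data). dput() prints R code that recreates the object, so whoever reads your e-mail can paste it and get the same data, including the NA values.

```r
# Hypothetical stand-in for a few rows of the real data
small <- data.frame(V2 = c(0, 1, NA), V21 = c(6, 2, 1))

# The text that dput() prints is what you would paste into the e-mail
txt <- capture.output(dput(small))
cat(txt, sep = "\n")

# A reader can rebuild the object from that text
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```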

A.K.



When I enter that formula I get "NA" or "NaN" as an answer.  I have some 
missing data, which was entered as NA, so I'm not sure if that is the 
problem.  Originally I thought I would need to run the entire set of 
commands you posted, but that gave me 3.8 as a mean, which I know is 
too high to be the mean for this data set. 

Thanks 



----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: R help <r-help@r-project.org>
Cc: 
Sent: Friday, July 12, 2013 8:21 AM
Subject: Re: Help with IF command strings

Hi,

Not sure I understand your question.
Suppose `data1` is your real data. If its column names are different, 
replace "V21" and "V2" with the names used in the real data. Based on your 
initial post, the column names seemed to be the same.
mean(data1$V21[data1$V2==1|data1$V2==0])

A.K.  


Which values would I replace with my real data?  I did everything the way 
you posted, and I got 3.8 as well.  So I'm curious what I would 
change to get the mean for the actual data. 


----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: R help <r-help@r-project.org>
Cc: 
Sent: Thursday, July 11, 2013 9:21 PM
Subject: Re: Help with IF command strings

Hi,
Try this:
set.seed(485)
dat1<- as.data.frame(matrix(sample(0:10,26*10,replace=TRUE),ncol=26))
mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8
#or
with(dat1,mean(V21[V2==1|V2==0]))
#[1] 3.8
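
If you later want the mean separately for each condition rather than for 0 and 1 pooled, one possible approach is tapply(). This is only a sketch on made-up numbers: toy, sub and group_means are hypothetical names, and I am assuming V2 holds the condition and V21 the outcome, as in your post.

```r
# Hypothetical toy data: V2 is the condition, V21 the measured value
toy <- data.frame(V2  = c(0, 1, 0, 1, 2, 0),
                  V21 = c(6, 2, 1, 10, 7, 2))

# Keep only participants in condition 0 or 1
sub <- subset(toy, V2 %in% c(0, 1))

# One mean per condition:
# condition 0: (6 + 1 + 2) / 3 = 3, condition 1: (2 + 10) / 2 = 6
group_means <- tapply(sub$V21, sub$V2, mean, na.rm = TRUE)
group_means

# Pooled mean over both conditions, as in the one-liner above
pooled <- mean(sub$V21, na.rm = TRUE)
```

Replacing mean with another function (sd, median, ...) gives the corresponding summary for each condition.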


A.K.


I have data in 26 columns. I'm trying to get a mean for column 21, but only for the 
participants that are either 0 or 1 in column 2. 

One of the commands I tried looked something like this: 

mean(data1$V21, if(V2 = 1))   

So basically I need to have the program run a mean (and later 
other forms of analysis) on participants based on their condition, 
either 0 or 1. 

Help is greatly appreciated. 

Thanks

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
