Hello,
I must first apologize for the error in my first question. With
 dataframe$A = 20 dataframe$B = NA
 
 dataframe$A + dataframe$B actually returns NA, not zero.

You quite understand my point of view. This is household-level data, where the 
total income of each household member has to be computed before aggregating by 
household. 
Assume you have a household with 5 members:
4 of the 5 household members have a full-time job;
3 of the household members do not have a part-time job, so the part-time job 
column records NA for these three members;
1 of the household members has neither a full-time nor a part-time job.
When I add the full-time and part-time job columns for the five household 
members, R returns NA as the total income for the two members who have only a 
full-time job. Their total income should at least equal their full-time job 
income. Based on this scenario, only the one member with neither job should 
have NA for the total income.
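
For concreteness, the scenario above can be sketched in R as follows. The column names `fulltime` and `parttime` and all income values are made up for illustration; they are not from my actual data.

```r
# Hypothetical household of five members; all incomes are invented.
hh <- data.frame(
  member   = 1:5,
  fulltime = c(100, 200, 300, 400, NA),  # member 5 has no full-time job
  parttime = c(10, 20, NA, NA, NA)       # members 3, 4 and 5 have no part-time job
)

# Plain addition propagates NA whenever either term is missing.
hh$total <- hh$fulltime + hh$parttime
print(hh)

# Row numbers whose total is NA: members 3, 4 and 5,
# although only member 5 has no income at all.
which(is.na(hh$total))
```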
  
This is just the first step, because I will subsequently need to compute means. 
If I go ahead and replace the NAs with zeros, it will bias my mean. 

So all I need is a way to retain my NAs so that my mean and other relevant 
computations will not be biased.
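
One possible approach, offered only as a sketch: sum each row with `rowSums(..., na.rm = TRUE)` and then put NA back for rows where every income column is missing. The data frame `hh`, its column names and its values are hypothetical, and this may not suit every downstream computation.

```r
# Hypothetical data: member 5 has neither a full-time nor a part-time job.
hh <- data.frame(
  fulltime = c(100, 200, 300, 400, NA),
  parttime = c(10, 20, NA, NA, NA)
)
cols <- c("fulltime", "parttime")

# Sum across the income columns, skipping NAs within each row.
total <- rowSums(hh[, cols], na.rm = TRUE)

# rowSums(..., na.rm = TRUE) gives 0 for a row that is entirely NA,
# so restore NA for members with no income information at all.
all_missing <- rowSums(!is.na(hh[, cols])) == 0
total[all_missing] <- NA
hh$total <- total

# The mean then skips only the member whose total is genuinely missing.
mean(hh$total, na.rm = TRUE)
```

With these invented numbers the totals are 110, 220, 300, 400 and NA, so the mean is taken over four members rather than five. Replacing the NA with zero would instead divide the same sum by five and pull the mean down.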

Thank you
 
--------------------------------------------
On Sun, 9/6/15, Rolf Turner <r.tur...@auckland.ac.nz> wrote:

 Subject: Re: [FORGED] [R] Handling "NA" in summation

 Date: Sunday, September 6, 2015, 7:16 PM

 On 07/09/15 10:22, Olu Ola via R-help wrote:
 > Hello, I am currently working with a dataframe which has some missing
 > values represented by "NA". Whenever I add two columns in which at
 > least one of the pair of an observation is "NA", the sum returns
 > zero. That is, for the same observation, if
 >
 > dataframe$A = 20 dataframe$B = NA
 >
 > dataframe$A + dataframe$B  returns zero.

 No it does not.  It returns NA.  As it should.

 > I do not want to delete the observations with the NA's. How do I go
 > about carrying out the necessary operations without deleting the
 > observations with the NA's.

 Your question seems to demonstrate a substantial amount of confusion.

 Amongst other things you probably want to deal with vectors (or perhaps
 matrices) rather than data frames.

 To sum a numeric vector, ignoring missing values, you can use the sum()
 function, setting the argument "na.rm" to TRUE.  E.g.

     v <- c(1,NA,2,NA,3,NA,4,NA)
     sum(v,na.rm=TRUE) # Gives 10.

 Ignore other advice that you were given, to replace NAs in your data
 frame (???) by zeroes.  That is very dangerous, misleading and
 confusing.  "Missing" and "zero" are *VERY* different concepts.

 cheers,

 Rolf Turner


 --

 Technical Editor ANZJS
 Department of Statistics
 University of Auckland
 Phone: +64-9-373-7599 ext. 88276

