Dear All,

Replacing missing values with means is generally not a good idea:
"Perhaps the easiest way to impute is to replace each missing value with the mean of the observed values for that variable. Unfortunately, this strategy can severely distort the distribution for this variable, leading to complications with summary measures including, notably, underestimates of the standard deviation. Moreover, mean imputation distorts relationships between variables by “pulling” estimates of the correlation toward zero."

That's from Gelman and Hill -- more here: http://www.stat.columbia.edu/~gelman/arm/missing.pdf

best,
Fraser

________________________________________
From: Val [valkr...@gmail.com]
Sent: Wednesday, April 26, 2017 8:45 PM
To: r-help@R-project.org (r-help@r-project.org)
Subject: [R] missing and replace

Hi all,

I have a data frame with three variables. Some of the variables have missing values, and I want to replace those missing values (represented by NA) with the mean value of that variable. In this sample data, variables y and z have missing values. The mean values of y and z are 152.25 and 359.5, respectively. I want to replace those missing values with the respective mean value (rounded to the nearest whole number).

DF1 <- read.table(header=TRUE, text='ID1 x y z
1 25 122 352
2 30 135 376
3 40 NA 350
4 26 157 NA
5 60 195 360')

mean x = 36.2
mean y = 152.25
mean z = 359.5

output
ID1 x y z
1 25 122 352
2 30 135 376
3 40 152 350
4 26 157 360
5 60 195 360

Thank you in advance

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
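For anyone who still wants the replacement Val asked for, here is a minimal base R sketch using the data from the post. The helper name impute_mean and the copy DF2 are made up for this illustration; they are not from the thread or from any package. The last two lines compare the standard deviations before and after, which shows the shrinkage that Gelman and Hill warn about.

```r
# Val's example data, as posted
DF1 <- read.table(header = TRUE, text = "ID1 x y z
1 25 122 352
2 30 135 376
3 40 NA 350
4 26 157 NA
5 60 195 360")

# Hypothetical helper: replace each NA in a numeric vector with the
# rounded mean of the observed (non-missing) values.
# Note: round() rounds exact halves to the even digit, so 359.5 becomes 360.
impute_mean <- function(v) {
  v[is.na(v)] <- round(mean(v, na.rm = TRUE))
  v
}

vars <- c("x", "y", "z")
DF2 <- DF1                                  # keep the original data untouched
DF2[vars] <- lapply(DF1[vars], impute_mean)
DF2

# Standard deviations of y and z before and after mean imputation
sd_before <- sapply(DF1[c("y", "z")], sd, na.rm = TRUE)
sd_after  <- sapply(DF2[c("y", "z")], sd)
```

With this data the imputed values are 152 for y and 360 for z, matching the requested output, while the standard deviation of y drops from about 32 to about 28. If the analysis depends on variances or correlations, a proper imputation method, such as those discussed in the linked chapter, may be the safer route.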