Dear R community members

I have been struggling with this simple question but have not found an
appropriate solution, so please help.

# my data (the real data set has a large number of variables)
var1 <- rnorm(500, 10,4)
var2 <- rnorm(500, 20, 8)
var3 <- rnorm(500, 30, 18)
var4 <- rnorm(500, 40, 20)
datafr1 <- data.frame(var1, var2, var3, var4)

# my unsuccessful code
nvar <- ncol(datafr1)
for (i in 1:nvar) {
    out1 <- NULL
    out2 <- NULL
    medianx <- median(getdata[,i], na.rm = TRUE)
    show(madx <- mad(getdata[,i], na.rm = TRUE))
    MD1 <- c(medianx + 2*madx)
    MD2 <- c(medianx - 2*madx)
    out1[i] <- which(getdata[,i] > MD1) # store data that are greater than median + 2 mad
    out2[i] <- which (getdata[,1] < MD2) # store data that are less than median - 2 mad
    resultdf <- data.frame(out1, out2)
    write.table (resultdf, "out.csv", sep=",")
}


My idea is to store the values that are either greater than median + 2*MAD
or less than median - 2*MAD. Each variable will give an output of a
different length.
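
For reference, a minimal sketch of that idea, assuming the data live in datafr1 as defined above. The helper name find_outliers and its argument k are hypothetical, introduced only for illustration. Because each variable can yield a different number of hits, the results are kept in a list rather than a data frame.

```r
# Sketch: flag values outside median +/- k * MAD, one column at a time.
set.seed(1)  # assumption: fixed seed only so the example is repeatable
datafr1 <- data.frame(var1 = rnorm(500, 10, 4),
                      var2 = rnorm(500, 20, 8),
                      var3 = rnorm(500, 30, 18),
                      var4 = rnorm(500, 40, 20))

# find_outliers() is a hypothetical helper, not part of base R.
find_outliers <- function(x, k = 2) {
  med  <- median(x, na.rm = TRUE)
  madx <- mad(x, na.rm = TRUE)
  list(high = which(x > med + k * madx),   # row indices above the upper cutoff
       low  = which(x < med - k * madx))   # row indices below the lower cutoff
}

# lapply() over a data frame visits each column and returns a named list,
# so every variable may hold index vectors of a different length.
out <- lapply(datafr1, find_outliers)

# Number of flagged rows per variable on the high side
sapply(out, function(o) length(o$high))
```

The flagged values themselves can then be pulled out with, for example, datafr1$var1[out$var1$high].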

This is the last error message I get:
Error in data.frame(out1, out2) :
  arguments imply differing number of rows: 2, 0
In addition: Warning messages:
1: In out1[i] <- which(getdata[, i] > MD1) :
  number of items to replace is not a multiple of replacement length
2: In out2[i] <- which(getdata[, 1] < MD2) :
  number of items to replace is not a multiple of replacement length
3: In out1[i] <- which(getdata[, i] > MD1) :
  number of items to replace is not a multiple of replacement length
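
These messages appear to come from two things: which() returns a whole vector of indices while out1[i] addresses a single element, and data.frame() needs all of its columns to have the same length. A small sketch of one way around the second point follows, stacking results of unequal length into a "long" table before writing them out. The index values and the temporary file location are made up for illustration.

```r
# Made-up index vectors of different lengths, standing in for which() results
high <- list(var1 = c(3L, 17L),
             var2 = integer(0),
             var3 = c(5L, 8L, 21L))

# One row per flagged observation: the variable name is repeated once
# for each index it contributed, so unequal lengths are not a problem.
long <- data.frame(variable = rep(names(high), times = lengths(high)),
                   row      = unlist(high, use.names = FALSE))

# Write once, after all variables have been processed
# (assumption: a temporary directory is used here to avoid clutter).
out_file <- file.path(tempdir(), "out.csv")
write.table(long, out_file, sep = ",", row.names = FALSE)
```

Writing the file once, outside any loop, also avoids overwriting out.csv on every iteration.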

Thank you in advance for helping me.

Best regards,
RHS


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
