Dear R community members,

I have been struggling with this simple question but have not found an appropriate solution, so please help.

# my data, though I have a large number of variables
var1 <- rnorm(500, 10, 4)
var2 <- rnorm(500, 20, 8)
var3 <- rnorm(500, 30, 18)
var4 <- rnorm(500, 40, 20)
datafr1 <- data.frame(var1, var2, var3, var4)

# my unsuccessful code
nvar <- ncol(datafr1)
for (i in 1:nvar) {
  out1 <- NULL
  out2 <- NULL
  medianx <- median(getdata[,i], na.rm = TRUE)
  show(madx <- mad(getdata[,i], na.rm = TRUE))
  MD1 <- c(medianx + 2*madx)
  MD2 <- c(medianx - 2*madx)
  out1[i] <- which(getdata[,i] > MD1)  # store data that are greater than median + 2 mad
  out2[i] <- which(getdata[,1] < MD2)  # store data that are less than median - 2 mad
  resultdf <- data.frame(out1, out2)
  write.table(resultdf, "out.csv", sep=",")
}

My idea here is to store those values that are either greater than median + 2*MAD or less than median - 2*MAD. Each variable gives an output of a different length.

This is the last error message I get:

Error in data.frame(out1, out2) :
  arguments imply differing number of rows: 2, 0
In addition: Warning messages:
1: In out1[i] <- which(getdata[, i] > MD1) :
  number of items to replace is not a multiple of replacement length
2: In out2[i] <- which(getdata[, 1] < MD2) :
  number of items to replace is not a multiple of replacement length
3: In out1[i] <- which(getdata[, i] > MD1) :
  number of items to replace is not a multiple of replacement length

Thank you in advance for helping me.

Best regards,
RHS
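
For illustration, here is a minimal sketch of one way the idea above (keep the values beyond median +/- 2*MAD, where each variable yields a different number of values) might be written. It assumes the data frame meant by getdata is datafr1, and it stores the per-variable results in a list, since a list can hold elements of different lengths where data.frame(out1, out2) cannot. The helper name find_outliers, its argument k, the long output layout, and the set.seed() call are assumptions added for the example and are not from the post.

```r
# Sketch only: flag values beyond median +/- 2 * MAD in each column.
set.seed(1)  # assumption: added only so the example is reproducible

datafr1 <- data.frame(
  var1 = rnorm(500, 10, 4),
  var2 = rnorm(500, 20, 8),
  var3 = rnorm(500, 30, 18),
  var4 = rnorm(500, 40, 20)
)

# Hypothetical helper: returns the row positions and values of x that lie
# more than k MADs away from the median (NA comparisons are dropped by which()).
find_outliers <- function(x, k = 2) {
  medianx <- median(x, na.rm = TRUE)
  madx <- mad(x, na.rm = TRUE)
  idx <- which(x > medianx + k * madx | x < medianx - k * madx)
  data.frame(row = idx, value = x[idx])
}

# One list element per variable; each element may have a different length.
outliers <- lapply(datafr1, find_outliers)

# Stack the pieces into one long table with a column naming the variable.
pieces <- Map(function(d, nm) {
  data.frame(variable = rep(nm, nrow(d)), d)
}, outliers, names(outliers))
resultdf <- do.call(rbind, unname(pieces))
rownames(resultdf) <- NULL

# Write once, after the loop over variables has finished.
write.csv(resultdf, "out.csv", row.names = FALSE)
```

The long layout (one row per flagged value) sidesteps the "differing number of rows" error because no two columns have to line up across variables. Writing the file once at the end also avoids overwriting out.csv on every iteration.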