Hello all,

I'm new to R and trying to figure out how to perform calculations on a large dataset (300 000 data points). I have already written some code to do this, but it is awfully slow.

What I want to do is add a new column for each "rep_" column, in which each value is divided by the mean of all values that share the same "PlateNo".

My data is in the following format:

> data
PlateNo Well rep_1 rep_2 rep_3
      1  A01  1312   963  1172
      1  A02 10464  6715  5628
      1  A03  3301  3257  3281
      1  A04  3895  3350  3496
      1  A05  8731  7389  5701
      2  A01  7893  6748  5920
      2  A02  2912  2385  2586
      2  A03   985   785   809
      2  A04  1346  1018  1001
      2  A05   794   314   486

To generate it, copy:

a <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
b <- c("A01", "A02", "A03", "A04", "A05", "A01", "A02", "A03", "A04", "A05")
c <- c(1312, 10464, 3301, 3895, 8731, 7893, 2912, 985, 1346, 794)
d <- c(963, 6715, 3257, 3350, 7389, 6748, 2385, 785, 1018, 314)
e <- c(1172, 5628, 3281, 3496, 5701, 5920, 2586, 809, 1001, 486)
data <- data.frame(plateNo = a, Well = b, rep_1 = c, rep_2 = d, rep_3 = e)

Here is the code I have come up with:

rows <- length(data$plateNo)
reps <- 3
norm <- list()
for (rep in 1:reps) {
    x <- paste("rep_", rep, sep = "")
    normx <- paste("normalised_", rep, sep = "")
    for (row in 1:rows) {
        plateMean <- mean(data[[x]][data$plateNo == data$plateNo[row]])
        wellData <- data[[x]][row]
        norm[[normx]][row] <- wellData / plateMean
    }
}

Any help or tips would be greatly appreciated!

Thanks,
Haakon

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
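
For comparison, here is a minimal vectorised sketch of the calculation described above (each value divided by its plate mean). It is one possible approach, not necessarily the fastest one. It assumes the grouping column is spelled plateNo, as in the data.frame() call, and that every replicate column name starts with "rep_". Unlike the loop above, it writes the results into new normalised_* columns of data rather than into a separate list. The names rep_cols and norm_col are only illustrative.

```r
# Rebuild the example data from the post.
data <- data.frame(
  plateNo = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  Well    = c("A01", "A02", "A03", "A04", "A05",
              "A01", "A02", "A03", "A04", "A05"),
  rep_1   = c(1312, 10464, 3301, 3895, 8731, 7893, 2912, 985, 1346, 794),
  rep_2   = c(963, 6715, 3257, 3350, 7389, 6748, 2385, 785, 1018, 314),
  rep_3   = c(1172, 5628, 3281, 3496, 5701, 5920, 2586, 809, 1001, 486)
)

# Assumption: all replicate columns are named rep_1, rep_2, ...
rep_cols <- grep("^rep_", names(data), value = TRUE)

for (col in rep_cols) {
  norm_col <- sub("^rep_", "normalised_", col)
  # ave() returns, for every row, the mean of `col` over the rows that
  # share that row's plateNo, so the division is done on whole vectors.
  data[[norm_col]] <- data[[col]] / ave(data[[col]], data$plateNo)
}
```

The plate mean is computed once per plate and column here, whereas the nested loop recomputes it for every row, which is likely where most of the time goes on 300 000 rows.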