Hi again Joshua. I tried your function and I think it is what I need. It works well on the small example from my first post, but I am having trouble adapting it to my data. Here is another fake example, using my real script and the same kind of data (you can just copy and paste it to try):

# Build four fake station files, each with four sensors and a date column
ST1 <- data.frame(sensor1=rnorm(1:10),
                  sensor2=c(NA,NA,NA,NA,NA,rnorm(6:10)),
                  sensor3=c(1,NA,NA,4:10),
                  sensor4=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  date_time=(date()))
write.table(ST1,"ST1_2012.csv",sep=";",quote=FALSE, row.names = TRUE)

ST2 <- data.frame(sensor1=c(NA,NA,NA,NA,NA,6:10),
                  sensor2=rnorm(1:10),
                  sensor3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  sensor4=c(1,NA,NA,4:10),
                  date_time=(date()))
write.table(ST2,"ST2_2012.csv",sep=";",quote=FALSE, row.names = TRUE)

ST3 <- data.frame(sensor1=c(1,NA,NA,4:10),
                  sensor2=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  sensor3=rnorm(1:10),
                  sensor4=c(NA,NA,NA,NA,NA,6:10),
                  date_time=(date()))
write.table(ST3,"ST3_2012.csv",sep=";",quote=FALSE, row.names = TRUE)

ST4 <- data.frame(sensor1=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  sensor2=c(1,NA,NA,4:10),
                  sensor3=c(NA,NA,NA,NA,NA,6:10),
                  sensor4=rnorm(1:10),
                  date_time=(date()))
write.table(ST4,"ST4_2012.csv",sep=";",quote=FALSE, row.names = TRUE)

# Read the files back into an array: observations x sensors x stations
filenames <- list.files(pattern="_2012\\.csv$")
Sensors <- paste("sensor", 1:4,sep="")
Stations <- substr(filenames,1,3)
nsensors <- length(Sensors)
nstations <- length(Stations)
nobs <- nrow(read.table(filenames[1], header=TRUE,sep=";"))
yr2008 <- array(NA,dim=c(nobs, nsensors, nstations))
for(i in seq_len(nstations)){
  tmp <- read.table(filenames[i], header=TRUE, sep=";")
  yr2008[ , , i] <- as.matrix(tmp[, Sensors])
}
dimnames(yr2008) <- list(seq.int(nobs), Sensors, Stations)

# For each sensor, correlate the stations over rows 1 to 5
cor1_5 <- lapply(Sensors, function(s) cor(yr2008[1:5, s, ],use="pairwise.complete.obs"))
names(cor1_5) <- Sensors
cor1_5

For the moment, this computes correlations between the same sensor in each file (only for a part of my data), whatever the number of NAs or numeric values.

I want it to do the same, but with your function:

if (sum(!is.na(data[rows, ])) >= minpresent){ data } else {NULL} }

I want it to give me the same correlation matrices for each sensor between my files, but to calculate a correlation coefficient only if I have at least 3 numeric values (out of 5 in the example), and not regardless of how many numeric values there are (just 1 or 2, for example). If there are fewer than 3 numeric values (1 or 2), it should give NA for that correlation in the matrix. And if there are only NAs in the sensor data, it should do nothing with it (keep it and go on to the next sensor).

I tried to combine your function with mine but it doesn't work. I hope you've understood. Thanks for your help!
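
To make the request more concrete, here is a rough sketch of the behaviour I am after. It is not your function: cor_min and its argument names are made up for this illustration, and I am assuming that "at least 3 numeric values" means at least 3 rows where both stations have a value. The small matrix x is invented data, not my real data.

```r
# Hypothetical helper (name and arguments are made up for this sketch):
# pairwise correlations between the columns of m, kept only where at
# least `minpresent` rows have a value in BOTH columns, NA otherwise.
cor_min <- function(m, minpresent = 3) {
  m <- as.matrix(m)
  # 1 where a value is present, 0 where it is NA
  present <- (!is.na(m)) * 1
  # npairs[i, j] = number of rows where columns i and j are both present
  npairs <- crossprod(present)
  # Pairs with no complete rows already come back as NA here;
  # warnings about zero standard deviation are silenced.
  r <- suppressWarnings(cor(m, use = "pairwise.complete.obs"))
  # Blank out every coefficient based on too few paired values
  r[npairs < minpresent] <- NA
  r
}

# Invented example: 5 observations for 4 stations of one sensor
x <- cbind(ST1 = c(1, 2, 3, 4, 5),
           ST2 = c(2, NA, 6, 8, 11),
           ST3 = c(NA, NA, NA, 4, 5),
           ST4 = NA_real_)
res <- cor_min(x, minpresent = 3)
print(res)
```

In my script this would presumably replace the call to cor() inside the lapply(), i.e. something like cor_min(yr2008[1:5, s, ], minpresent = 3), but I have not managed to get that far.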