The nested for-loops could easily be moved to Rcpp, which should speed them up. Using apply functions instead of for-loops will not make it faster; they still have to do the same looping.
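Separately from the choice of loop construct, some of the per-pair work can be taken out of the inner loop. The following is only a sketch under assumptions: it uses `stringdist()` from the stringdist package as a stand-in distance (not `Func.dist` from TSMining, which may not accept a vector as its second argument), the helper names `rowStrings` and `hoisted_loop` are made up for illustration, and the data is random data of the same shape as in the original post. It builds each row's string once and lets the vectorised `stringdist()` fill the rest of a row in one call.

```r
library(stringdist)

set.seed(1)  # illustrative data only, same shape as in the original post
dataS <- data.frame(V1 = sample(1:4, 30, replace = TRUE),
                    V2 = sample(1:4, 30, replace = TRUE),
                    V3 = sample(1:4, 30, replace = TRUE),
                    V4 = sample(1:4, 30, replace = TRUE))

# Build each row's string once, instead of once per (i, j) pair
rowStrings <- do.call(paste0, dataS)

# Hypothetical helper: fills the upper triangle row by row
hoisted_loop <- function(strings) {
  n <- length(strings)
  m <- matrix(NA_real_, nrow = n, ncol = n)
  for (i in seq_len(n - 1)) {
    j <- (i + 1):n
    # stringdist() recycles its first argument over the second,
    # so a single call computes all distances for row i
    m[i, j] <- stringdist(strings[i], strings[j])
  }
  m
}

result <- hoisted_loop(rowStrings)
```

Whether this helps with `Func.dist` depends on whether that function can be called in a vectorised way, which I have not checked.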
At least, when I use `outer` to replace the loop I get roughly the same speed for the two versions, although the `outer` solution does iterate over the entire matrix and not just the upper triangle.

library(stringdist) # I don't have the TSMining library installed, so I tested with this instead

for_loop_test <- function() {
  matrixPrepared <- matrix(NA, nrow = nrow(dataS), ncol = nrow(dataS))
  for (i in 1:(nrow(dataS)-1)) {
    for (j in (1+i):nrow(dataS)) {
      matrixPrepared[i, j] <- stringdist(paste0(as.character(dataS[i,]), collapse=""),
                                         paste0(as.character(dataS[j,]), collapse=""))
    }
  }
  matrixPrepared
}

apply_test <- function() {
  # Computes the lower triangle; t() below flips it to the upper triangle
  get_dist <- function(i, j) {
    if (i <= j) NA
    else stringdist(paste0(as.character(dataS[i,]), collapse=""),
                    paste0(as.character(dataS[j,]), collapse=""))
  }
  get_dist <- Vectorize(get_dist)
  t(outer(1:nrow(dataS), 1:nrow(dataS), get_dist))
}

library(microbenchmark)
# Element-wise comparison that treats two NAs as equal
equivalent <- function(x, y) (is.na(x) & is.na(y)) | (!is.na(x) & !is.na(y) & x == y)
check <- function(values) all(equivalent(values[[1]], values[[2]]))
microbenchmark(for_loop_test(), apply_test(), check = check, times = 5)

Cheers
Thomas

On 18 August 2016 at 17:41:01, AURORA GONZALEZ VIDAL (aurora.gonzal...@um.es) wrote:

> Hello
>
> I have two for loops that I am trying to optimize... I looked for
> vectorization or for using some functions of the apply family but really
> cannot do it. I am writing my code with a small data set. With this
> size there is no problem, but sometimes I will have hundreds of rows, so it
> is really important to optimize the code. Any suggestion will be very
> welcome.
>
> library("TSMining")
> dataS = data.frame(V1 = sample(c(1,2,3,4),30,replace = T),
>                    V2 = sample(c(1,2,3,4),30,replace = T),
>                    V3 = sample(c(1,2,3,4),30,replace = T),
>                    V4 = sample(c(1,2,3,4),30,replace = T))
> saxM = Func.matrix(5)
> colnames(saxM) = 1:5
> rownames(saxM) = 1:5
> matrixPrepared = matrix(NA, nrow = nrow(dataS), ncol = nrow(dataS))
>
> for(i in 1:(nrow(dataS)-1)){
>   for(j in (1+i):nrow(dataS)){
>     matrixPrepared[i,j] = Func.dist(as.character(dataS[i,]),
>                                     as.character(dataS[j,]), saxM, n=60)
>   }
> }
> matrixPrepared
>
> Thank you!
>
> ------
> Aurora González Vidal
> Phd student in Data Analytics for Energy Efficiency
>
> Faculty of Computer Sciences
> University of Murcia
>
> @. aurora.gonzal...@um.es
> T. 868 88 7866
> www.um.es/ae
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.