Without more study, I can only give some general pointers. The as.vector() in X1 <- as.vector(coord[1]) is almost certainly not needed, and it adds a little to your execution time. Converting the output of func() to a one-row matrix is almost certainly not needed either. Just return c(res1, res2).
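A quick check of the as.vector() point, as a minimal sketch: the coord values below are made up for illustration and are not from the original post. Indexing a numeric vector already gives a length-one numeric vector, and the colon operator does not carry the element name over, so the sequences built inside func() come out the same either way.

```r
# Illustrative row; names mimic the matrix columns x1, y1, x2, y2
coord <- c(x1 = 10, y1 = 20, x2 = 30, y2 = 40)

x_plain <- coord[1]             # keeps the name "x1"
x_conv  <- as.vector(coord[1])  # name stripped, value unchanged

# Sequences built with and without as.vector()
seq_plain <- (x_plain - 3):(x_plain - 1)
seq_conv  <- (x_conv - 3):(x_conv - 1)

# Compare the two results
same_seq <- identical(seq_plain, seq_conv)
same_seq
```

The same reasoning applies to the return value: mean() of those sequences is an unnamed number, so c(res1, res2) is already a plain length-two vector.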
Your data frame appears to be entirely numeric, in which case you never need to use a data frame. Try

    apply( tab, 1, func, a=40, b=5, c=1 )

instead of the split()/map()/do.call() pipeline. (apply() returns one column per row of tab, so wrap the call in t() if you want the same layout as your rbind result.)

Your function can be redefined as

    func <- function(coord, a, b, c){
      # as.vector() kept from the original; per the note above it can be dropped
      X1 <- as.vector(coord[1])
      Y1 <- as.vector(coord[2])
      X2 <- as.vector(coord[3])
      Y2 <- as.vector(coord[4])
      res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))
      res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))
      # scale by b unless c is 0; a plain vector is returned either way
      if (c==0) c(res1, res2) else c(res1, res2)*b
    }

I suspect you can operate on the entire matrix without looping (which both the apply() method and the split/rbind method do, in effect), and if so it will be much faster. But I can't say for sure without more study.

-- 
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509

On 11/1/18, 12:35 PM, "R-help on behalf of Nelly Reduan" <r-help-boun...@r-project.org on behalf of nell.r...@hotmail.fr> wrote:

Hello,

I have an input data frame with multiple rows. For each row, I want to apply a function. The input data frame has 1,000,000+ rows. How can I speed up my code? I would like to keep the function "func".
Here is a reproducible example with a simple function:

    library(tictoc)
    library(dplyr)
    library(purrr)  # provides map(), which the pipeline below uses

    func <- function(coord, a, b, c){
      X1 <- as.vector(coord[1])
      Y1 <- as.vector(coord[2])
      X2 <- as.vector(coord[3])
      Y2 <- as.vector(coord[4])
      if(c == 0) {
        res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))
        res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))
        res <- matrix(c(res1, res2), ncol=2, nrow=1)
      } else {
        res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))*b
        res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))*b
        res <- matrix(c(res1, res2), ncol=2, nrow=1)
      }
      return(res)
    }

    ## Apply the function
    set.seed(1)
    n = 10000000
    tab <- as.matrix(data.frame(x1 = sample(1:100, n, replace = TRUE),
                                y1 = sample(1:100, n, replace = TRUE),
                                x2 = sample(1:100, n, replace = TRUE),
                                y2 = sample(1:100, n, replace = TRUE)))

    tic("test 1")
    test <- tab %>%
      split(1:nrow(tab)) %>%
      map(~ func(.x, 40, 5, 1)) %>%
      do.call("rbind", .)
    toc()

    test 1: 599.2 sec elapsed

Thanks very much for your time
Have a nice day
Nell

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
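A minimal sketch of the "operate on the entire matrix" idea from the reply above. It is one possible reading of that suggestion, not a tested replacement for func(). It assumes a is a whole number of at least 1 and that the hard-coded 40 in func() is intentional. The names func_vec, pair_mean and k are introduced here for illustration only. Each colon sequence in func() is a run of consecutive integers, so its sum has a closed form and the mean can be computed for all rows at once.

```r
# Reference: the per-row function from the thread, returning a plain vector
func <- function(coord, a, b, c) {
  res1 <- mean(c((coord[1] - a):(coord[1] - 1), (coord[2] + 1):(coord[2] + 40)))
  res2 <- mean(c((coord[3] - a):(coord[3] - 1), (coord[4] + 1):(coord[4] + 40)))
  if (c == 0) c(res1, res2) else c(res1, res2) * b
}

# Hypothetical vectorised variant (illustrative names, not from the post).
# Assumes a >= 1 is a whole number; k mirrors the hard-coded 40 in func().
func_vec <- function(tab, a, b, c, k = 40) {
  pair_mean <- function(x, y) {
    # sum((x - a):(x - 1)) = a * x - a * (a + 1) / 2   (a terms)
    # sum((y + 1):(y + k)) = k * y + k * (k + 1) / 2   (k terms)
    (a * x - a * (a + 1) / 2 + k * y + k * (k + 1) / 2) / (a + k)
  }
  res <- cbind(pair_mean(tab[, 1], tab[, 2]),
               pair_mean(tab[, 3], tab[, 4]))
  if (c == 0) res else res * b
}

# Small sample so the row-by-row comparison runs quickly
set.seed(1)
n <- 1000
tab <- cbind(x1 = sample(1:100, n, replace = TRUE),
             y1 = sample(1:100, n, replace = TRUE),
             x2 = sample(1:100, n, replace = TRUE),
             y2 = sample(1:100, n, replace = TRUE))

# apply() returns one column per input row, hence the t()
looped     <- t(apply(tab, 1, func, a = 40, b = 5, c = 1))
vectorised <- func_vec(tab, a = 40, b = 5, c = 1)

# Compare the two results, ignoring dimnames
all.equal(looped, vectorised, check.attributes = FALSE)
```

If keeping func() unchanged is a hard requirement, the t(apply(...)) line is the smaller change. The closed-form version only helps when the body of the real function can be written as arithmetic on whole columns.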