Is the slowdown happening while mclapply runs or while you're doing
the rbind? If the latter, I wonder if the code below is more efficient
than using rbind inside a loop:

my_df <- do.call(rbind, my_list_from_mclapply)
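A quick illustration of the difference (the list here is synthetic, standing in for what mclapply would return): growing a data frame with rbind() inside a loop re-copies the accumulated result on every iteration, so it gets slower as it goes, while a single do.call(rbind, ...) combines everything in one pass.

```r
# Build a list of small data frames with identical column names,
# standing in for the list returned by mclapply().
my_list <- lapply(1:500, function(i) data.frame(id = i, x = sqrt(i)))

# Slow pattern: rbind() inside a loop copies the growing result each time.
slow_df <- my_list[[1]]
for (i in 2:length(my_list)) {
  slow_df <- rbind(slow_df, my_list[[i]])
}

# Faster pattern: one rbind over the whole list at once.
fast_df <- do.call(rbind, my_list)

identical(slow_df, fast_df)
```

Both give the same result; only the loop version pays the repeated-copy cost.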



On Wed, Jun 29, 2011 at 3:34 PM, Vincent Aubanel <[email protected]> wrote:
> Hi all,
>
> I'm using mclapply() of the multicore package for processing chunks of data 
> in parallel --and it works great.
>
> But when I want to collect all processed elements of the returned list into 
> one big data frame it takes ages.
>
> The elements are all data frames having identical column names, and I'm using 
> a simple rbind() inside a loop to do that. But I guess it performs some 
> expensive checks at each iteration, as it gets slower and 
> slower as it goes. Writing the individual files out to disk, concatenating them 
> with the system, and reading the resulting file back in is actually faster...
>
> Is there a magic argument to rbind() that I'm missing, or is there any other 
> solution to collect the results of parallel processing efficiently?
>
> Thanks,
> Vincent
>
> _______________________________________________
> R-SIG-Mac mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>
