Hello Anna,

The speed of parallel computing depends on many factors. To avoid potential confounders, please use the code below for timing (assuming you still have all the variables from your example):
```
parallel_param <- SnowParam(
  workers = ncores, type = "SOCK", tasks = length(my_list),
  exportglobals = FALSE, exportvariables = FALSE
)
# Start the workers before timing so that only the bplapply() call is measured
bpstart(parallel_param)
system.time({
  res2 <- bplapply(my_list, FUN, BPPARAM = parallel_param)
})
bpstop(parallel_param)
```

Also, I encourage you to submit your question, along with a reproducible example, as a GitHub issue here:
https://github.com/Bioconductor/BiocParallel/issues
It will help us manage the discussion and pinpoint the problem.

Thanks

Best,
Jiefei

On Tue, Aug 8, 2023 at 8:21 AM Anna Plaxienko <a...@plaxienko.com> wrote:

> My motivation for using distributed memory was that my package is also
> accessible on Windows. Is it better to use shared memory as default but
> check the user's system and then switch to socket only if necessary?
>
> Regarding the real data: I have 68 samples (rows) of methylation EPIC array
> data (850K columns) that I split by chromosome. So I get 22 matrices,
> each with 10K to 80K columns – that's why I need load balancing. When I use
> *clusterApplyLB*, the running time of my method is 38 minutes. With
> *bplapply* it's 42 minutes. In other examples the difference is the same
> 10-15%. It's of course not dramatic; if you've already waited 38 minutes,
> you can wait an extra 4 :) But I'm just curious as to why, and whether it's
> something I can fix.
>
> On Tue, Aug 8, 2023 at 15:04, Waldir Leoncio Netto <w.l.ne...@medisin.uio.no> wrote:
>
> > Dear Anna,
> >
> > According to the documentation of "BiocParallelParam", SnowParam() is a
> > subclass suitable for distributed memory (e.g. cluster) computing. If
> > you're running your code on a simpler machine with shared memory (e.g.
> > your PC), you're probably better off using MulticoreParam() instead.
> > Here's a modified example based on yours:
> >
> > # Setup
> > library(parallel)
> > library(BiocParallel)
> > my_list <- list(1:10, 11:20, 21:30, 31:40, 41:50, 51:60, 61:70, 71:80,
> >                 81:90)
> > FUN <- function(x) return(x ^ 10)
> > ncores <- min(detectCores() - 1L, 10L)
> >
> > # Parallel
> > cl <- makeCluster(ncores)
> > print(system.time(res <- clusterApplyLB(cl, my_list, FUN)))
> > stopCluster(cl)
> >
> > # BiocParallel
> > parallel_param_1 <- SnowParam(workers = ncores, tasks = length(my_list))
> > print(system.time(res2 <- bplapply(my_list, FUN, BPPARAM = parallel_param_1)))
> > parallel_param_2 <- MulticoreParam(workers = ncores, tasks = length(my_list))
> > print(system.time(res3 <- bplapply(my_list, FUN, BPPARAM = parallel_param_2)))
> >
> > On my machine, the output is as follows (notice that the last column, with
> > the total elapsed time, shows MulticoreParam() performing better than
> > parallel):
> >
> >    user  system elapsed
> >   0.000   0.004   0.088
> >    user  system elapsed
> >   0.114   0.001   1.336
> >    user  system elapsed
> >   0.074   0.124   0.060
> >
> > How does that work on your actual data?
> >
> > Best,
> > Waldir
> >
> > On Tue, 08.08.2023 at 13:10 +0200, Anna Plaxienko wrote:
> >
> > Hi all!
> >
> > I'm switching from the base R *parallel* package to *BiocParallel* for my
> > Bioconductor submission and I have two questions. First, I wanted advice
> > on whether I've implemented load balancing correctly. Second, I've noticed
> > that the running time is about 15% longer with BiocParallel. Any ideas
> > why?
> >
> > Parallel code
> >
> > cl <- makeCluster(ncores)
> > res <- clusterApplyLB(cl, my_list, FUN)
> > stopCluster(cl)
> >
> > BiocParallel
> >
> > parallel_param <- SnowParam(workers = ncores, type = "SOCK",
> >                             tasks = length(my_list))
> > res2 <- bplapply(my_list, FUN, BPPARAM = parallel_param)
> >
> > Thank you!
> >
> > Best regards,
> > Anna Plaksienko
> >
> > _______________________________________________
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
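
On the question raised in the thread about defaulting to shared memory and switching to sockets only where necessary: a minimal sketch, assuming the operating system is the only thing that needs checking. `choose_param()` is a hypothetical helper, not part of BiocParallel, and the three-element `my_list` and the two workers are placeholders for illustration.

```r
library(BiocParallel)

# Hypothetical helper (not a BiocParallel function): pick a back-end by
# platform. MulticoreParam() relies on forking, which Windows does not
# support, so a socket-based SnowParam() is used there instead.
choose_param <- function(workers, n_tasks) {
  if (.Platform$OS.type == "windows") {
    SnowParam(workers = workers, type = "SOCK", tasks = n_tasks)
  } else {
    MulticoreParam(workers = workers, tasks = n_tasks)
  }
}

# Placeholder inputs standing in for the real per-chromosome matrices
my_list <- list(1:10, 11:20, 21:30)
FUN <- function(x) x^10

# One task per list element, as in the examples in the thread
param <- choose_param(workers = 2L, n_tasks = length(my_list))
res <- bplapply(my_list, FUN, BPPARAM = param)
```

Passing `tasks = length(my_list)` in both branches keeps the one-task-per-element dispatch that the thread uses for load balancing, whichever back-end is selected.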