As a follow-up to this, I have been able to put together a reproducible toy example that shows the same problem. Below is just a sample to represent the issue; my data and the subsequent functions acting on the data are much more involved.
I no longer get the error, but the loop running in parallel is extremely slow relative to its serialized counterpart. I have narrowed the problem down to the fact that I am searching through a very large list, subsetting it by index to grab the data, and then doing stuff to it. Both versions "work", but the parallel version is very, very slow. I believe I am sending the data to each core and that the number of searches happening is prohibitive.

I am very much stuck in the design I would use for this particular problem on a single core, and I am not sure whether there is a better design for the parallel version. Any advice on better ways to work with the %dopar% version here?

N <- 200000
myList <- vector('list', N)
names(myList) <- 1:N
for(i in 1:N){
    myList[[i]] <- rnorm(100)
}
nms <- 1:N

library(foreach)
library(doParallel)
registerDoParallel(cores=7)

# Serial version
result <- foreach(i = 1:3) %do% {
    dat <- myList[[which(names(myList) == nms[i])]]
    mean(dat)
}

# Parallel version (very slow)
result <- foreach(i = 1:3) %dopar% {
    dat <- myList[[which(names(myList) == nms[i])]]
    mean(dat)
}

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold
Sent: Saturday, December 03, 2016 4:26 PM
To: r-help@r-project.org
Subject: [R] error serialize (foreach)

I have a portion of a foreach loop that I cannot run in parallel but that works fine when serialized. Below is a representation of the problem; in this instance I cannot provide reproducible data that generate the same error, because the actual data I am working with are confidential.

Within each foreach loop is a series of custom functions acting on my data. When using %do% I get the expected result, but replacing it with %dopar% generates the error.

I have searched the archives and also stackexchange and see that this is an issue that arises. I have tried a couple of the recommendations, like using an outfile in makeCluster, but I am not having success.
Oddly (or perhaps not oddly), other portions of my program run in parallel and do not generate this same error.

library(foreach)
library(doParallel)
registerDoParallel(cores=3)

# This portion runs and produces expected result
result <- foreach(i = 1:N) %do% {
    tmp1 <- function1(...)
    tmp2 <- function2(...)
    tmp2
}

# This portion generates error in serialize
result <- foreach(i = 1:N) %dopar% {
    tmp1 <- function1(...)
    tmp2 <- function2(...)
    tmp2
}

error in serialize(data, node$con) : error writing to connection

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
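
One direction that might help with the slow %dopar% loop in the toy example at the top of this thread, offered as a minimal sketch rather than a tested fix. In that loop, every iteration compares all N names against nms[i], and the loop body refers to the whole of myList, so on backends that copy referenced objects to the workers the entire list may be shipped along. The sketch below instead selects the needed elements by name once, on the master, and lets foreach iterate over those elements directly, so each task receives a single numeric vector. The smaller N, the two cores, and the name `wanted` are assumptions made only so the example runs quickly; whether this helps with the real data depends on the backend and on how heavy the per-element work is.

```r
library(foreach)
library(doParallel)

# Assumption: a much smaller N than in the post, so the sketch runs quickly.
N <- 2000
myList <- lapply(seq_len(N), function(i) rnorm(100))
names(myList) <- seq_len(N)
nms <- seq_len(N)

# Assumption: two workers; adjust to the machine at hand.
registerDoParallel(cores = 2)

# Select the elements once, by name, before the loop.
# 'wanted' is a hypothetical name for the subset to be processed.
wanted <- myList[as.character(nms[1:3])]

# Iterate over the elements themselves instead of over an index, so the
# loop body no longer refers to the full list.
result <- foreach(dat = wanted) %dopar% {
    mean(dat)
}

# Release any implicitly created cluster (harmless if none was created).
stopImplicitCluster()
```

If the per-element work is as light as mean(), the overhead of dispatching tasks can still outweigh any gain, so timing both %do% and %dopar% on a realistic subset would be a sensible first check.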