The R high-performance computing SIG mailing list (r-sig-hpc) might be a useful place to ask some of these questions: https://stat.ethz.ch/mailman/listinfo/r-sig-hpc

Dan

Daniel Nordlund
Bothell, WA USA

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Jeff Newmiller
> Sent: Monday, August 19, 2013 4:19 PM
> To: Patrick Connolly
> Cc: r-help@R-project.org; Hopkins,Bill
> Subject: Re: [R] Appropriateness of R functions for multicore
>
> I don't know... I suppose it depends how it fails. I recommend that you
> restrict yourself to using only the data that was passed as parameters to
> your parallel function. You may be able to tackle parts of the task and
> return only those partial results to confirm how far through the code you
> can get.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Patrick Connolly <p_conno...@slingshot.co.nz> wrote:
> >On Sat, 17-Aug-2013 at 05:09PM -0700, Jeff Newmiller wrote:
> >
> >|> In most threaded multitasking environments it is not safe to
> >|> perform IO in multiple threads. In general you will have difficulty
> >|> performing IO in parallel processing so it is best to let the
> >|> master hand out data to worker tasks and gather results from them
> >|> for storage. Keep in mind that just because you have eight cores
> >|> for processing doesn't mean you have eight hard disks, so if your
> >|> problem is IO bound in single processor operation then it will also
> >|> be IO bound in threaded operation.
> >
> >For tasks which don't involve I/O but fail with mclapply, how does one
> >work out where the problem is? The handy browser() function which
> >allows for interactive diagnosis won't work with parallel jobs.
> >
> >What other approaches can one use?
> >
> >Thanx
> >
> >|> "Hopkins, Bill" <bill.hopk...@level3.com> wrote:
> >|> >Has there been any systematic evaluation of which core R functions are
> >|> >safe for use with multicore? Of current interest, I have tried calling
> >|> >read.table() via mclapply() to more quickly read in hundreds of raw
> >|> >data files (I have a 24 core system with 72 GB running Ubuntu, a
> >|> >perfect platform for multicore). There was a 40% failure rate, which
> >|> >doesn't occur when I invoke read.table() serially from within a single
> >|> >thread. Another example was using pvec() to invoke
> >|> >sapply(strsplit(),...) on a huge character vector (to pull out fields
> >|> >from within a field). It looked like a perfect application for pvec(),
> >|> >but it fails when serial execution works.
> >|> >
> >|> >I thought I'd ask before taking on the task of digging into the
> >|> >underlying code to see what might be causing failure in a multicore
> >|> >(well, multi-threaded) context.
> >|> >
> >|> >As an alternative, I could define multiple cluster nodes locally, but
> >|> >that shifts the tradeoff a bit in whether parallel execution is
> >|> >advantageous - the overhead is significantly more, and even with 72 GB,
> >|> >it does impose greater limits on how many cores can be used.
> >|> >
> >|> >Bill Hopkins
> >|> >
> >|> >______________________________________________
> >|> >R-help@r-project.org mailing list
> >|> >https://stat.ethz.ch/mailman/listinfo/r-help
> >|> >PLEASE do read the posting guide
> >|> >http://www.R-project.org/posting-guide.html
> >|> >and provide commented, minimal, self-contained, reproducible code.
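
The debugging question in this thread (how to find out where an mclapply job fails when browser() is unavailable) can be approached roughly as follows. This is only a minimal sketch, not anyone's actual code from the thread: extract_value(), safe_worker() and the sample inputs are hypothetical names made up for illustration. It follows the advice above in that the master creates the data, each worker uses only what it is passed, and each worker hands back either its result or the error message it hit.

```r
library(parallel)

# Hypothetical worker: pulls the value out of a "key=value" string.
# It deliberately fails on NA input so there is something to diagnose.
extract_value <- function(x) {
  if (is.na(x)) stop("missing input")
  strsplit(x, "=", fixed = TRUE)[[1]][2]
}

# Wrap the worker so every job returns either its result or the message
# of the error it raised, rather than failing out of sight in a child.
safe_worker <- function(x) {
  tryCatch(
    list(ok = TRUE, value = extract_value(x)),
    error = function(e) list(ok = FALSE, value = conditionMessage(e))
  )
}

# Data is created in the master and passed to the workers as arguments.
inputs <- c("a=1", "b=2", NA, "d=4")

# Forking is not available on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
results <- mclapply(inputs, safe_worker, mc.cores = n_cores)

# Back in the master: work out which jobs failed and why.
ok <- vapply(results, function(r) r$ok, logical(1))
failed <- which(!ok)
data.frame(
  job = failed,
  message = vapply(results[failed], function(r) r$value, character(1))
)
```

Once the failing job indices are known, the same inputs can be rerun serially with lapply(), where browser() works again.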