(I neglected to use reply-all.)

---------- Forwarded message ----------
From: David Romano <drom...@stanford.edu>
Date: Sat, May 4, 2013 at 11:25 AM
Subject: Re: [R] how to parallelize 'apply' across multiple cores on a Mac
To: Charles Berry <ccbe...@ucsd.edu>

On Sat, May 4, 2013 at 9:32 AM, Charles Berry <ccbe...@ucsd.edu> wrote:
> David,
>
> If you insist on explicitly parallelizing this:
>
> The functions in the recommended package 'parallel' work on a Mac.
>
> I would not try to work on each tiny column as a separate function call -
> too much overhead if you parallelize - instead, bundle up 100-1000 columns
> to operate on.
>
> The calcs you describe sound simple enough that I would just write
> them in C and use the .Call interface to invoke them. You only need enough
> working memory in C to operate on one column and space to save the result.
>
> So a MacBook with 8GB of memory will handle it with room to breathe.
>
> This is a good use case for the 'inline' package, especially if you are
> unfamiliar with the use of .Call.
>
> ===
>
> But it might be as fast to forget about parallelizing this (explicitly).
> [detailed recommendations deleted]
>
> On a Mac, the vecLib BLAS will do crossprod using the multiple
> cores without your needing to do anything special. So you can forget about
> 'parallel', 'multicore', etc.
>
> So your remaining problem is to reread steps 2-6 and figure out what
> 'minimal.matrix' and 'fill.rows' have to be.
>
> ===
>
> You can also approach this problem using 'filter', but that can get
> 'convoluted' (pun intended - see ?filter).
>
> HTH,

Thanks, Charles, for all the helpful pointers! For the moment, I'll leave
parallelization aside, and will explore using 'crossprod' and 'filter'.

Your suggestion that 8 GB of memory should be sufficient if I went the
parallel route also makes me wonder whether I'm suffering not just from
inefficient use of computing resources, but from a memory leak as well: the
original 'apply' code would, in much less than a minute, take over the full
18 GB of memory available on my workstation, and then leave it functioning
at a crawl for at least a half hour or so.

I'll ask about this by reposting this message under a different subject,
so no need to address it in this thread.

Thanks again,
David

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
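
A minimal sketch of the chunking advice quoted in this message (bundle up
100-1000 columns per function call instead of parallelizing over single
columns). It is only an illustration under assumptions: the actual
per-column calculation is not shown in this thread, so 'col_summary' (a sum
of squares per column) is a hypothetical stand-in, and the helper names
'chunk_columns' and 'chunked_apply', the chunk size, and the core count are
likewise made up for the example. 'mclapply' comes from the recommended
'parallel' package and forks worker processes, so it runs in parallel on a
Mac or Linux but not on Windows.

```r
library(parallel)

# Hypothetical stand-in for the real per-column calculation (not given in
# the thread): takes a block of columns, returns one number per column.
col_summary <- function(block) colSums(block^2)

# Split the column indices 1..n_cols into contiguous chunks holding at most
# 'chunk_size' columns each.
chunk_columns <- function(n_cols, chunk_size) {
  idx <- seq_len(n_cols)
  split(idx, ceiling(idx / chunk_size))
}

# Apply 'fun' to each bundle of columns in a forked worker, then join the
# per-chunk results back into one vector in the original column order.
chunked_apply <- function(x, fun, chunk_size = 500L, cores = 2L) {
  chunks <- chunk_columns(ncol(x), chunk_size)
  pieces <- mclapply(chunks,
                     function(idx) fun(x[, idx, drop = FALSE]),
                     mc.cores = cores)
  unlist(pieces, use.names = FALSE)
}

# Small made-up example: 20 rows, 2300 columns.
set.seed(1)
x <- matrix(rnorm(20 * 2300), nrow = 20)
res <- chunked_apply(x, col_summary, chunk_size = 500L, cores = 2L)
```

With 2300 columns and a chunk size of 500, the workers receive five bundles
rather than 2300 single columns, so the per-call overhead is paid five
times. Whether this beats a plain vectorized or BLAS-backed calculation
would have to be timed on the real data.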