Hi Rainer,

Rainer M Krug <rai...@krugs.de> writes:

> "Charles C. Berry" <ccbe...@ucsd.edu> writes:
>
>> On Wed, 17 Jun 2015, William Denton wrote:
>>
>>> On 17 June 2015, Xebar Saram wrote:
>>>
>>>> I do alot of modeling work that involves using huge datasets and run
>>>> process intensive R processes (such as complex mixed models, Gamms etc). in
>>>> R studio all works well yet when i use the orgmode eval on R code blocks it
>>>> works well for small simple process but 90% of the time when dealing with
>>>> complex models and bug data (up to 256GB) it will just freeze emacs/ess.
>>>> sometimes i can C-c or C-g it and other times i need to physically kill
>>>> emacs.
>>>
>>> I've been having the same problem for a while, but wasn't able to
>>> isolate it any more than large data sets, lack of memory, and heavy
>>> CPU usage. Sometimes everything hangs and I need to power cycle the
>>> computer. :(
>>>
>>
>> And you (both) have `ess-eval-visibly' set to nil, right?
>>
>> I do statistical genomics, which can be compute intensive. Sometimes
>> processes need to run for a while, and I get impatient having to wait.
>>
>> I wrote (and use) ox-ravel[1] to speed up my write-run-revise cycle in
>> org-mode.
>>
>> Basically, ravel will export Org mode to a format that knitr (and the
>> like) can run - turning src blocks into `code chunks'. That allows me
>> to set the cache=TRUE chunk option, etc. I run knitr on the exported
>> document to initialize objects for long running computations or to
>> produce a finished report.
>>
>> When I start a session, I run knitr in the R session, then all the
>> cached objects are loaded in and ready to use.
>>
>> If I write a src block I know will take a long time to export, I
>> export from org mode to update the knitr document and re-knit it to
>> refresh the cache.
>
> I have a similar workflow, only that I use a package like
> approach, i.e. I tangle function definitions in a folder ./R, data into
> ./data (which makes it possible to share org defined variables with R
> running outside org) and scripts, i.e. the things which do a analysis,
> import data, ... i.e. which might take long, into a folder ./scripts/. I
> then add the usual R package infrastructure files (DESCRIPTION,
> NAMESPACE, ...).
> Then I have one file tangled into ./scripts/init.R:
>
> #+begin_src R :tangle ./scripts/init.R
> library(devtools)
> load_all()
> #+end_src
>
>
> and one for the analysis:
>
> #+begin_src R :tangle ./scripts/myAnalysis.R
> ## Do some really time intensive and horribly complicated and important
> ## stuff here
> save(
>   fileNames,
>   bw,
>   cols,
>   labels,
>   fit,
>   dens,
>   gof,
>   gofPerProf,
>   file = "./cache/results.myAnalysis.rds"
> )
> #+end_src
>
>
> Now after tangling, I have my code easily available in a new R session:
>
> 1) start R in the directory in which the DESCRIPTION file is,
> 2) run source("./scripts/init.R")
>
> and I have all my functions and data available.
>
> To run a analysis, I do
>
> 3) source("./scripts/myAnalysis.R")
>
> and the results are saved in a file fn
>
> To analyse the data further, I can then simply use
>
> #+begin_src R :tangle ./scripts/myAnalysis.R
> fitSing <- attach("./cache/results.myAnalysis.rds")
> #+end_src
>
>
> so they won't interfere with my environment in R.
>
> I can finally remove the attached environment by doing
>
> #+begin_src R :tangle ./scripts/myAnalysis.R
> detach(
>   name = attr(fitSing, "name"),
>   character.only = TRUE
> )
> #+end_src
>
> Through these caching and compartmentalizing, I can easily do some
> things outside org and some inside, and easily combine all the data.
>
> Further advantage: I can actually create the package and send it to
> somebody for testing and review and it should run out of the box, as in
> the DESCRIPTION file all dependencies are defined.
>
> I am using this approach at the moment for a paper and which will also
> result in a paper. By executing all the scripts, one will be able to do
> import the raw data, do the analysis and create all graphs used in the
> paper.
>
> Hope this gives you another idea how one can handle long running
> analysis in R in org,
>
> Cheers,
>
> Rainer
>

That is a cool workflow. I especially like the fact that you end up with
an R package.

So, I'll try again: is there any chance of seeing a working example of
this? I'd love to see that.

Thanks,

Andreas
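
For anyone reading along, the save/attach/detach caching step quoted
above can be tried in isolation. The following is only a minimal sketch
of that one step, not Rainer's actual setup: the cache location (a file
in tempdir()), the helper run_analysis() and the toy model are all
made up for illustration.

```r
## Minimal sketch of the save()/attach()/detach() caching pattern.
## ASSUMPTIONS: cache path, run_analysis() and the toy data are
## hypothetical stand-ins, not taken from the original workflow.
cache_file <- file.path(tempdir(), "results.myAnalysis.RData")

## Stand-in for the long-running analysis. It runs inside a function,
## so its objects do not leak into the global workspace.
run_analysis <- function(file) {
  x <- seq_len(10)
  fit <- lm(y ~ x, data = data.frame(x = x, y = 2 * x + 1))
  labels <- c("a", "b")
  save(fit, labels, file = file)   # write the results to the cache
  invisible(file)
}

run_analysis(cache_file)

## attach() places the saved objects on the search path in their own
## environment instead of copying them into the global workspace.
cached <- attach(cache_file)
print(ls(cached))    # names of the cached objects
print(coef(fit))     # 'fit' is found via the search path

## Remove the attached environment again when done.
detach(name = attr(cached, "name"), character.only = TRUE)
```

Note that save() writes R's workspace format whatever the file
extension is, which is why attach() can read the file; readRDS() only
reads files written by saveRDS().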