Thanks, all! I'll try these out. I'm trying to work up something that is platform independent (if possible) for use with mmap. I'll do some tests on these suggestions and see which works best. I'll try to report back in a few days. Cheers!
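For reference, a portable baseline that needs no extra packages is to stream zero bytes to the file in fixed-size chunks, so the full object never has to be held in memory. The sketch below is only that: `make_zero_file` and its arguments are names made up for illustration, it assumes 8-byte doubles, and it writes every byte rather than creating a sparse file, so it will likely be slower than filesystem-level tricks on very large files.

```r
# Hypothetical helper (not from this thread): create a zero-filled file
# holding `n_values` elements of `bytes_per_value` bytes each, written in
# chunks of at most `chunk_bytes` bytes.
make_zero_file <- function(path, n_values, bytes_per_value = 8,
                           chunk_bytes = 2^20) {
  total <- n_values * bytes_per_value   # total size in bytes (kept as double)
  con <- file(path, open = "wb")
  on.exit(close(con))                   # close the connection even on error
  zeros <- raw(chunk_bytes)             # raw() vectors are initialised to 0x00
  remaining <- total
  while (remaining > 0) {
    n <- min(remaining, chunk_bytes)
    # reuse the full-size buffer; allocate a shorter one only for the tail
    writeBin(if (n == chunk_bytes) zeros else raw(n), con)
    remaining <- remaining - n
  }
  invisible(path)
}

# Example from the original question: a blank file of 10,000 doubles
path <- tempfile(fileext = ".bin")
make_zero_file(path, n_values = 10000)
file.info(path)$size
```

Because the all-zero bit pattern is 0.0 for IEEE doubles, the resulting file can be read back (or mapped) as a vector of zeros.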
--j

2012/5/3 "Jens Oehlschlägel" <jens.oehlschlae...@truecluster.com>

> Jonathan,
>
> On some filesystems (e.g. NTFS, see below) it is possible to create
> 'sparse' memory-mapped files, i.e. reserving the space without the cost of
> actually writing initial values.
> Package 'ff' does this automatically and also allows the file to be
> accessed in parallel. Check the example below and see how creating a big
> file is immediate.
>
> Jens Oehlschlägel
>
>
> > library(ff)
> > library(snowfall)
> > ncpus <- 2
> > n <- 1e8
> > system.time(
> + x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
> + )
>    user  system elapsed
>    0.01    0.00    0.02
> > # check finalizer, with an explicit filename we should have a 'close' finalizer
> > finalizer(x)
> [1] "close"
> > # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
> > finalizer(x) <- "close"
> > sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
> R Version: R version 2.15.0 (2012-03-30)
>
> snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
>
> > sfLibrary(ff)
> Library ff loaded.
> Library ff loaded in cluster.
>
> Warning message:
> In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
>   'keep.source' is deprecated and will be ignored
> > sfExport("x") # note: do not export the same ff multiple times
> > # explicitly opening avoids a gc problem
> > # opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower but
> > # prevents OS write storms when the file is larger than RAM
> > sfClusterEval(open(x, caching="mmeachflush"))
> [[1]]
> [1] TRUE
>
> [[2]]
> [1] TRUE
>
> > system.time(
> + sfLapply( chunk(x, length=ncpus), function(i){
> +   x[i] <- runif(sum(i))
> +   invisible()
> + })
> + )
>    user  system elapsed
>    0.00    0.00   30.78
> > system.time(
> + s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05, 0.95)) )
> + )
>    user  system elapsed
>    0.00    0.00    4.38
> > # for completeness
> > sfClusterEval(close(x))
> [[1]]
> [1] TRUE
>
> [[2]]
> [1] TRUE
>
> > csummary(s)
>              5%  95%
> Min.    0.04998 0.95
> 1st Qu. 0.04999 0.95
> Median  0.05001 0.95
> Mean    0.05001 0.95
> 3rd Qu. 0.05002 0.95
> Max.    0.05003 0.95
> > # stop slaves
> > sfStop()
>
> Stopping cluster
>
> > # with the close finalizer we are responsible for deleting the file
> > # explicitly (unless we want to keep it)
> > delete(x)
> [1] TRUE
> > # remove r-side metadata
> > rm(x)
> > # truly free memory
> > gc()
>
>
> *Sent:* Thursday, 03 May 2012 at 00:23
> *From:* "Jonathan Greenberg" <j...@illinois.edu>
> *To:* r-help <r-help@r-project.org>, r-sig-...@r-project.org
> *Subject:* [R-sig-hpc] Quickest way to make a large "empty" file on disk?
>
> R-helpers:
>
> What would be the absolute fastest way to make a large "empty" file (e.g.
> filled with all zeroes) on disk, given a byte size and a given number
> of empty values? I know I can use writeBin, but the "object" in
> this case may be far too large to store in main memory. I'm asking because
> I'm going to use this file in conjunction with mmap to do parallel writes
> to this file. Say, I want to create a blank file of 10,000 floating point
> numbers.
>
> Thanks!
>
> --j
>
> --
> Jonathan A. Greenberg, PhD
> Assistant Professor
> Department of Geography and Geographic Information Science
> University of Illinois at Urbana-Champaign
> 607 South Mathews Avenue, MC 150
> Urbana, IL 61801
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
> http://www.geog.illinois.edu/people/JonathanGreenberg.html
>
> _______________________________________________
> R-sig-hpc mailing list
> r-sig-...@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc