Just a clarification: I am using MonetDBLite for this.

2016-07-11 11:28 GMT-03:00 Lucas Ferreira Mation <lucasmat...@gmail.com>:
> I am writing a package that imports most of the Brazilian
> socio-economic microdatasets (microdadosBrasil
> <https://github.com/lucasmation/microdadosBrasil>). The idea of the
> package is that the data import is very simple, so even users with
> very little R programming knowledge can use the data easily. Although
> I would like to have decent performance, the first concern is
> usability.
>
> The package imports data into an in-memory data.table object. I am
> now trying to implement support for out-of-memory datasets using
> MonetDBLite.
>
> Is there a (non-OS-dependent) way to predict whether a dataset will
> fit into memory or not? Ideally the package would ask the computer
> for the maximum amount of RAM that R can use. The package would then
> default to MonetDBLite if the available RAM was smaller than 3x the
> in-memory size of the dataset.
>
> There will also be an argument for the user to choose whether to use
> in-RAM or out-of-RAM storage, but if that argument is not provided
> the package would choose for him.
>
> In any case, does that seem reasonable? Or should I force the user to
> be aware of this?
>
> Another option would be to default to MonetDB (unless the user
> explicitly asks for in-memory data). Is MonetDB performance so good
> that it would not make much of a difference?
>
> Another disadvantage of the MonetDB default is that the user will not
> be able to run base-R data manipulation commands. So he will have to
> use dplyr (which is great and simple) or SQL queries (which few
> people will know).
>
> regards
> Lucas
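
For the fit-in-memory question, a rough sketch of one possible check.
This is untested: the 'ps' package (queried here because it reports
available RAM cross-platform) is an assumed dependency, and the names
estimate_inmemory_bytes() and fits_in_ram() are hypothetical.

library(data.table)

# Estimate the in-memory size of a delimited file by reading a small
# sample, measuring its footprint per row, and scaling by an estimate
# of the total row count derived from the raw file size.
estimate_inmemory_bytes <- function(path, sample_rows = 1000L) {
  sample <- fread(path, nrows = sample_rows)
  mem_per_row <- as.numeric(object.size(sample)) / nrow(sample)
  lines <- readLines(path, n = sample_rows + 1L)    # +1 for the header
  raw_per_row <- mean(nchar(lines[-1L], type = "bytes")) + 1  # + newline
  total_rows <- file.size(path) / raw_per_row
  mem_per_row * total_rows
}

# Apply the 3x rule of thumb against currently available RAM, as
# reported by ps::ps_system_memory() (see the ps docs for details).
fits_in_ram <- function(path, safety_factor = 3) {
  avail <- ps::ps_system_memory()$avail             # bytes
  estimate_inmemory_bytes(path) * safety_factor < avail
}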
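
And a sketch of the user-facing switch, with an "auto" default that
falls back to MonetDBLite when the data looks too big. The argument
name 'engine', the table name, and fits_in_ram() are invented; the
MonetDBLite calls (DBI::dbConnect(MonetDBLite::MonetDBLite(), dbdir)
and MonetDBLite::monetdb.read.csv()) follow its documentation, and the
final dplyr::tbl() on a DBI connection needs dplyr's database backend.

read_microdata <- function(path,
                           engine = c("auto", "memory", "monetdb"),
                           dbdir = tempfile()) {
  engine <- match.arg(engine)
  if (engine == "auto") {
    # Choose for the user: stay in RAM only when the heuristic says so.
    engine <- if (fits_in_ram(path)) "memory" else "monetdb"
  }
  if (engine == "memory") {
    data.table::fread(path)
  } else {
    # Bulk-load into an on-disk MonetDBLite database and return a lazy
    # dplyr table, so users can filter/summarise without writing SQL.
    con <- DBI::dbConnect(MonetDBLite::MonetDBLite(), dbdir)
    MonetDBLite::monetdb.read.csv(con, path, "microdata")
    dplyr::tbl(con, "microdata")
  }
}

This keeps base-R semantics available via engine = "memory" for users
who ask for it, while engine = "auto" decides for everyone else.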