Dear Prof. Ripley,

Thanks for the quick reply.
I do notice an <environment...> in the print output. I assume it is used to keep copies of the initial data used for the model.

- Is it safe to assume that setting fit$functions to NULL would not affect any other functionality, apart from the use of those particular functions?
- Is there a better/recommended way of reducing the size?

Thanks,
Tan

On Feb 3, 4:56 pm, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
> On Tue, 3 Feb 2009, tan wrote:
> > I am using rpart to build a model for later predictions. To save the
> > prediction across restarts and share the data across nodes I have been
> > using "save" to persist the result of rpart to a file and "load" it
> > later. But the saved size was becoming unusually large (even with
> > binary, compressed mode). The size was also proportional to the amount
> > of data that was used to create the model.
> >
> > After tinkering a bit, I figured out that most of the size was because
> > of the rpart$functions attribute. If I set it to NULL, the size seems
> > to drop dramatically. It can be seen with the following lines of R
> > code, where there is a difference, though it is small. The difference
> > is more pronounced with large datasets.
> >
> > library(rpart)
> > fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> > save(fit, file="fit1.sav")
> > fit$functions <- NULL
> > save(fit, file="fit2.sav")
> >
> > What is the reason behind it? The functions themselves seem small, so
> > where is all the bulk coming from?
>
> Their environments.
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
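
To illustrate the "Their environments" answer without depending on rpart internals, here is a minimal sketch in base R. The names make_model, big and summary_fn are made up for the illustration and are not part of rpart. It shows that a closure created inside a function carries that function's whole frame with it when serialized, and that pointing the closure at the global environment (one possible way to slim the object, assuming the function does not need anything from its original environment) shrinks the serialized size.

```r
# Hypothetical stand-in for a model-fitting function: it builds a large
# temporary object and returns a closure that never uses it.
make_model <- function(n) {
  big <- rnorm(n)                    # large temporary data
  summary_fn <- function(x) x + 1    # does not reference `big`
  list(coef = mean(big), functions = list(summary = summary_fn))
}

fit <- make_model(1e5)

# serialize() to a raw vector mirrors what save() writes (before compression).
size_with_env <- length(serialize(fit, NULL))

# The closure's enclosing environment is the frame of make_model(),
# so `big` is serialized along with the function.
captured <- ls(environment(fit$functions$summary))

# Assumption: the function is self-contained, so its environment can be
# replaced. The global environment is stored by reference, not by value.
slim <- fit
environment(slim$functions$summary) <- globalenv()
size_slim <- length(serialize(slim, NULL))

cat("with captured environment:", size_with_env, "bytes\n")
cat("after resetting environment:", size_slim, "bytes\n")
```

Whether the functions stored in an rpart fit rely on anything in their original environment is not something this sketch establishes, so after either approach (setting fit$functions to NULL or resetting the environments) it would be worth checking that print, summary and predict still behave as expected on the reloaded object.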