Thanks for your answers, and sorry that I didn't explain the problem/question sufficiently in the first place. Here it comes:
The problem is that when I create a formula inside a function in which
some large objects exist too, then saving the output of the function
also saves the function's environment, which in this case is large:

test2a <- function(){
    large.object <- rnorm(1000000)
    out <- list(f=formula(u~b))
    out
}
v2a <- test2a()
save(v2a,file="~/tmp/v2.rda")

size of v2.rda: 7.4M

Saving the output of test2a() yields a file size on disk of 7.4
megabytes, even though the output of the function does not depend on
the large object. Given that the formula f is also completely
independent of large.object, the behaviour is surprising. It is even
more surprising that when the same code is evaluated outside the
function, in the global environment, the saved object does not contain
large.object:

large.object <- rnorm(1000000)
v3 <- list(f=formula(u~b))
save(v3,file="~/tmp/v3.rda")

size of v3.rda: 128 B

In my set of functions I make sure that the formula is evaluated in an
existing data.frame. Hence I only want the formula to look up its
variables there, and I want to get rid of everything else that the
formula's environment does.

Thanks
Thomas

William Dunlap <wdun...@tibco.com> writes:

> I didn't see where you said what your goal was in making the
> environment of a formula an empty environment. I'm guessing that you
> want to make sure the variables in the formula come from the
> data.frame given to a fitting function along with the formula (so that
> typos cause errors for sure instead of sometimes giving an incorrect
> answer).
>
> Note that environment(formula) is used to look up not only the
> variables (and functions) in a formula, but also to look up some
> things used in a call to model.frame. Hence setting the formula's
> environment to emptyenv() is not very useful - it limits things too
> much.
>
>   > form1 <- y ~ x1 + x2
>   > environment(form1) <- emptyenv()
>   > dat <- data.frame(y=log(1:10), x1=1/(1:10), x2=sqrt(1:10))
>   > fit <- lm(form1, data=dat)
>   Error in eval(expr, envir, enclos) : could not find function "list"
>   > traceback()
>   7: eval(expr, envir, enclos)
>   6: eval(predvars, data, env)
>   5: model.frame.default(formula = form1, data = dat, drop.unused.levels = TRUE)
>   4: model.frame(formula = form1, data = dat, drop.unused.levels = TRUE)
>   3: eval(expr, envir, enclos)
>   2: eval(mf, parent.frame())
>   1: lm(form1, data = dat)
>
> I'm a bit surprised that this error happens - it might be avoided by
> rewriting some stuff in model.frame. I can avoid it by doing
>
>   > e <- new.env(parent=emptyenv())
>   > e$list <- base::list
>   > environment(form1) <- e
>   > fit <- lm(form1, data=dat)
>
> The fix may not be worthwhile because it won't help you with a formula
> like y~x1+sin(x2) - 'sin' will not be found.
>
> You could use environment(form1) <- parent.env(globalenv()) so all
> attached packages may be used but not globalenv(). Since packages
> tend to contain functions and not much data this may help if you are
> just trying to generate errors when there is a typo in the formula.
>
> Knowing why you want the environment of a formula to be empty would
> help answer your question.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org
>> [mailto:r-help-boun...@r-project.org] On Behalf Of Charles Berry
>> Sent: Wednesday, March 20, 2013 7:04 PM
>> To: r-h...@stat.math.ethz.ch
>> Subject: Re: [R] behaviour of formula objects and environment inside
>> functions
>>
>> Thomas Alexander Gerds <tag <at> biostat.ku.dk> writes:
>>
>> > Dear List
>> > I am looking for the recommended way to create a formula inside a
>> > function with an empty environment.
>> > I tried several versions (see below), and one of them seemed to
>> > work, but I don't understand why there is a difference between
>> > .GlobalEnv and the environment inside a function. I would be
>> > grateful for any reference or explanation or advice.
>> [snip]
>>
>> From ?formula
>>
>> Environments:
>>
>>      A formula object has an associated environment, and this
>>      environment (rather than the parent environment) is used by
>>      'model.frame' to evaluate variables that are not found in the
>>      supplied 'data' argument.
>>
>> So write four functions that:
>>
>> 1) create a formula
>> 2) create some data
>> 3) evaluate a formula using model.frame (even implicitly with
>>    lm(), say)
>> 4) call the functions from 1, 2, and 3
>>
>> When you run '4', the result will depend on the environment of data
>> from 2 and the environment of the formula from 1. If they are both
>> in the same environment, fine. If not, you might get lucky and have
>> the data in a place where it will be found nevertheless.
>>
>> If you are really unlucky the '4' function will find some other data
>> that match the formula and use it.
>>
>> HTH,
>>
>> Chuck
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Thomas A. Gerds -- Assoc. Prof. Department of Biostatistics
Copenhagen University of Copenhagen, Oester Farimagsgade 5,
1014 Copenhagen, Denmark
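
For reference, here is a minimal sketch of the behaviour described in this thread. It assumes that resetting the formula's environment to parent.env(globalenv()), as suggested in the quoted reply, is acceptable for the use case. The function names make_formula_captured and make_formula_detached are hypothetical and only serve the illustration. The sketch measures the in-memory serialized size with serialize() rather than writing files with save().

```r
# Formula created inside a function: its environment is the function's
# frame, so serializing the result also serializes large.object.
make_formula_captured <- function() {
  large.object <- rnorm(1000000)
  list(f = formula(u ~ b))
}

# Hypothetical variant: reset the formula's environment before returning.
# parent.env(globalenv()) is the choice suggested in the quoted reply;
# attached packages remain visible, the function frame does not.
make_formula_detached <- function() {
  large.object <- rnorm(1000000)
  f <- formula(u ~ b)
  environment(f) <- parent.env(globalenv())
  list(f = f)
}

captured <- make_formula_captured()
detached <- make_formula_detached()

# Is large.object directly contained in each formula's environment?
has_large_captured <- exists("large.object",
                             envir = environment(captured$f),
                             inherits = FALSE)
has_large_detached <- exists("large.object",
                             envir = environment(detached$f),
                             inherits = FALSE)

# Uncompressed serialized sizes in bytes (save() additionally compresses).
size_captured <- length(serialize(captured, NULL))
size_detached <- length(serialize(detached, NULL))
print(c(captured = size_captured, detached = size_detached))

# The detached formula still works when all variables are in the data.frame.
dat <- data.frame(u = rnorm(20), b = rnorm(20))
fit <- lm(detached$f, data = dat)
```

With the environment reset, a misspelled variable in the formula is no longer looked up in the function frame or in the global environment, which seems to match the goal of taking all variables from the data.frame.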