Thanks all, I see where I misunderstood the issue. I would still like to suggest adding a warning to the help pages of with() and within(), similar to the one already on the help pages of subset() and transform().

Cheers
Joris

On Wed, Apr 1, 2015 at 9:18 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 01/04/2015 2:33 PM, Joris Meys wrote:
>
>> Thank you for the insights. I understood as much from the code, but I
>> can't really see how this can cause a problem when using with() or within()
>> within a package or a function. The environments behave like I would
>> expect, as does the evaluation of the arguments. The second argument is
>> supposed to be an expression, so I would expect that expression to be
>> evaluated in the data frame first.
>>
>
> I don't know the context within which you were told that they are
> problematic, but one issue is that it makes typo detection harder, since
> the code analysis won't see typos.
>
> For example:
>
> df <- data.frame(col1 = 1)
> global <- 3
>
> with(df, col1 + global)  # fine
> with(df, col1 + Global)  # typo, but still no warning
>
> whereas
>
> df$col1 + global  # fine
> df$col1 + Global  # "no visible binding for global variable 'Global'"
>
> and of course you'll get in a real mess later with the with() code if you
> add a column named "global" to your dataframe.
>
> Duncan Murdoch
>
>
>> I believed the warning in subset() and transform() refers to the
>> consequences of using the dotted argument and the evaluation thereof inside
>> the function, but I might have misunderstood this. I've always considered
>> within() the programming equivalent of the convenience function transform().
>>
>> Sorry for using the r-devel list, but I reckoned this could have
>> consequences for package developers like me. More explicitly: if within()
>> poses the same risk as transform() (which I'm still not sure of), a warning
>> on the help page of within() would be suited imho. I will use the r-help
>> list in the future.
>>
>> Kind regards
>> Joris
>>
>> On Wed, Apr 1, 2015 at 7:55 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>
>>     On 01/04/2015 1:35 PM, Gabriel Becker wrote:
>>
>>         Joris,
>>
>>         The second argument to evalq is envir, so that line says, roughly,
>>         "call environment() to generate me a new environment within the
>>         environment defined by data".
>>
>>     I think that's not quite right. environment() returns the current
>>     environment; it doesn't create a new one. It is evalq() that
>>     created a new environment from data, and environment() just
>>     returns it.
>>
>>     Here's what happens. I've put the code first, the description of
>>     what happens on the line below.
>>
>>     parent <- parent.frame()
>>
>>     Get the environment from which within.data.frame was called.
>>
>>     e <- evalq(environment(), data, parent)
>>
>>     Create a new environment containing the columns of data, with the
>>     parent being the environment where we were called.
>>     Return it and store it in e.
>>
>>     eval(substitute(expr), e)
>>
>>     Evaluate the expression in this new environment.
>>
>>     l <- as.list(e)
>>
>>     Convert it to a list.
>>
>>     l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
>>
>>     Delete NULL entries from the list.
>>
>>     nD <- length(del <- setdiff(names(data), (nl <- names(l))))
>>
>>     Find out if any columns were deleted.
>>
>>     data[nl] <- l
>>
>>     Set the columns of data to the values from the list.
>>
>>     if (nD)
>>         data[del] <- if (nD == 1)
>>             NULL
>>         else vector("list", nD)
>>     data
>>
>>     Delete the columns from data which were deleted from the list.
>>
>>         Note that that line only generates e, the environment within
>>         which expr will be evaluated on the next line (the call to eval).
>>         This means that expr is evaluated in an environment which is
>>         inside the environment defined by data, so you get non-standard
>>         evaluation in that symbols defined in data will be available to
>>         expr earlier in symbol lookup than those in the environment that
>>         within() was called from.
>>
>>     This again sounds like there are two environments created, when
>>     really there's just one, but the last part is correct.
>>
>>     Duncan Murdoch
>>
>>         This is easy to confirm from the behavior of these functions:
>>
>>         > df = data.frame(x = 1:10, y = rnorm(10))
>>         > x = "I'm a character"
>>         > mean(x)
>>         [1] NA
>>         Warning message:
>>         In mean.default(x) : argument is not numeric or logical: returning NA
>>         > within(df, mean.x <- mean(x))
>>             x            y mean.x
>>         1   1  0.396758869    5.5
>>         2   2  0.945679050    5.5
>>         3   3  1.980039723    5.5
>>         4   4 -0.187059706    5.5
>>         5   5  0.008220067    5.5
>>         6   6  0.451175885    5.5
>>         7   7 -0.262064017    5.5
>>         8   8 -0.652301191    5.5
>>         9   9  0.673609455    5.5
>>         10 10 -0.075590905    5.5
>>         > with(df, mean(x))
>>         [1] 5.5
>>
>>         P.S. this is probably an r-help question.
>>
>>         Best,
>>         ~G
>>
>>         On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys <jorism...@gmail.com> wrote:
>>
>>         > Dear list members,
>>         >
>>         > I'm a bit confused about the evaluation of expressions using
>>         > with() or within() versus subset() and transform(). I always
>>         > teach my students to use with() and within() because of the
>>         > warning mentioned in the help pages of subset() and transform().
>>         > Both functions use nonstandard evaluation and are to be used
>>         > only interactively.
>>         >
>>         > I've never seen that warning on the help page of with() and
>>         > within(), so I assumed both functions can safely be used in
>>         > functions and packages. I've now been told that both functions
>>         > pose the same risk as subset() and transform().
>>         >
>>         > Looking at the source code I've noticed the extra step:
>>         >
>>         > e <- evalq(environment(), data, parent)
>>         >
>>         > which, at least according to my understanding, should ensure
>>         > that the functions follow the standard evaluation rules. Could
>>         > somebody with more knowledge than I have shed a bit of light on
>>         > this issue?
>>         >
>>         > Thank you
>>         > Joris
>>         >
>>         > --
>>         > Joris Meys
>>         > Statistical consultant
>>         >
>>         > Ghent University
>>         > Faculty of Bioscience Engineering
>>         > Department of Mathematical Modelling, Statistics and Bio-Informatics
>>         >
>>         > tel : +32 (0)9 264 61 79
>>         > joris.m...@ugent.be
>>         > -------------------------------
>>         > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>>         >
>>         > [[alternative HTML version deleted]]
>>         >
>>         > ______________________________________________
>>         > R-devel@r-project.org mailing list
>>         > https://stat.ethz.ch/mailman/listinfo/r-devel
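
For readers who want to try the two points made in the thread, here is a minimal sketch in R. It is not taken from any of the messages above: make_eval_env() is a hypothetical helper that only copies the first two lines of within.data.frame as quoted by Duncan, and the value 10 for the added column is an arbitrary choice. It is meant to illustrate (a) that evalq(environment(), data, parent) yields a single new environment whose parent is the caller, and (b) the "real mess" that arises once a column named "global" is added to the data frame.

```r
# Hypothetical helper (not part of base R): builds the evaluation environment
# using the same two lines quoted in the thread for within.data.frame.
make_eval_env <- function(data) {
  parent <- parent.frame()            # environment of the caller
  evalq(environment(), data, parent)  # one new environment holding the columns
}

df <- data.frame(col1 = 1)
global <- 3

here <- environment()                 # the environment this code runs in
e <- make_eval_env(df)
env_names <- ls(e)                             # the columns of df: "col1"
same_parent <- identical(parent.env(e), here)  # TRUE: parent is the caller

# Symbol lookup tries the data frame first, then the calling environment.
before <- with(df, col1 + global)     # 4: global is found in the caller
df$global <- 10                       # assumed value, for illustration only
after <- with(df, col1 + global)      # 11: the new column shadows the variable
```

The same expression silently changes meaning between `before` and `after`, which seems to be the kind of risk the help pages of subset() and transform() warn about.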