On 01/04/2015 2:33 PM, Joris Meys wrote:
Thank you for the insights. I understood as much from the code, but I
can't really see how this can cause a problem when using with() or
within() within a package or a function. The environments behave like
I would expect, as does the evaluation of the arguments. The second
argument is supposed to be an expression, so I would expect that
expression to be evaluated in the data frame first.
I don't know the context within which you were told that they are
problematic, but one issue is that it makes typo detection harder, since
static code analysis (for example the checks run by R CMD check) won't
see typos inside the expression.
For example:
df <- data.frame(col1 = 1)
global <- 3
with(df, col1 + global) # fine
with(df, col1 + Global) # typo, but still no warning
whereas
df$col1 + global # fine
df$col1 + Global # "no visible binding for global variable 'Global'"
and of course you'll get in a real mess later with the with() code if
you add a column named "global" to your dataframe.
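A minimal sketch of that masking problem, reusing the names from the example above (the value 100 is made up for illustration): once the data frame gains a column called "global", the with() call silently switches from the variable to the column, while the df$ form keeps using the variable.

```r
df <- data.frame(col1 = 1)
global <- 3

# 'global' is not a column, so it is found in the calling environment
before <- with(df, col1 + global)

# add a column with the same name (illustrative value)
df$global <- 100

# the column now masks the variable inside with()
after <- with(df, col1 + global)

# the explicit form still refers to the variable
explicit <- df$col1 + global

print(c(before = before, after = after, explicit = explicit))
```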
Duncan Murdoch
I believed the warning in subset() and transform() refers to the
consequences of using the dotted argument and the evaluation thereof
inside the function, but I might have misunderstood this. I've always
considered within() the programming equivalent of the convenience
function transform().
Sorry for using the r-devel list, but I reckoned this could have
consequences for package developers like me. More explicitly: if
within() poses the same risk as transform() (which I'm still not sure
of), a warning on the help page of within() would be appropriate, imho.
I will use the r-help list in the future.
Kind regards
Joris
On Wed, Apr 1, 2015 at 7:55 PM, Duncan Murdoch
<murdoch.dun...@gmail.com> wrote:
On 01/04/2015 1:35 PM, Gabriel Becker wrote:
Joris,
The second argument to evalq is envir, so that line says, roughly,
"call environment() to generate me a new environment within the
environment defined by data".
I think that's not quite right. environment() returns the current
environment; it doesn't create a new one. It is evalq() that
creates a new environment from data, and environment() just
returns it.
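This can be checked directly. In the sketch below, the data frame is made up and `parent` is simply the current environment, standing in for the parent.frame() that within.data.frame uses.

```r
df <- data.frame(x = 1:3, y = 4:6)
parent <- environment()   # stand-in for parent.frame() in within.data.frame

# evalq() builds one new environment from the columns of df, with
# 'parent' as its enclosure; environment() merely returns that environment
e <- evalq(environment(), df, parent)

cols <- ls(e)                                   # the columns of df
same_parent <- identical(parent.env(e), parent)

# a second call builds another, distinct environment
e2 <- evalq(environment(), df, parent)
fresh_each_time <- !identical(e, e2)

print(cols)
print(same_parent)
print(fresh_each_time)
```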
Here's what happens. I've put the code first, the description of
what happens on the line below.
parent <- parent.frame()
Get the environment from which within.data.frame was called.
e <- evalq(environment(), data, parent)
Create a new environment containing the columns of data, with the
parent being the environment where we were called.
Return it and store it in e.
eval(substitute(expr), e)
Evaluate the expression in this new environment.
l <- as.list(e)
Convert it to a list.
l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
Delete NULL entries from the list.
nD <- length(del <- setdiff(names(data), (nl <- names(l))))
Find out if any columns were deleted.
data[nl] <- l
Set the columns of data to the values from the list.
if (nD)
    data[del] <- if (nD == 1)
        NULL
    else vector("list", nD)
data
Delete the columns from data which were deleted from the list.
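The steps above can be put together into a small stand-alone function to experiment with. This is only a sketch that mirrors the lines quoted here; the name within_sketch and the test data are made up, and the real within.data.frame may differ between R versions.

```r
# Hypothetical re-implementation for illustration; use within() in real code.
within_sketch <- function(data, expr) {
  parent <- parent.frame()                  # where within_sketch() was called
  e <- evalq(environment(), data, parent)   # columns of data, enclosed by parent
  eval(substitute(expr), e)                 # assignments in expr land in e
  l <- as.list(e)                           # everything now bound in e
  l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]  # drop NULL entries
  nD <- length(del <- setdiff(names(data), (nl <- names(l))))
  data[nl] <- l                             # write back changed and new columns
  if (nD)                                   # drop columns that vanished from e
    data[del] <- if (nD == 1) NULL else vector("list", nD)
  data
}

df <- data.frame(x = 1:3, y = c(10, 20, 30))
added   <- within_sketch(df, z <- x * 2)    # adds a column z
removed <- within_sketch(df, rm(y))         # removes column y
print(added)
print(removed)
```

The order of newly added columns depends on the order in which as.list() returns the contents of the environment, so it is safer to refer to the results by name than by position.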
Note that that is only generating e, the environment that expr will be
evaluated in on the next line (the call to eval). This means that expr
is evaluated in an environment which is inside the environment defined
by data, so you get non-standard evaluation in that symbols defined in
data will be available to expr earlier in symbol lookup than those in
the environment that within() was called from.
This again sounds like there are two environments created, when
really there's just one, but the last part is correct.
Duncan Murdoch
This is easy to confirm from the behavior of these functions:
> df = data.frame(x = 1:10, y = rnorm(10))
> x = "I'm a character"
> mean(x)
[1] NA
Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
> within(df, mean.x <- mean(x))
x y mean.x
1 1 0.396758869 5.5
2 2 0.945679050 5.5
3 3 1.980039723 5.5
4 4 -0.187059706 5.5
5 5 0.008220067 5.5
6 6 0.451175885 5.5
7 7 -0.262064017 5.5
8 8 -0.652301191 5.5
9 9 0.673609455 5.5
10 10 -0.075590905 5.5
> with(df, mean(x))
[1] 5.5
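The other half of the lookup, falling back to the environment within() was called from, can be checked the same way when the call sits inside a function. A small sketch; the function name add_scaled and its argument k are made up for illustration.

```r
add_scaled <- function(d, k) {
  # 'x' is found among the columns of d; 'k' is not a column, so lookup
  # continues in this function's frame, the environment within() was
  # called from
  within(d, x_scaled <- x * k)
}

res <- add_scaled(data.frame(x = 1:3), k = 10)
print(res)
```

If d ever contained a column named k, that column would be used instead of the argument, which is the masking risk discussed earlier in the thread.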
P.S. this is probably an r-help question.
Best,
~G
On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys
<jorism...@gmail.com> wrote:
> Dear list members,
>
> I'm a bit confused about the evaluation of expressions using with() or
> within() versus subset() and transform(). I always teach my students to
> use with() and within() because of the warning mentioned in the help
> pages of subset() and transform(): both of those functions use
> nonstandard evaluation and are to be used only interactively.
>
> I've never seen that warning on the help page of with() and within(),
> so I assumed both functions can safely be used in functions and
> packages. I've now been told that both functions pose the same risk as
> subset() and transform().
>
> Looking at the source code I've noticed the extra step:
>
> e <- evalq(environment(), data, parent)
>
> which, at least according to my understanding, should ensure that the
> functions follow the standard evaluation rules. Could somebody with
> more knowledge than I have shed a bit of light on this issue?
>
> Thank you
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel : +32 (0)9 264 61 79
> joris.m...@ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>