On 01/04/2015 2:33 PM, Joris Meys wrote:
Thank you for the insights. I understood as much from the code, but I can't really see how this can cause a problem when with() or within() is used inside a package or a function. The environments behave as I would expect, as does the evaluation of the arguments. The second argument is supposed to be an expression, so I would expect that expression to be evaluated in the data frame first.

I don't know the context within which you were told that they are problematic, but one issue is that it makes typo detection harder, since the static code analysis done by R CMD check won't see typos inside the expression.

For example:

df <- data.frame(col1 = 1)
global <- 3

with(df, col1 + global)  # fine
with(df, col1 + Global)  # typo, but still no warning

whereas

df$col1 + global  # fine
df$col1 + Global # "no visible binding for global variable 'Global'"

and of course you'll get in a real mess later with the with() code if you add a column named "global" to your dataframe.
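The masking problem mentioned above can be sketched as follows. This is only an illustration reusing the names from the example (df, col1, global); the column value 10 is an arbitrary assumption.

```r
df <- data.frame(col1 = 1)
global <- 3

# 'global' is not a column, so it is found in the calling environment.
before <- with(df, col1 + global)

# Adding a column with the same name silently changes what the
# unchanged with() call computes: the column now masks the variable.
df$global <- 10
after <- with(df, col1 + global)

print(c(before = before, after = after))
```

The with() call itself is identical in both cases; only the contents of the data frame decide which "global" is used.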

Duncan Murdoch


I believed the warning in subset() and transform() refers to the consequences of using the dots argument (...) and the evaluation thereof inside the function, but I might have misunderstood this. I've always considered within() the programming equivalent of the convenience function transform().

Sorry for using the r-devel list, but I reckoned this could have consequences for package developers like me. More explicitly: if within() poses the same risk as transform() (which I'm still not sure of), a warning on the help page of within() would be appropriate, IMHO. I will use the r-help list in the future.

Kind regards
Joris

On Wed, Apr 1, 2015 at 7:55 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

    On 01/04/2015 1:35 PM, Gabriel Becker wrote:

        Joris,


        The second argument to evalq is envir, so that line says, roughly,
        "call environment() to generate me a new environment within the
        environment defined by data".


    I think that's not quite right.  environment() returns the current
    environment, it doesn't create a new one.  It is evalq() that
    created a new environment from data, and environment() just
    returns it.

    Here's what happens.  I've put the code first, the description of
    what happens on the line below.

        parent <- parent.frame()

    Get the environment from which within.data.frame was called.

        e <- evalq(environment(), data, parent)

    Create a new environment containing the columns of data, with the
    parent being the environment where we were called.
    Return it and store it in e.
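A small check of this step, run at top level rather than inside within(): here `parent` is just a stand-in for parent.frame(), and the data frame contents are made up for illustration.

```r
data <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
parent <- environment()   # stand-in for parent.frame() inside within()

# evalq() builds one new environment from the columns of data;
# environment() merely returns that environment.
e <- evalq(environment(), data, parent)

is_env      <- is.environment(e)
same_parent <- identical(parent.env(e), parent)  # enclosure is 'parent'
cols        <- sort(ls(e))                       # the column names

# The environment holds copies: assigning in e leaves data untouched.
assign("a", 0L, envir = e)
data_unchanged <- identical(data$a, 1:3)

# A second call creates a separate, fresh environment.
e2 <- evalq(environment(), data, parent)
fresh <- !identical(e, e2)

print(list(is_env, same_parent, cols, data_unchanged, fresh))
```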

        eval(substitute(expr), e)

    Evaluate the expression in this new environment.

        l <- as.list(e)

    Convert it to a list.

        l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]

    Delete NULL entries from the list.

        nD <- length(del <- setdiff(names(data), (nl <- names(l))))

    Find out if any columns were deleted.

        data[nl] <- l

    Set the columns of data to the values from the list.

        if (nD)
            data[del] <- if (nD == 1)
                NULL
            else vector("list", nD)
        data

    Delete the columns from data which were deleted from the list.
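The effect of these steps, seen from the outside, can be sketched with a toy data frame (the column names a, b and total are assumptions for the example): a variable created in the expression becomes a new column, and a variable removed with rm() is dropped from the result.

```r
df <- data.frame(a = 1:3, b = 4:6)

out <- within(df, {
  total <- a + b   # new variable in e, so it becomes a new column
  rm(b)            # removed from e, so the column is deleted
})

print(names(out))
print(out$total)

# within() works on a copy; the original data frame is unchanged.
print(names(df))
```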



        Note that that is only generating e, the environment that expr
        will be evaluated within in the next line (the call to eval). This
        means that expr is evaluated in an environment which is inside the
        environment defined by data, so you get non-standard evaluation in
        that symbols defined in data will be available to expr earlier in
        symbol lookup than those in the environment that within() was
        called from.


    This again sounds like there are two environments created, when
    really there's just one, but the last part is correct.
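The lookup order described here (columns of data first, then the environment the function was called from) can be illustrated inside a function. The function name add_offset and the variable offset are hypothetical, chosen only for this sketch.

```r
add_offset <- function(d) {
  offset <- 100          # local variable in the calling function
  with(d, x + offset)    # columns of d are searched before this frame
}

# No 'offset' column: the local variable is found via the enclosure.
r_local <- add_offset(data.frame(x = 1))

# An 'offset' column takes precedence over the local variable.
r_masked <- add_offset(data.frame(x = 1, offset = 2))

print(c(r_local = r_local, r_masked = r_masked))
```

The same function gives different answers depending on which columns the caller's data frame happens to contain, which is the risk when such code is used in a package.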

    Duncan Murdoch



        This is easy to confirm from the behavior of these functions:

        > df = data.frame(x = 1:10, y = rnorm(10))
        > x = "I'm a character"
        > mean(x)
        [1] NA
        Warning message:
        In mean.default(x) : argument is not numeric or logical: returning NA
        > within(df, mean.x <- mean(x))
             x            y mean.x
        1   1  0.396758869    5.5
        2   2  0.945679050    5.5
        3   3  1.980039723    5.5
        4   4 -0.187059706    5.5
        5   5  0.008220067    5.5
        6   6  0.451175885    5.5
        7   7 -0.262064017    5.5
        8   8 -0.652301191    5.5
        9   9  0.673609455    5.5
        10 10 -0.075590905    5.5
        > with(df, mean(x))
        [1] 5.5

        P.S. this is probably an r-help question.

        Best,
        ~G




        On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys <jorism...@gmail.com> wrote:

        > Dear list members,
        >
        > I'm a bit confused about the evaluation of expressions using with() or
        > within() versus subset() and transform(). I always teach my students to use
        > with() and within() because of the warning mentioned in the help pages of
        > subset() and transform(). Both functions use nonstandard evaluation and are
        > to be used only interactively.
        >
        > I've never seen that warning on the help page of with() and within(), so I
        > assumed both functions can safely be used in functions and packages. I've
        > now been told that both functions pose the same risk as subset() and
        > transform().
        >
        > Looking at the source code I've noticed the extra step:
        >
        > e <- evalq(environment(), data, parent)
        >
        > which, at least according to my understanding, should ensure that the
        > functions follow the standard evaluation rules. Could somebody with more
        > knowledge than I have shed a bit of light on this issue?
        >
        > Thank you
        > Joris
        >
        > --
        > Joris Meys
        > Statistical consultant
        >
        > Ghent University
        > Faculty of Bioscience Engineering
        > Department of Mathematical Modelling, Statistics and Bio-Informatics
        >
        > tel : +32 (0)9 264 61 79
        > joris.m...@ugent.be
        > -------------------------------
        > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
        >
        > ______________________________________________
        > R-devel@r-project.org mailing list
        > https://stat.ethz.ch/mailman/listinfo/r-devel
        >






