That might help, but protecting things is a fairly cheap operation, so I don't know if people would bother with the naming convention. If you're not sure, it's just as easy to protect things.

One way things can go wrong is when you think you protected something, but then the pointer changes and the new pointer is not protected. Maybe a linter could recognize that some code path assigned a new value to a variable without protecting it? I guess it's easier to recognize that you made an assignment to varName_PROT without protecting it again than to look at the PROTECT calls, but it's not really that different.
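
To make that failure mode concrete, here is a minimal sketch in plain C. It is a toy model, not R's implementation: the names toy_protect, toy_unprotect and toy_is_protected are hypothetical, and the "protect stack" is just an array of pointers. R's collector considers more roots than the pointer protection stack, so a real is_protected(x) would have to do more than this linear search.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a fixed-size stack of "protected" pointers. */
#define TOY_STACK_MAX 64

static const void *toy_stack[TOY_STACK_MAX];
static size_t toy_top = 0;

/* Push a pointer onto the toy protect stack and return it. */
static const void *toy_protect(const void *p)
{
    assert(toy_top < TOY_STACK_MAX);
    toy_stack[toy_top++] = p;
    return p;
}

/* Pop the n most recently protected pointers. */
static void toy_unprotect(size_t n)
{
    assert(n <= toy_top);
    toy_top -= n;
}

/* Linear search: is this exact pointer value currently on the stack? */
static int toy_is_protected(const void *p)
{
    for (size_t i = 0; i < toy_top; i++)
        if (toy_stack[i] == p) return 1;
    return 0;
}

/* The failure mode: the variable is reassigned, but only the old value
   is on the stack. Returns 1 if the new value is protected, else 0. */
static int toy_reassign_without_protect(void)
{
    int first = 1, second = 2;
    const void *x = toy_protect(&first);
    x = &second;                     /* new pointer, never protected */
    int ok = toy_is_protected(x);
    toy_unprotect(1);
    return ok;
}

/* The fix in this toy model: protect the new value as well. */
static int toy_reassign_with_protect(void)
{
    int first = 1, second = 2;
    const void *x = toy_protect(&first);
    x = toy_protect(&second);        /* new pointer is protected too */
    int ok = toy_is_protected(x);
    toy_unprotect(2);
    return ok;
}
```

In real R code the second pattern would more likely use PROTECT_WITH_INDEX and REPROTECT, so that the stack slot is reused rather than growing with each reassignment.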

Duncan Murdoch



On 2025-04-11 11:57 a.m., Paul McQuesten wrote:
For a long-term horizon, would it help R developers to use a naming convention?
Perhaps, varName_PROT, or the inverse varName_UNPROT?
Eventually, teach some linter about that?

On Fri, Apr 11, 2025 at 10:40 AM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

    On a tangent from the main topic of this thread:  sometimes (especially
    to non-experts) it's not obvious whether a variable is protected or not.

    I don't think there's any easy way to determine that, but perhaps there
    should be.  Would it be possible to add a run-time test you could call
    in C code (e.g. is_protected(x)) that would do the same search the
    garbage collector does in order to determine if a particular pointer is
    protected?

    This would be an expensive operation, similar in cost to actually doing
    a garbage collection.  You wouldn't want to do it routinely, but it
    would be really helpful in debugging.

    Duncan Murdoch

    On 2025-04-11 6:05 a.m., Suharto Anggono Suharto Anggono via R-devel
    wrote:
     >   On second thought, I wonder whether the caching in the changed
     > 'StringFromLogical' from my previous message is safe. While 'ans' in
     > the C function 'coerceToString' is protected, its elements are also
     > protected. Once the object corresponding to 'ans' is no longer
     > protected, can the cached object 'TrueCh' or 'FalseCh' in
     > 'StringFromLogical' be garbage collected? If so, I am thinking of
     > clearing the cache when the first element of each vector is filled.
     > For example, by abusing the 'warn' argument, the following could be
     > added to my changed 'StringFromLogical'.
     >
     >   if (*warn) TrueCh = FalseCh = NULL;
     >
     > Correspondingly, in 'coerceToString',
     >
     >   warn = i == 0;
     >
     > is inserted before
     >
     >   SET_STRING_ELT(ans, i, StringFromLogical(LOGICAL_ELT(v, i), &warn));
     >
     > for LGLSXP case.
     >
     > ---------------------
     > On Thursday, 10 April 2025 at 10:54:03 pm GMT+7, Martin Maechler
     > <maech...@stat.math.ethz.ch> wrote:
     >
     >
     >>>>>> Suharto Anggono Suharto Anggono via R-devel
     >>>>>>      on Thu, 10 Apr 2025 07:53:04 +0000 (UTC) writes:
     >
     >      > Chain of calls of C functions in coerce.c for
     >      > as.character(<logical>) in R:
     >      >
     >      > do_asatomic
     >      > ascommon
     >      > coerceVector
     >      > coerceToString
     >      > StringFromLogical (for each element)
     >      >
     >      > The definition of 'StringFromLogical' in coerce.c :
     >      >
     >      > attribute_hidden SEXP StringFromLogical(int x, int *warn)
     >      > {
     >      >    int w;
     >      >    formatLogical(&x, 1, &w);
     >      >    if (x == NA_LOGICAL) return NA_STRING;
     >      >    else return mkChar(EncodeLogical(x, w));
     >      > }
     >      >
     >      > The definition of 'EncodeLogical' in printutils.c :
     >      >
     >      > const char *EncodeLogical(int x, int w)
     >      > {
     >      >    static char buff[NB];
     >      >    if(x == NA_LOGICAL) snprintf(buff, NB, "%*s", min(w, (NB-1)),
     >      >                                 CHAR(R_print.na_string));
     >      >    else if(x) snprintf(buff, NB, "%*s", min(w, (NB-1)), "TRUE");
     >      >    else snprintf(buff, NB, "%*s", min(w, (NB-1)), "FALSE");
     >      >    buff[NB-1] = '\0';
     >      >    return buff;
     >      > }
     >      >
     >      > > L <- sample(c(TRUE, FALSE), 10^7, replace = TRUE)
     >      > > system.time(as.character(L))
     >      >    user  system elapsed
     >      >    2.69    0.02    2.73
     >      > > system.time(c("FALSE", "TRUE")[L+1])
     >      >    user  system elapsed
     >      >    0.15    0.04    0.20
     >      > > system.time(c("FALSE", "TRUE")[L+1L])
     >      >    user  system elapsed
     >      >    0.08    0.05    0.13
     >      > > L <- rep(NA, 10^7)
     >      > > system.time(as.character(L))
     >      >    user  system elapsed
     >      >    0.11    0.00    0.11
     >      > > system.time(c("FALSE", "TRUE")[L+1])
     >      >    user  system elapsed
     >      >    0.16    0.06    0.22
     >      > > system.time(c("FALSE", "TRUE")[L+1L])
     >      >    user  system elapsed
     >      >    0.09    0.03    0.12
     >      >
     >      > `as.character` of a logical vector that is all NA is fast enough.
     >      > It appears that the call to 'formatLogical' inside the C function
     >      > 'StringFromLogical' does not introduce much slowdown.
     >
     >
     >      > I found that using a string literal inside the C function
     >      > 'StringFromLogical', by replacing
     >      > EncodeLogical(x, w)
     >      > with
     >      > x ? "TRUE" : "FALSE"
     >      > (so that the call to 'formatLogical' is no longer needed),
     >      > makes it faster.
     >
     > indeed! ... and we also notice that the 'w' argument is not
     > needed anymore either, and that makes sense: at this point, when
     > you know you have an R logical value, there are only three
     > possibilities and no reason ever to warn about the conversion.
     >
     >      > Alternatively,
     > or in addition !
     >
     >
     >      > a "fast path" could be introduced in 'EncodeLogical', which would
     >      > potentially also benefit format() in R.
     >      > For example, without replacing existing code, the following
     >      > fragment could be inserted.
     >      >
     >      >    if(x == NA_LOGICAL) {if(w == R_print.na_width) return CHAR(R_print.na_string);}
     >      >    else if(x) {if(w == 4) return "TRUE";}
     >      >    else {if(w == 5) return "FALSE";}
     >      >
     >      > However, with either of them, c("FALSE", "TRUE")[L+1L] is still
     >      > faster than as.character(L).
     >      >
     >      > Precomputing or caching the possible results of the C function
     >      > 'StringFromLogical' allows as.character(L) to be as fast as
     >      > c("FALSE", "TRUE")[L+1L] in R. For example, 'StringFromLogical'
     >      > could be changed to
     >      >
     >      > attribute_hidden SEXP StringFromLogical(int x, int *warn)
     >      > {
     >      >    static SEXP TrueCh, FalseCh;
     >      >    if (x == NA_LOGICAL) return NA_STRING;
     >      >    else if (x) return TrueCh ? TrueCh : (TrueCh = mkChar("TRUE"));
     >      >    else return FalseCh ? FalseCh : (FalseCh = mkChar("FALSE"));
     >      > }
     >
     > Indeed, and something along this line (storing the other two
     > constant strings) was also my thought when seeing the
     >    mkChar(x ? "TRUE" : "FALSE")
     > you implicitly proposed above.
     >
     > I'm looking into applying both speedups;
     > thank you very much, Suharto!
     >
     > Martin
     >
     >
     > --
     > Martin Maechler
     > ETH Zurich  and  R Core team
     >
     > ______________________________________________
     > R-devel@r-project.org mailing list
     > https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
