Re: [Rd] Undocumented 'use.names' argument to c()

2016-09-25 Thread Suharto Anggono Suharto Anggono via R-devel
> From comments in
>http://stackoverflow.com/questions/24815572/why-does-function-c-accept-an-undocumented-argument/24815653
> : The code of c() and unlist() was formerly shared but has been (long time 
>passing) separated. From July 30, 1998, is where do_c got split into do_c and 
>do_unlist.

With the implementation of 'c.Date' in R devel r71350, an argument named 
'use.names' is included for concatenation. So, it doesn't follow the documented 
'c'. But, 'c.Date' is not explicitly documented in Dates.Rd, that has 'c.Date' 
as an alias.

On Sat, 24/9/16, Martin Maechler  wrote:

 Subject: Re: [Rd] Undocumented 'use.names' argument to c()
 To: "Karl Millar" 

 Date: Saturday, 24 September, 2016, 9:12 PM
 
 > Karl Millar via R-devel 
> on Fri, 23 Sep 2016 11:12:49 -0700 writes:

> I'd expect that a lot of the performance overhead could be eliminated
> by simply improving the underlying code.  IMHO, we should ignore it in
> deciding the API that we want here.

I agree partially.  Even if the underlying code can be made
faster, the 'use.names = FALSE' version will still be faster
than the default, notably in some "long" cases.

More further down.

> On Fri, Sep 23, 2016 at 10:54 AM, Henrik Bengtsson
>  wrote:
>> I'd vote for it to stay.  It could of course surprise someone who'd
>> expect c(list(a=1), b=2, use.names = FALSE) to generate list(a=1, b=2,
>> use.names=FALSE).   On the upside, there is the performance gain from using
>> use.names=FALSE.  Below benchmarks show that the combining of the
>> names attributes themselves takes ~20-25 times longer than the
>> combining of the integers themselves.  Also, to no surprise,
>> use.names=FALSE avoids some memory allocations.
>> 
>>> options(digits = 2)
>>> 
>>> a <- b <- c <- d <- 1:1e4
>>> names(c) <- c
>>> names(d) <- d
>>> 
>>> stats <- microbenchmark::microbenchmark(
>> +   c(a, b, use.names=FALSE),
>> +   c(c, d, use.names=FALSE),
>> +   c(a, d, use.names=FALSE),
>> +   c(a, b, use.names=TRUE),
>> +   c(a, d, use.names=TRUE),
>> +   c(c, d, use.names=TRUE),
>> +   unit = "ms"
>> + )
>>> 
>>> stats
>> Unit: milliseconds
>>                       expr   min    lq  mean median    uq   max neval
>> c(a, b, use.names = FALSE) 0.031 0.032 0.049  0.034 0.036 1.474   100
>> c(c, d, use.names = FALSE) 0.031 0.031 0.035  0.034 0.035 0.064   100
>> c(a, d, use.names = FALSE) 0.031 0.031 0.049  0.034 0.035 1.452   100
>>  c(a, b, use.names = TRUE) 0.031 0.031 0.055  0.034 0.036 2.094   100
>>  c(a, d, use.names = TRUE) 0.510 0.526 0.588  0.549 0.617 1.998   100
>>  c(c, d, use.names = TRUE) 0.780 0.815 0.886  0.841 0.944 1.430   100
>> 
>>> profmem::profmem(c(c, d, use.names=FALSE))
>> Rprofmem memory profiling of:
>> c(c, d, use.names = FALSE)
>> 
>> Memory allocations:
>> bytes  calls
>> 1 80040 
>> total 80040
>> 
>>> profmem::profmem(c(c, d, use.names=TRUE))
>> Rprofmem memory profiling of:
>> c(c, d, use.names = TRUE)
>> 
>> Memory allocations:
>> bytes  calls
>> 1  80040 
>> 2 160040 
>> total 240080
>> 
>> /Henrik
>> 
>> On Fri, Sep 23, 2016 at 10:25 AM, William Dunlap via R-devel
>>  wrote:
>>> In Splus c() and unlist() called the same C code, but with a different
>>> 'sys_index'  code (the last argument to .Internal) and c() did not consider
>>> an argument named 'use.names' special.

Thank you, Bill, very much, for making the historical context
clear, and giving us the facts, there.

OTOH, it is also true in R, that  c() and unlist() share code
.. quite a bit less though .. but more importantly, the very
original C code of Ross Ihaka (and possibly Robert Gentleman)
had explicitly considered both extra arguments 'recursive' and
'use.names', and not just the first.

The fact that c() has always been a .Primitive function and that
these have no formals()  had contributed to what I think to be a
documentation glitch early on, and when, quite a bit later, we've
added a fake argument list for printing, the then current
documentation was used.

This was the reason for declaring it a documentation "hole"
rather than something we do not want.
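
The point about primitives can be checked directly. A minimal sketch, assuming an R version in which c() honours 'use.names' as described in this thread; the variable name x is only for illustration:

```r
# c() is a primitive, so it carries no R-level formals
is.primitive(c)      # TRUE
is.null(formals(c))  # TRUE

# 'use.names' is consumed as an option, not appended as a further element
x <- c(a = 1, b = 2, use.names = FALSE)
names(x)             # NULL
length(x)            # 2
```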

(read on)

>>>> c
>>> function(..., recursive = F)
>>> .Internal(c(..., recursive = recursive), "S_unlist", TRUE, 1)
>>>> unlist
>>> function(data, recursive = T, use.names = T)
>>> .Internal(unlist(data, recursive = recursive, use.names = use.names),
>>> "S_unlist", TRUE, 2)
>>>> c(A=1,B=2,use.names=FALSE)
>>> A B use.names
>>> 1 2 0
>>> 
>>> The C code used sys_index==2 to mean 'the last  argument is the 'use.names'
>>> argument, if sys_index==1 only the recursive argument was considered
>>> special.
>>> 
>>> Sys.funs.c:
>>> 405 S_unlist(vector *ent, 

Re: [Rd] Undocumented 'use.names' argument to c()

2016-09-25 Thread Martin Maechler
> Suharto Anggono Suharto Anggono via R-devel 
> on Sun, 25 Sep 2016 14:12:10 + writes:

>> From comments in
>> 
http://stackoverflow.com/questions/24815572/why-does-function-c-accept-an-undocumented-argument/24815653
>> : The code of c() and unlist() was formerly shared but
>> has been (long time passing) separated. From July 30,
>> 1998, is where do_c got split into do_c and do_unlist.
> With the implementation of 'c.Date' in R devel r71350, an
> argument named 'use.names' is included for
> concatenation. So, it doesn't follow the documented
> 'c'. But, 'c.Date' is not explicitly documented in
> Dates.Rd, that has 'c.Date' as an alias.

I do not see any  c.Date  in R-devel with a 'use.names'; it's a
base function, hence not hidden ..

As mentioned before, 'use.names' is used in unlist() in quite a
few places, and such an argument also exists for

lengths()   and
all.equal.list()

and now c() 
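
For comparison, a short sketch of how 'use.names' behaves in unlist() and lengths(), using only base R; the list l is a made-up example:

```r
l <- list(a = 1:3, b = 4:5)

unlist(l)                      # names a1 a2 a3 b1 b2
unlist(l, use.names = FALSE)   # plain integer vector 1:5, no names
lengths(l)                     # named: a = 3, b = 2
lengths(l, use.names = FALSE)  # unnamed: 3 2
```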

> 
> On Sat, 24/9/16, Martin Maechler
>  wrote:

>  Subject: Re: [Rd] Undocumented 'use.names' argument to c()
>  To: "Karl Millar" 

>  Date: Saturday, 24 September, 2016, 9:12 PM
 
>> Karl Millar via R-devel 
> on Fri, 23 Sep 2016 11:12:49 -0700 writes:

>> I'd expect that a lot of the performance overhead could
>> be eliminated by simply improving the underlying code.
>> IMHO, we should ignore it in deciding the API that we
>> want here.

> I agree partially.  Even if the underlying code can be
> made faster, the 'use.names = FALSE' version will still be
> faster than the default, notably in some "long" cases.

> More further down.

>> On Fri, Sep 23, 2016 at 10:54 AM, Henrik Bengtsson
>>  wrote:
>>> I'd vote for it to stay.  It could of course surprise
>>> someone who'd expect c(list(a=1), b=2, use.names =
>>> FALSE) to generate list(a=1, b=2, use.names=FALSE).  On
>>> the upside, there is the performance gain from using
>>> use.names=FALSE.  Below benchmarks show that the
>>> combining of the names attributes themselves takes
>>> ~20-25 times longer than the combining of the integers
>>> themselves.  Also, to no surprise, use.names=FALSE
>>> avoids some memory allocations.
>>> 
>>>> options(digits = 2)
>>>> 
>>>> a <- b <- c <- d <- 1:1e4
>>>> names(c) <- c
>>>> names(d) <- d
>>>> 
>>>> stats <- microbenchmark::microbenchmark(
>>> +   c(a, b, use.names=FALSE),
>>> +   c(c, d, use.names=FALSE),
>>> +   c(a, d, use.names=FALSE),
>>> +   c(a, b, use.names=TRUE),
>>> +   c(a, d, use.names=TRUE),
>>> +   c(c, d, use.names=TRUE),
>>> +   unit = "ms"
>>> + )
>>>> 
>>>> stats
>>> Unit: milliseconds
>>>                       expr   min    lq  mean median    uq   max neval
>>> c(a, b, use.names = FALSE) 0.031 0.032 0.049  0.034 0.036 1.474   100
>>> c(c, d, use.names = FALSE) 0.031 0.031 0.035  0.034 0.035 0.064   100
>>> c(a, d, use.names = FALSE) 0.031 0.031 0.049  0.034 0.035 1.452   100
>>>  c(a, b, use.names = TRUE) 0.031 0.031 0.055  0.034 0.036 2.094   100
>>>  c(a, d, use.names = TRUE) 0.510 0.526 0.588  0.549 0.617 1.998   100
>>>  c(c, d, use.names = TRUE) 0.780 0.815 0.886  0.841 0.944 1.430   100
>>> 
>>>> profmem::profmem(c(c, d, use.names=FALSE))
>>> Rprofmem memory profiling of:
>>> c(c, d, use.names = FALSE)
>>> 
>>> Memory allocations:
>>> bytes  calls
>>> 1 80040 
>>> total 80040
>>> 
>>>> profmem::profmem(c(c, d, use.names=TRUE))
>>> Rprofmem memory profiling of:
>>> c(c, d, use.names = TRUE)
>>> 
>>> Memory allocations:
>>> bytes  calls
>>> 1  80040 
>>> 2 160040 
>>> total 240080
>>> 
>>> /Henrik
>>> 
>>> On Fri, Sep 23, 2016 at 10:25 AM, William Dunlap via
>>> R-devel  wrote:
>>>> In Splus c() and unlist() called the same C code, but
>>>> with a different 'sys_index' code (the last argument to
>>>> .Internal) and c() did not consider an argument named
>>>> 'use.names' special.

> Thank you, Bill, very much, for making the historical
> context clear, and giving us the facts, there.

> OTOH, it is also true in R, that c() and unlist() share
> code .. quite a bit less though .. but more importantly,
> the very original C code of Ross Ihaka (and possibly
> Robert Gentleman) had explicitly considered both extra
> arguments 'recursive' and 'use.names', and not just the
> first.

> The fact that c() has always been a .Primitive function
> and that these have no formals() had contributed to what I
> think to be a documentation glitch early on, and when,
> quite a bit later, we've added a fake argument list for
> printing, the then current documentation was used.

> This was the reason for declaring it a documentation
> "hole" rather than something we do not want.

> (read on)

> c
 function(..., recursive = F) .Internal(c(..., recursive
 = recursive), "S_unli

Re: [Rd] withAutoprint({ .... }) ?

2016-09-25 Thread Martin Maechler
> Henrik Bengtsson 
> on Sat, 24 Sep 2016 11:31:49 -0700 writes:

> Martin, did you post your code for withAutoprint() anywhere?
> Building withAutoprint() on top of source() definitely makes sense,
> unless, as Bill says, source() itself could provide the same feature.

I was really mainly asking for advice about the function name
.. and got none.

I'm now committing my version (including (somewhat incomplete)
documentation), so you (all) can look at it and try / test it further.

> To differentiate between withAutoprint({ x <- 1 }) and
> withAutoprint(expr) where expr is an expression / language object, one
> could have an optional argument `substitute=TRUE`, e.g.

> withAutoprint <- function(expr, substitute = TRUE, ...) {
>   if (substitute) expr <- substitute(expr)
>   [...]
> }

I think my approach is nicer insofar as it does not seem to need
such an argument.  I'm sure you'll try to disprove that ;-)
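
To make the two call styles concrete, here is a hedged sketch. It assumes R >= 3.4.0, where withAutoprint() is available in base; the 'evaluated' argument used below is from that released version and is not something stated in this thread, and the names x, y and ex are only illustrative:

```r
# Unevaluated braces: each statement is echoed and auto-printed
withAutoprint({ x <- 1:12; x - 1 })

# An expression object built beforehand; 'evaluated = TRUE' asks
# withAutoprint() not to substitute() its argument
ex <- expression(y <- (1:3)^2, y - 1)
withAutoprint(ex, evaluated = TRUE)
```

In both cases the assignments take effect in the calling environment, so x and y are available afterwards.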

Martin

> Just some thoughts
> /Henrik


> On Sat, Sep 24, 2016 at 6:37 AM, Martin Maechler
>  wrote:
>>> William Dunlap 
>>> on Fri, 2 Sep 2016 08:33:47 -0700 writes:
>> 
>> > Re withAutoprint(), Splus's source() function could take an expression
>> > (literal or not) in place of a file name or text so it could support
>> > withAutoprint-like functionality in its GUI.  E.g.,
>> 
>> >> source(auto.print=TRUE, exprs.literal= { x <- 3:7 ; sum(x) ; y <- log(x)
>> > ; x - 100}, prompt="--> ")
>> > --> x <- 3:7
>> > --> sum(x)
>> > [1] 25
>> > --> y <- log(x)
>> > --> x - 100
>> > [1] -97 -96 -95 -94 -93
>> 
>> > or
>> 
>> >> expr <- quote({ x <- 3:7 ; sum(x) ; y <- log(x) ; x - 100})
>> >> source(auto.print=TRUE, exprs = expr, prompt="--> ")
>> > --> x <- 3:7
>> > --> sum(x)
>> > [1] 25
>> > --> y <- log(x)
>> > --> x - 100
>> > [1] -97 -96 -95 -94 -93
>> 
>> > It was easy to implement, since exprs's default value is parse(file) or
>> > parse(text=text), which source is calculating anyway.
>> 
>> 
>> > Bill Dunlap
>> > TIBCO Software
>> > wdunlap tibco.com
>> 
>> Thank you, Bill  (and the other correspondents); that's indeed a
>> very good suggestion :
>> 
>> I've come to the conclusion that Duncan and Bill are right:  One
>> should do this in R (not C) and as Bill hinted, one should use
>> source().  I first tried to do it separately, just "like source()",
>> but a considerable part of the source of source()  {:-)} is
>> about using src attributes instead of deparse() when the former
>> are present,  and it does make sense to generalize
>> withAutoprint() to have the same feature, so after all, have it
>> call source().
>> 
>> I've spent a few hours now trying things and variants, also
>> found I needed to enhance source()  very slightly also in a few
>> other details, and now (in my uncommitted version of R-devel),
>> 
>> withAutoprint({ x <- 1:12; x-1; (y <- (x-5)^2); z <- y; z - 10 })
>> 
>> produces
>> 
>>> withAutoprint({ x <- 1:12; x-1; (y <- (x-5)^2); z <- y; z - 10 })
>>> x <- 1:12
>>> x - 1
>> [1]  0  1  2  3  4  5  6  7  8  9 10 11
>>> (y <- (x - 5)^2)
>> [1] 16  9  4  1  0  1  4  9 16 25 36 49
>>> z <- y
>>> z - 10
>> [1]   6  -1  -6  -9 -10  -9  -6  -1   6  15  26  39
>>> 
>> 
>> and is equivalent to
>> 
>> withAutoprint(expression(x <- 1:12, x-1, (y <- (x-5)^2), z <- y, z - 10 ))
>> 
>> I don't see any way around the "mis-feature" that all "input"
>> expressions are in the end shown twice in the "output" (the
>> first time by showing the withAutoprint(...) call itself).
>> 
>> The function *name* is "not bad" but also a bit longish;
>> maybe there are better ideas?  (not longer, no "_" - I know this
>> is a matter of taste only)
>> 
>> Martin
>> 
>> > On Fri, Sep 2, 2016 at 4:56 AM, Martin Maechler 
>> > wrote:
>> 
>> >> On R-help, with subject
>> >> '[R] source() does not include added code'
>> >>
>> >> > Joshua Ulrich 
>> >> > on Wed, 31 Aug 2016 10:35:01 -0500 writes:
>> >>
>> >> > I have quantstrat installed and it works fine for me.  If you're
>> >> > asking why the output of t(tradeStats('macross')) isn't being
>> >> printed,
>> >> > that's because of what's described in the first paragraph in the
>> >> > *Details* section of help("source"):
>> >>
>> >> > Note that running code via ‘source’ differs in a few respects from
>> >> > entering it at the R command line.  Since expressions are not
>> >> > executed at the top level, auto-printing is not done.  So you will
>> >> > need to include explicit ‘print’ calls for things you want to be
>> >> > printed (and remember that this includes plotting by ‘lattice’,
>> >> > FAQ Q7.22).
>> >>
>> >>
 

Re: [Rd] withAutoprint({ .... }) ?

2016-09-25 Thread Henrik Bengtsson
On Sun, Sep 25, 2016 at 9:29 AM, Martin Maechler
 wrote:
>> Henrik Bengtsson 
>> on Sat, 24 Sep 2016 11:31:49 -0700 writes:
>
> > Martin, did you post your code for withAutoprint() anywhere?
> > Building withAutoprint() on top of source() definitely makes sense,
> > unless, as Bill says, source() itself could provide the same feature.
>
> I was really mainly asking for advice about the function name
> .. and got none.

I missed that part.  I think the name is good.  A shorter alternative
would be withEcho(), but it could be a little bit misleading since it
doesn't reflect 'print=TRUE' to source().

>
> I'm now committing my version (including (somewhat incomplete)
> documentation), so you (all) can look at it and try / test it further.
>
> > To differentiate between withAutoprint({ x <- 1 }) and
> > withAutoprint(expr) where expr is an expression / language object, one
> > could have an optional argument `substitute=TRUE`, e.g.
>
> > withAutoprint <- function(expr, substitute = TRUE, ...) {
> >   if (substitute) expr <- substitute(expr)
> >   [...]
> > }
>
> I think my approach is nicer insofar as it does not seem to need
> such an argument.  I'm sure you'll try to disprove that ;-)

Nah, I like that you've extended source() with the 'exprs' argument.

May I suggest to add:

svn diff src/library/base/R/
Index: src/library/base/R/source.R
===================================================================
--- src/library/base/R/source.R (revision 71357)
+++ src/library/base/R/source.R (working copy)
@@ -198,7 +198,7 @@
  if (!tail) {
 # Deparse.  Must drop "expression(...)"
 dep <- substr(paste(deparse(ei, width.cutoff = width.cutoff,
-control = "showAttributes"),
+  control = c("keepInteger", "showAttributes")),
  collapse = "\n"), 12L, 1e+06L)
 dep <- paste0(prompt.echo,
   gsub("\n", paste0("\n", continue.echo), dep))

such that you get:

> withAutoprint(x <- c(1L, NA_integer_, NA))
> x <- c(1L, NA_integer_, NA)

because without it, you get:

> withAutoprint(x <- c(1L, NA_integer_, NA))
> x <- c(1, NA, NA)
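
The effect of the proposed 'control' change can be seen with deparse() alone. A small sketch using only base R's deparse(); the outputs noted for the full call are the ones reported in this message, not re-measured, and the name e is only illustrative:

```r
# Without "keepInteger" the integer suffix is dropped when deparsing
deparse(1L, control = "showAttributes")                    # "1"
deparse(1L, control = c("keepInteger", "showAttributes"))  # "1L"

# The same two settings applied to the call from the example above
e <- quote(x <- c(1L, NA_integer_, NA))
deparse(e, control = "showAttributes")                     # reported: x <- c(1, NA, NA)
deparse(e, control = c("keepInteger", "showAttributes"))   # reported: x <- c(1L, NA_integer_, NA)
```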

Thanks,

Henrik


>
> Martin
>
> > Just some thoughts
> > /Henrik
>
>
> > On Sat, Sep 24, 2016 at 6:37 AM, Martin Maechler
> >  wrote:
> >>> William Dunlap 
> >>> on Fri, 2 Sep 2016 08:33:47 -0700 writes:
> >>
> >> > Re withAutoprint(), Splus's source() function could take an expression
> >> > (literal or not) in place of a file name or text so it could support
> >> > withAutoprint-like functionality in its GUI.  E.g.,
> >>
> >> >> source(auto.print=TRUE, exprs.literal= { x <- 3:7 ; sum(x) ; y <- log(x)
> >> > ; x - 100}, prompt="--> ")
> >> > --> x <- 3:7
> >> > --> sum(x)
> >> > [1] 25
> >> > --> y <- log(x)
> >> > --> x - 100
> >> > [1] -97 -96 -95 -94 -93
> >>
> >> > or
> >>
> >> >> expr <- quote({ x <- 3:7 ; sum(x) ; y <- log(x) ; x - 100})
> >> >> source(auto.print=TRUE, exprs = expr, prompt="--> ")
> >> > --> x <- 3:7
> >> > --> sum(x)
> >> > [1] 25
> >> > --> y <- log(x)
> >> > --> x - 100
> >> > [1] -97 -96 -95 -94 -93
> >>
> >> > It was easy to implement, since exprs's default value is parse(file) or
> >> > parse(text=text), which source is calculating anyway.
> >>
> >>
> >> > Bill Dunlap
> >> > TIBCO Software
> >> > wdunlap tibco.com
> >>
> >> Thank you, Bill  (and the other correspondents); that's indeed a
> >> very good suggestion :
> >>
> >> I've come to the conclusion that Duncan and Bill are right:  One
> >> should do this in R (not C) and as Bill hinted, one should use
> >> source().  I first tried to do it separately, just "like source()",
> >> but a considerable part of the source of source()  {:-)} is
> >> about using src attributes instead of deparse() when the former
> >> are present,  and it does make sense to generalize
> >> withAutoprint() to have the same feature, so after all, have it
> >> call source().
> >>
> >> I've spent a few hours now trying things and variants, also
> >> found I needed to enhance source()  very slightly also in a few
> >> other details, and now (in my uncommitted version of R-devel),
> >>
> >> withAutoprint({ x <- 1:12; x-1; (y <- (x-5)^2); z <- y; z - 10 })
> >>
> >> produces
> >>
> >>> withAutoprint({ x <- 1:12; x-1; (y <- (x-5)^2); z <- y; z - 10 })
> >>> x <- 1:12
> >>> x - 1
> >> [1]  0  1  2  3  4  5  6  7  8  9 10 11
> >>> (y <- (x - 5)^2)
> >> [1] 16  9  4  1  0  1  4  9 16 25 36 49
> >>> z <- y
> >>> z - 10
> >> [1]   6  -1  -6  -9 -10  -9  -6  -1   6  15  26  39
> >>>
> >>
> >> and is equivalent to
> >>
> >> withAutoprint(expression(x <- 1:12, x-1, (y <- (x-5)^2), z <- y, z - 10 ))
> >>
> >> I don't see any way around the "mis-feature" that all "input"
> >> exp