Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Jan Gorecki
Luke,
When writing a blog post on that, could you please describe the
performance implications that this new feature will carry?
AFAIU, compared to the standard way of using temporary variables, pipes
make it possible to avoid incrementing the REFCNT of the objects being piped.
Therefore peak memory usage could be lower in some cases.

As for the brackets required on the RHS, I think it makes sense to be
consistent: either require brackets for anonymous functions the
same way we require them for function names, or require them for
neither.

Best,
Jan

On Sat, Dec 5, 2020 at 8:10 PM  wrote:
>
> We went back and forth on this several times. The key advantage of
> requiring parentheses is to keep things simple and consistent.  Let's
> get some experience with that. If experience shows requiring
> parentheses creates too many issues then we can add the option of
> dropping them later (with special handling of :: and :::). It's easier
> to add flexibility and complexity than to restrict it after the fact.
>
> Best,
>
> luke
>
> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
>
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> >>
> >> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> >> wrote:
> >>>
> >>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >   Error: function '::' not supported in RHS call of a pipe
> 
>  To me, this error looks much more friendly than magrittr's error.
>  Some of them got too used to specify functions without (). This
>  is OK until they use `::`, but when they need to use it, it takes
>  hours to figure out why
> 
>  mtcars %>% base::head
>  #> Error in .::base : unused argument (head)
> 
>  won't work but
> 
>  mtcars %>% head
> 
>  works. I think this is a too harsh lesson for ordinary R users to
>  learn `::` is a function. I've been wanting for magrittr to drop the
>  support for a function name without () to avoid this confusion,
>  so I would very much welcome the new pipe operator's behavior.
>  Thank you all the developers who implemented this!
> >>>
> >>> I agree, it's an improvement on the corresponding magrittr error.
> >>>
> >>> I think the semantics of not evaluating the RHS, but treating the pipe
> >>> as purely syntactical is a good decision.
> >>>
> >>> I'm not sure I like the recommended way to pipe into a particular 
> >>> argument:
> >>>
> >>>    mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> >>>
> >>> or
> >>>
> >>>    mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> >>>
> >>> both of which are equivalent to
> >>>
> >>>    mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> >>>
> >>> It's tempting to suggest it should allow something like
> >>>
> >>>    mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> >>
> >> Which is really not that far off from
> >>
> >> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> >>
> >> once you get used to it.
> >>
> >> One consequence of the implementation is that it's not clear how
> >> multiple occurrences of the placeholder would be interpreted. With
> >> magrittr,
> >>
> >> sort(runif(10)) %>% ecdf(.)(.)
> >> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> >>
> >> This is probably what you would expect, if you expect it to work at all, 
> >> and not
> >>
> >> ecdf(sort(runif(10)))(sort(runif(10)))
> >>
> >> There would be no such ambiguity with anonymous functions
> >>
> >> sort(runif(10)) |> \(.) ecdf(.)(.)
> >>
> >> -Deepayan
> >>
> >>> which would be expanded to something equivalent to the other versions:
> >>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
> >>> be used instead of ., since those are not legal variable names.)
> >>>
> >>> I don't think there should be an attempt to copy magrittr's special
> >>> casing of how . is used in determining whether to also include the
> >>> previous value as first argument.
> >>>
> >>> Duncan Murdoch
> >>>
> >>>
> 
>  Best,
>  Hiroaki Yutani
> 
>  On Fri, Dec 4, 2020 at 20:51, Duncan Murdoch wrote:
> >>>
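
For readers who want to try the forms quoted above, here is a minimal sketch, assuming R 4.1.0 or later (where `|>` is available in a released version). As far as I know, in those releases the `::` case is accepted as long as the RHS is written as a call, and an anonymous function on the RHS has to be wrapped in parentheses and called; the bare `\(d) ...` form discussed in this thread belongs to the development branch of the time and may not parse in your version. All variable names are illustrative only.

```r
# Assumes R >= 4.1.0 (native pipe and \(x) lambda syntax).

# A namespaced function works when it is written as a call.
h <- mtcars |> base::head()

# The pipe is a parse-time rewrite, so both expressions are the same call.
same_call <- identical(quote(mtcars |> base::head()),
                       quote(base::head(mtcars)))

# Piping into an argument other than the first: wrap a lambda and call it.
fit <- mtcars |>
  subset(cyl == 4) |>
  (\(d) lm(mpg ~ disp, data = d))()

# With a lambda, the piped value is bound once and can be reused.
set.seed(1)
p <- sort(runif(10)) |> (\(.) ecdf(.)(.))()
p
```

Because the lambda binds the left-hand side to a single argument, `p` is the ECDF evaluated at its own sample points rather than at a second random draw.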

[Rd] as.POSIXct.numeric change default of origin argument

2020-12-06 Thread Jan Gorecki
Hello all,

I would like to propose changing the default value of the "origin"
argument in the as.POSIXct.numeric method from missing (the current
behaviour) to "1970-01-01".
My proposal is motivated by the fact that this is the most commonly
needed value for the "origin" argument, so having it as the default seems
reasonable.
The proposed change seems pretty safe, because calling the method without
"origin" is currently an error.

Best Regards,
Jan Gorecki
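
A minimal sketch of the call in question, with the origin spelled out explicitly. The timestamp is just an illustrative number of seconds since the Unix epoch, and the variable names are assumptions made for this example.

```r
# Seconds since 1970-01-01 00:00:00 UTC (illustrative value).
secs <- 1607212800

# Explicit origin; the time zone is fixed so the result does not
# depend on the local settings of the machine.
stamp <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
format(stamp, "%Y-%m-%d %H:%M:%S")
# [1] "2020-12-06 00:00:00"
```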

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.POSIXct.numeric change default of origin argument

2020-12-06 Thread Spencer Graves
The fda package already includes as.POSIXct1970, which also sets
tz="GMT" by default.

I made the equivalent thing for as.Date available as
"Ecfun::as.Date1970".

If the Core R team doesn't want to make the change for the existing
functions, they might consider adding alternatives like this.  And, of
course, Jan Gorecki and others can use these (if they aren't already
using them or something equivalent).



  sg
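
For those who prefer not to add a package dependency, wrappers of this kind are short to write. The sketch below is not the code from fda or Ecfun; the names `as_posixct_1970` and `as_date_1970` are hypothetical and only illustrate the idea of fixing the origin (and, for date-times, the time zone) in one place.

```r
# Hypothetical helpers, not the fda/Ecfun implementations.
as_posixct_1970 <- function(x, tz = "GMT", ...) {
  # Interpret x as seconds since the Unix epoch.
  as.POSIXct(x, origin = "1970-01-01", tz = tz, ...)
}

as_date_1970 <- function(x, ...) {
  # Interpret x as days since the Unix epoch.
  as.Date(x, origin = "1970-01-01", ...)
}

format(as_posixct_1970(0), "%Y-%m-%d %H:%M:%S")
format(as_date_1970(1))
```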


On 2020-12-06 05:04, Jan Gorecki wrote:

> Hello all,
>
> I would like to propose to change the default value for "origin"
> argument in as.POSIXct.numeric method, from current missing to new
> "1970-01-01".
> My proposal is motivated by the fact that this is the most commonly
> needed value for "origin" argument and having that as a default seems
> reasonable.
> Proposed change seems to be pretty safe because it is now an error.
>
> Best Regards,
> Jan Gorecki



Re: [Rd] as.POSIXct.numeric change default of origin argument

2020-12-06 Thread Achim Zeileis

On Sun, 6 Dec 2020, Jan Gorecki wrote:


> Hello all,
>
> I would like to propose to change the default value for "origin"
> argument in as.POSIXct.numeric method, from current missing to new
> "1970-01-01".
> My proposal is motivated by the fact that this is the most commonly
> needed value for "origin" argument and having that as a default seems
> reasonable.
> Proposed change seems to be pretty safe because it is now an error.


I would also be in favor of this (and have been for years), mostly to make 
it consistent with the as.numeric() method. Same for "Date".


To support the latter, the "zoo" package provides a separate as.Date()
generic that enables an as.Date.numeric() method with a different default.


The main argument of R Core against it is that it is too uncertain whether 
the origin is really 1970-01-01, e.g., when importing from Excel or SAS.


Best wishes,
Z
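
To make the uncertainty about the origin concrete, here is a small sketch in which the same number is read under three common conventions. The origins used for Excel (on Windows) and SAS are the usual ones as far as I know; the serial number and variable names are illustrative.

```r
serial <- 35981

d_excel <- as.Date(serial, origin = "1899-12-30")  # Excel (Windows) convention
d_sas   <- as.Date(serial, origin = "1960-01-01")  # SAS convention
d_unix  <- as.Date(serial, origin = "1970-01-01")  # Unix epoch

format(d_excel)
# [1] "1998-07-05"

# The Excel and Unix readings of the same number are decades apart.
as.numeric(d_unix - d_excel)
# [1] 25569
```

A silent default of 1970-01-01 would return a plausible-looking but wrong date for data imported from such sources, which is the risk described above.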




Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Dénes Tóth

Dear Luke,

In the meantime I checked the R-syntax branch and the docs; they are 
very helpful. I would also like to thank you for putting effort into 
this feature. Keeping it at the syntax level is also a very smart 
decision. However, the current API might not exploit the full power of 
the basic idea.


1) Requiring either an anonymous function or a function call, but not 
allowing for symbols which point to functions is inconsistent and will 
be misleading for non-experts.


foo <- function(x) x
identical(foo, function(x) x)

mtcars |> foo   #bang!
mtcars |> function(x) x #fine?

You stated in :
"
Another variation supported by the implementation is that a symbol on
the RHS is interpreted as the name of a function to call with the LHS
as argument:

```r
> quote(x |> f)
f(x)
```
"

So clearly this is not an implementation issue but a design decision.

As a remedy, two different pipe operators could be introduced:

LHS |> RHS    -> RHS is treated as a function call
LHS |>> RHS   -> RHS is treated as a function

If |>> is used, it would not matter which notation is used for the RHS 
expression; the parser would assume it evaluates to a function.


2) Simplified lambda expression:
IMHO in the vast majority of use cases, this is used for single-argument
functions, so parentheses would not be required. Hence, both forms would
be valid and equivalent:


\x x + 1
\(x) x + 1


3) Function composition:
Allowing for concise composition of functions would be a great feature. 
E.g., instead of


foo <- function(x) print(mean(sqrt(x), na.rm = TRUE), digits = 2)

or

foo <- \x {x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)}

one could write

foo <- \x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)

So basically if the lambda argument is followed by a pipe operator, the 
pipe chain is transformed to a function body where the first lambda 
argument is inserted into the first position of the pipeline.
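
For comparison, a sketch of how close one can already get with the parenthesised lambda, assuming R 4.1.0 or later. `round()` stands in for `print()` so that the result can be inspected, and `compose()` is a hypothetical helper written for this example, not an existing base function.

```r
# Assumes R >= 4.1.0.

# The lambda body extends over the whole pipeline.
root_mean <- \(x) x |> sqrt() |> mean(na.rm = TRUE) |> round(digits = 2)

# The same function written with nested calls.
root_mean_nested <- function(x) round(mean(sqrt(x), na.rm = TRUE), digits = 2)

# Hypothetical helper: compose functions left to right.
compose <- function(...) {
  fs <- list(...)
  function(x) Reduce(function(acc, f) f(acc), fs, x)
}
root_mean_composed <- compose(
  sqrt,
  \(v) mean(v, na.rm = TRUE),
  \(v) round(v, digits = 2)
)

x <- c(1, 4, 9, NA)
root_mean(x)
# [1] 2
```

All three definitions give the same value for `x`; the proposal would only remove the repeated argument name from the first one.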



Best,
Denes



Re: [Rd] as.POSIXct.numeric change default of origin argument

2020-12-06 Thread Gabor Grothendieck
For example, this works:

  library(zoo)
  as.Date(0)
  ## [1] "1970-01-01"

On Sun, Dec 6, 2020 at 7:10 AM Achim Zeileis  wrote:
>
> On Sun, 6 Dec 2020, Jan Gorecki wrote:
>
> > Hello all,
> >
> > I would like to propose to change the default value for "origin"
> > argument in as.POSIXct.numeric method, from current missing to new
> > "1970-01-01".
> > My proposal is motivated by the fact that this is the most commonly
> > needed value for "origin" argument and having that as a default seems
> > reasonable.
> > Proposed change seems to be pretty safe because it is now an error.
>
> I would also be in favor of this (and have been for years), mostly to make
> it consistent with the as.numeric() method. Same for "Date".
>
> To support the latter, the "zoo" package provides a separate as.Date()
> generic that enables the as.Date.numeric() with different default.
>
> The main argument of R Core against it is that it is too uncertain whether
> the origin is really 1970-01-01, e.g., when importing from Excel or SAS.
>
> Best wishes,
> Z
>



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] installing R-devel on Windows

2020-12-06 Thread Gabor Grothendieck
I tried it from another computer and it did work.  Is there some way
of installing R devel using the analog of the R --vanilla flag so that I can
do it in a reproducible manner?  It seems to remember prior settings,
and maybe that is a problem, although one would not expect a setting
that could lead to what occurs.  I don't see anything documenting flags
on the Rtools40 page.

On Sat, Dec 5, 2020 at 9:52 AM Gabor Grothendieck
 wrote:
>
> I clicked on the download link at
> https://cran.r-project.org/bin/windows/base/rdevel.html
> and then opened the downloaded file which starts the installation process.
> I specified a new directory that does not exist, R-test, to be sure that
> it would not get confused with an old directory.
>
> I repeated this using different directories and on different days.
>
> I tried it from a user and an Admin account.
>
> If I use the exact same procedure to install R-4.0.3patched it works.
>
> I have successfully downloaded and installed R maybe hundreds
> of times over the last 10 to 20 years and have never before
> encountered this.
>
>
>
>
>
> On Sat, Dec 5, 2020 at 9:13 AM Jeroen Ooms  wrote:
> >
> > On Sat, Dec 5, 2020 at 3:00 PM Gabor Grothendieck
> >  wrote:
> > >
> > > > When I try to install r-devel on Windows all I get is this.  No other
> > > > files.  This also occurred yesterday.
> >
> > I just tested it to be sure, but it works fine for me. Are you using
> > the official installer from
> > https://cran.r-project.org/bin/windows/base/rdevel.html ?
> >
> > The default install path is not R-test but C:\Program Files\R\R-devel.
> > Perhaps you have old files lingering from previous installations that
> > cause permission problems during the installation process?



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] installing R-devel on Windows

2020-12-06 Thread Gabor Grothendieck
I meant on the R devel download page. (I was just installing Rtools40
on another computer.)

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] New pipe operator

2020-12-06 Thread Avi Gross via R-devel
Naming is another whole topic.

I have seen suggestions that the current pipeline symbol be read aloud as
"THEN", so

data %>% f1 %>% f2()

would be said as something like:
take data then apply f1 then f2

or some variants.

There are words other than pipe or pipeline that might also work such as 
"assembly line" or "conveyor belt" that might fit some kinds of pipelining 
better than others. My original exposure to UNIX in the early 80's used a 
pipeline of multiple processes whose standard input and/or standard output (and 
sometimes also standard error) were redirected to an anonymous "pipe" device 
that buffered whatever (usually) text that was thrown at it and the processes 
reading and writing from it were paused and restarted as needed when data was 
ready. Problems often could be decomposed into multiple parts that had a 
solution using some program and it was not unusual to do something like:

cat *.c | grep -v ... | grep ... | sed ... | cut ... >output

Of course something like the above was often rewritten to be done within a 
single awk script or perl or whatever. You could view the above though from the 
perspective of "data" in some form, often text, being passed from one 
function(ality) to another and changing a bit each step of the way. A very
common use of this form of pipeline was to process text written in a
different language and embedded in typesetting input:

tbl filename | eqn | pic | troff | ...

The above would open a file, pass through all lines except those between 
markers that specified a table starting and ending. Those lines would be 
processed and transformed into the troff language equivalent. The old plus new 
lines now went to eqn which found and transformed equations similarly then to 
pic which transformed instructions it knew to image descriptions in troff and 
finally troff processed the whole mess and then off to the printer.

Clearly the above can be seen as a data pipeline using full processes as nodes.

The way R is using the pipeline may just use functions but you can imagine it 
as having similarities and differences. Current implementations may be linear 
with lazy evaluation and with every part running to completion before the next 
part starts. Every "object" is fully made, then used, then often removed as a 
temporary object. There is no buffering. But in principle, you can make 
UNIX-like pipelines using parallelism within a process too. 
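
One way to see how lazy evaluation plays out in such a chain is to record when each function body actually runs. This is only a sketch, assuming R 4.1.0 or later; `f`, `g`, `note` and the event log are made up for the illustration.

```r
# Assumes R >= 4.1.0. All names here are illustrative.
events <- character()
note <- function(msg) events <<- c(events, msg)

f <- function(x) { note("f runs"); x + 1 }
g <- function(x) {
  note("g starts")
  y <- x * 2        # the argument (the call to f) is forced here
  note("g ends")
  y
}

result <- 1 |> f() |> g()   # parsed as g(f(1))
events
# [1] "g starts" "f runs"   "g ends"
```

So the last stage is entered first and the earlier stage runs only when its value is needed; there is still no buffering or concurrency, since each value is fully computed before it is used.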

Would there be scenarios where phrases like "assembly line" or "conveyor belt" 
make sense to describe the method properly? The word pipe suggests a linearity 
to some whereas conveyor belts these days also can be used to selectively shunt 
things one way or another as in assembling all parts of your order from 
different parts of a warehouse and arranging they all end up in the same 
delivery area. Making applications do that dynamically may have other names. 
Think flowchart!

Time to go do something useful.

-----Original Message-----
From: R-devel  On Behalf Of Hiroaki Yutani
Sent: Saturday, December 5, 2020 10:29 PM
To: Abby Spurdle 
Cc: r-devel 
Subject: Re: [Rd] New pipe operator

It is common practice to call |> a pipe (or pipeline operator) in many
languages, including ones that recently introduced it as an experimental feature.
Pipelines are a common feature of functional programming, not just of "data pipelines."

F#: 
https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/symbol-and-operator-reference/
Elixir: https://hexdocs.pm/elixir/operators.html#general-operators
Typescript:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Pipeline_operator
Ruby: https://bugs.ruby-lang.org/issues/15799

(This blog post about the history of pipe operator might be
interesting: 
https://mamememo.blogspot.com/2019/06/a-brief-history-of-pipeline-operator.html
)

I agree this is a bit confusing for those who are familiar with other "pipe" 
concepts, but there's no other appropriate term to call |>.

On Sun, Dec 6, 2020 at 12:22, Gregory Warnes wrote:
>
> If we’re being mathematically pedantic, the “pipe” operator is 
> actually function composition.
>
> That being said, pipes are a simple and well-known idiom. While being less
> than mathematically exact, it seems a reasonable label for the (very
> useful) behavior.
>
> On Sat, Dec 5, 2020 at 9:43 PM Abby Spurdle  wrote:
>
> > > This is a good addition
> >
> > I can't understand why so many people are calling this a "pipe".
> > Pipes connect processes, via their I/O streams.
> > Arguably, a more general interpretation would include sockets and files.
> >
> > https://en.wikipedia.org/wiki/Pipeline_(Unix)
> > https://en.wikipedia.org/wiki/Named_pipe
> > https://en.wikipedia.org/wiki/Anonymous_pipe
> >
> > As far as I can tell, the magrittr-like operators are functions (not 
> > pipes), with nonstandard syntax.
> > This is not consistent with R's original design philosophy, building 
> > on C, Lisp and S, along with lots of *import

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Duncan Murdoch

On 06/12/2020 9:43 a.m., Dénes Tóth wrote:

> Dear Luke,
>
> In the meantime I checked the R-syntax branch and the docs; they are
> very helpful. I would also like to thank you for putting effort into
> this feature. Keeping it at the syntax level is also a very smart
> decision. However, the current API might not exploit the full power of
> the basic idea.
>
> 1) Requiring either an anonymous function or a function call, but not
> allowing for symbols which point to functions is inconsistent and will
> be misleading for non-experts.
>
> foo <- function(x) x
> identical(foo, function(x) x)
>
> mtcars |> foo   #bang!
> mtcars |> function(x) x #fine?


You are missing the point.  The value of the RHS is irrelevant to the 
transformation.  All that matters is its form.  So "foo" and 
"function(x) x" are completely different things, even if identical() 
thinks their value is the same.


It's also true that "foo()" and "function(x) x" are completely 
different, but they are well-defined forms:  one is a call, the other is 
an anonymous function definition.


Accepting a plain "foo" would add a third form (a name), which might 
make sense, but hardly gains anything: whereas dropping the anonymous 
function definition costs quite a bit.  Without special-casing anonymous 
function definitions you'd need to enter


mtcars |> (function(x) x)()

or

mtcars |> (\(x) x)()

which are both quite difficult to read.

Duncan Murdoch
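
The point that only the form of the RHS matters can be checked directly, since the rewrite happens when the code is parsed. A minimal sketch, assuming R 4.1.0 or later; in the released versions I am aware of, a bare symbol on the RHS is rejected by the parser, which is what the `try()` call below probes. The names `f`, `x` and the result variables are illustrative.

```r
# Assumes R >= 4.1.0.

# A call on the RHS is rewritten by the parser; nothing is evaluated,
# so neither x nor f has to exist.
rewritten <- deparse(quote(x |> f()))
rewritten
# [1] "f(x)"

# A bare symbol is a different form and is not accepted as the RHS.
bare <- try(parse(text = "x |> f"), silent = TRUE)
bare_is_error <- inherits(bare, "try-error")

# The parenthesised anonymous function, called explicitly, is a call again.
out <- mtcars |> (\(d) d)()
```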




Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Dénes Tóth




On 12/6/20 4:32 PM, Duncan Murdoch wrote:
> On 06/12/2020 9:43 a.m., Dénes Tóth wrote:
>> Dear Luke,
>>
>> In the meantime I checked the R-syntax branch and the docs; they are
>> very helpful. I would also like to thank you for putting effort into
>> this feature. Keeping it at the syntax level is also a very smart
>> decision. However, the current API might not exploit the full power of
>> the basic idea.
>>
>> 1) Requiring either an anonymous function or a function call, but not
>> allowing for symbols which point to functions is inconsistent and will
>> be misleading for non-experts.
>>
>> foo <- function(x) x
>> identical(foo, function(x) x)
>>
>> mtcars |> foo   #bang!
>> mtcars |> function(x) x #fine?
>
> You are missing the point.  The value of the RHS is irrelevant to the
> transformation.  All that matters is its form.  So "foo" and
> "function(x) x" are completely different things, even if identical()
> thinks their value is the same.

We are at the syntax level, so of course we do not know the value of the 
RHS when the parsing occurs. I *do* understand that the *form* is 
important here, but how do you explain this to a rookie R user? He will 
see that he entered two expressions which he thinks are identical, even 
though they are not identical at the level when the parsing occurs.


Also think of the potential users of this syntax. There are at least two 
groups:
1) ~95% of the users: active users of `%>%`. My experience is that the
vast majority of them do not use the "advanced" features of magrittr;
however, they have got used to things like mtcars |> print. Provide them
with the RHS-as-symbol syntax and they will be happy - they have a
plug-and-forget replacement. Or enforce a function call - they will
be unhappy, and will not adopt the new syntax.
2) ~5% of the users (including me): have not used magrittr or any other
(probably better) implementations (e.g., pipeR, wrapr) of the pipe
operator because it could lead to nasty performance issues, bugs, and
debugging problems. However, from a functional-programming point of
view, these users might prefer the new syntax and as little typing as
possible.


>
> It's also true that "foo()" and "function(x) x" are completely
> different, but they are well-defined forms:  one is a call, the other is
> an anonymous function definition.
>
> Accepting a plain "foo" would add a third form (a name), which might
> make sense, but hardly gains anything:

I would reverse the argument: Luke has a working implementation for
the case where the RHS is a single symbol. What do we lose if we keep it?


Best,
Denes

> whereas dropping the anonymous
> function definition costs quite a bit.  Without special-casing anonymous
> function definitions you'd need to enter
>
> mtcars |> (function(x) x)()
>
> or
>
> mtcars |> (\(x) x)()
>
> which are both quite difficult to read.
>
> Duncan Murdoch
>
>>
>> You stated in :
>> "
>> Another variation supported by the implementation is that a symbol on
>> the RHS is interpreted as the name of a function to call with the LHS
>> as argument:
>>
>> ```r
>>   > quote(x |> f)
>> f(x)
>> ```
>> "
>>
>> So clearly this is not an implementation issue but a design decision.
>>
>> As a remedy, two different pipe operators could be introduced:
>>
>> LHS |> RHS    -> RHS is treated as a function call
>> LHS |>> RHS   -> RHS is treated as a function
>>
>> If |>> is used, it would not matter which notation is used for the RHS
>> expression; the parser would assume it evaluates to a function.
>>
>> 2) Simplified lambda expression:
>> IMHO in the vast majority of use cases, this is used for single-argument
>> functions, so parenthesis would not be required. Hence, both forms would
>> be valid and equivalent:
>>
>> \x x + 1
>> \(x) x + 1
>>
>>
>> 3) Function composition:
>> Allowing for concise composition of functions would be a great feature.
>> E.g., instead of
>>
>> foo <- function(x) print(mean(sqrt(x), na.rm = TRUE), digits = 2)
>>
>> or
>>
>> foo <- \x {x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)}
>>
>> one could write
>>
>> foo <- \x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)
>>
>> So basically if the lambda argument is followed by a pipe operator, the
>> pipe chain is transformed to a function body where the first lambda
>> argument is inserted into the first position of the pipeline.
>>
>>
>> Best,
>> Denes
>>
>>
>> On 12/5/20 7:10 PM, luke-tier...@uiowa.edu wrote:
>>> We went back and forth on this several times. The key advantage of
>>> requiring parentheses is to keep things simple and consistent.  Let's
>>> get some experience with that. If experience shows requiring
>>> parentheses creates too many issues then we can add the option of
>>> dropping them later (with special handling of :: and :::). It's easier
>>> to add flexibility and complexity than to restrict it after the fact.
>>>
>>> Best,
>>>
>>> luke
>>>
>>> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
>>>
 I'm surpri

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Duncan Murdoch

On 06/12/2020 11:34 a.m., Dénes Tóth wrote:



On 12/6/20 4:32 PM, Duncan Murdoch wrote:
  > On 06/12/2020 9:43 a.m., Dénes Tóth wrote:
  >> Dear Luke,
  >>
  >> In the meantime I checked the R-syntax branch and the docs; they are
  >> very helpful. I would also like to thank you for putting effort into
  >> this feature. Keeping it at the syntax level is also a very smart
  >> decision. However, the current API might not exploit the full power of
  >> the basic idea.
  >>
  >> 1) Requiring either an anonymous function or a function call, but not
  >> allowing for symbols which point to functions is inconsistent and will
  >> be misleading for non-experts.
  >>
  >> foo <- function(x) x
  >> identical(foo, function(x) x)
  >>
  >> mtcars |> foo   #bang!
  >> mtcars |> function(x) x #fine?
  >
  > You are missing the point.  The value of the RHS is irrelevant to the
  > transformation.  All that matters is its form.  So "foo" and
  > "function(x) x" are completely different things, even if identical()
  > thinks their value is the same.

We are at the syntax level, so of course we do not know the value of the
RHS when the parsing occurs. I *do* understand that the *form* is
important here, but how do you explain this to a rookie R user? 


I would explain that you almost always need the parens.  Rookies don't 
need to learn about anonymous functions instantly.  Once they get to the 
point of learning about anonymous functions, I'd say they are the sole 
exception to needing parens.


He will see that he entered two expressions which he thinks are identical,
even though they are not identical at the level when the parsing occurs.


Allowing a name will confuse people who think stats::sd is a name of a 
function in the stats package.  stats::sd() works with the current 
design, allowing stats::sd will add another special case (and then of 
course you'd need stats:::sd, object$method, etc.)
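
The `stats::sd` case can be sketched as follows, assuming R 4.1.0 or later. The vector `x` is made up for the illustration; the parenthesis-free form is parsed from a string so that the expected parse error is captured rather than stopping the script.

```r
# Requires R >= 4.1.0.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# A namespaced call with parentheses is an ordinary function call on the RHS.
via_pipe <- x |> stats::sd()
identical(via_pipe, stats::sd(x))

# Without parentheses the RHS is a call to `::` itself, which the parser
# rejects; the error message is captured as a character string.
no_parens <- tryCatch(
  parse(text = "x |> stats::sd"),
  error = function(e) conditionMessage(e)
)
is.character(no_parens)
```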


Duncan Murdoch



Also think of the potential users of this syntax. There are at least two
groups:
1) ~95% of the users: active users of `%>%`. My experience is that the
vast majority of them do not use the "advanced" features of magrittr;
however, they have got used to things like mtcars |> print. Provide them
with the RHS-as-symbol syntax and they will be happy - they have a
plug-and-forget replacement. Enforce a function call instead and they
will be unhappy, and will not adopt the new syntax.
2) ~5% of the users (including me): have not used magrittr or any other
(probably better) implementations (e.g., pipeR, wrapr) of the pipe
operator because it could lead to nasty performance issues, bugs, and
debugging problems. However, from a functional-programming point of
view, these users might prefer the new syntax and as little typing as
possible.

  >
  > It's also true that "foo()" and "function(x) x" are completely
  > different, but they are well-defined forms:  one is a call, the other is
  > an anonymous function definition.
  >
  > Accepting a plain "foo" would add a third form (a name), which might
  > make sense, but hardly gains anything:

I would reverse the argument: Luke has a working implementation for
the case where the RHS is a single symbol. What do we lose if we keep it?

Best,
Denes

  > whereas dropping the anonymous
  > function definition costs quite a bit.  Without special-casing anonymous
  > function definitions you'd need to enter
  >
  > mtcars |> (function(x) x)()
  >
  > or
  >
  > mtcars |> (\(x) x)()
  >
  > which are both quite difficult to read.
  >
  > Duncan Murdoch
  >
  >>
  >> You stated in :
  >> "
  >> Another variation supported by the implementation is that a symbol on
  >> the RHS is interpreted as the name of a function to call with the LHS
  >> as argument:
  >>
  >> ```r
  >>   > quote(x |> f)
  >> f(x)
  >> ```
  >> "
  >>
  >> So clearly this is not an implementation issue but a design decision.
  >>
  >> As a remedy, two different pipe operators could be introduced:
  >>
  >> LHS |> RHS    -> RHS is treated as a function call
  >> LHS |>> RHS   -> RHS is treated as a function
  >>
  >> If |>> is used, it would not matter which notation is used for the RHS
  >> expression; the parser would assume it evaluates to a function.
  >>
  >> 2) Simplified lambda expression:
  >> IMHO in the vast majority of use cases, this is used for single-argument
  >> functions, so parenthesis would not be required. Hence, both forms would
  >> be valid and equivalent:
  >>
  >> \x x + 1
  >> \(x) x + 1
  >>
  >>
  >> 3) Function composition:
  >> Allowing for concise composition of functions would be a great feature.
  >> E.g., instead of
  >>
  >> foo <- function(x) print(mean(sqrt(x), na.rm = TRUE), digits = 2)
  >>
  >> or
  >>
  >> foo <- \x {x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)}
  >>
  >> one could write
  >>
  >> foo <- \x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)
  >>
  >> So basically if the lambda argument is followed by a pipe operator, the
 

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Gabor Grothendieck
The following gives an error.

   1 |> `+`(2)
   ## Error: function '+' is not supported in RHS call of a pipe

   1 |> `+`()
   ## Error: function '+' is not supported in RHS call of a pipe

but this does work:

   1 |> (`+`)(2)
   ## [1] 3

   1 |> (`+`)()
   ## [1] 1

The error message suggests that this was intentional.
It isn't mentioned in ?"|>"
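
The four cases above can be reproduced in one script. This is a minimal sketch, assuming R 4.1.0 or later; `try_pipe` is a hypothetical helper introduced here, and it parses each snippet from text so that a parse error comes back as the string "error" instead of aborting.

```r
# Requires R >= 4.1.0. Each snippet is parsed from text so that a parse
# error is captured as a value rather than stopping the script.
try_pipe <- function(code) {
  tryCatch(eval(parse(text = code)), error = function(e) "error")
}

try_pipe("1 |> `+`(2)")     # special function named directly on the RHS
try_pipe("1 |> (`+`)(2)")   # same function wrapped in parentheses
try_pipe("1 |> (`+`)()")    # unary plus applied to the LHS
```

The wrapped form presumably gets through because the function position then holds a call to `(` rather than the special symbol itself.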

On Sat, Dec 5, 2020 at 1:19 PM  wrote:
>
> We went back and forth on this several times. The key advantage of
> requiring parentheses is to keep things simple and consistent.  Let's
> get some experience with that. If experience shows requiring
> parentheses creates too many issues then we can add the option of
> dropping them later (with special handling of :: and :::). It's easier
> to add flexibility and complexity than to restrict it after the fact.
>
> Best,
>
> luke
>
> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
>
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> >>
> >> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> >> wrote:
> >>>
> >>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >   Error: function '::' not supported in RHS call of a pipe
> 
>  To me, this error looks much more friendly than magrittr's error.
>  Some of them got too used to specify functions without (). This
>  is OK until they use `::`, but when they need to use it, it takes
>  hours to figure out why
> 
>  mtcars %>% base::head
>  #> Error in .::base : unused argument (head)
> 
>  won't work but
> 
>  mtcars %>% head
> 
>  works. I think this is a too harsh lesson for ordinary R users to
>  learn `::` is a function. I've been wanting for magrittr to drop the
>  support for a function name without () to avoid this confusion,
>  so I would very much welcome the new pipe operator's behavior.
>  Thank you all the developers who implemented this!
> >>>
> >>> I agree, it's an improvement on the corresponding magrittr error.
> >>>
> >>> I think the semantics of not evaluating the RHS, but treating the pipe
> >>> as purely syntactical is a good decision.
> >>>
> >>> I'm not sure I like the recommended way to pipe into a particular 
> >>> argument:
> >>>
> >>>mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> >>>
> >>> or
> >>>
> >>>mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> >>>
> >>> both of which are equivalent to
> >>>
> >>>mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> >>>
> >>> It's tempting to suggest it should allow something like
> >>>
> >>>mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> >>
> >> Which is really not that far off from
> >>
> >> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> >>
> >> once you get used to it.
> >>
> >> One consequence of the implementation is that it's not clear how
> >> multiple occurrences of the placeholder would be interpreted. With
> >> magrittr,
> >>
> >> sort(runif(10)) %>% ecdf(.)(.)
> >> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> >>
> >> This is probably what you would expect, if you expect it to work at all, 
> >> and not
> >>
> >> ecdf(sort(runif(10)))(sort(runif(10)))
> >>
> >> There would be no such ambiguity with anonymous functions
> >>
> >> sort(runif(10)) |> \(.) ecdf(.)(.)
> >>
> >> -Deepayan
> >>
> >>> which would be expanded to something equivalent to the other versions:
> >>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
> >>> be used instead of ., since those are not legal variable names.)
> >>>
> >>> I don't think there should be an attempt to copy magrittr's special
> >>> casing of how . is used in determining whether to also include the
> >>> previous value as first argument.
> >>>
> >>> Duncan Murdoch
> >>>
> >>>
> 
>  Best,
>  Hiroaki Yutani
> 
>  2020年12月4日(金) 20:51 Duncan Murdoch :
> >
> > Just saw this on the R-devel news:
> >
> >
> > R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
> > notation for creating functions, e.g

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread luke-tierney

On Sun, 6 Dec 2020, Gabor Grothendieck wrote:


The following gives an error.

  1 |> `+`(2)
  ## Error: function '+' is not supported in RHS call of a pipe

  1 |> `+`()
  ## Error: function '+' is not supported in RHS call of a pipe

but this does work:

  1 |> (`+`)(2)
  ## [1] 3

  1 |> (`+`)()
  ## [1] 1

The error message suggests that this was intentional.
It isn't mentioned in ?"|>"


?"|>" says:

 To avoid ambiguities, functions in ‘rhs’ calls may not
 be syntactically special, such as ‘+’ or ‘if’.

(used to say lhs; fixed now).

Best,

luke



On Sat, Dec 5, 2020 at 1:19 PM  wrote:


We went back and forth on this several times. The key advantage of
requiring parentheses is to keep things simple and consistent.  Let's
get some experience with that. If experience shows requiring
parentheses creates too many issues then we can add the option of
dropping them later (with special handling of :: and :::). It's easier
to add flexibility and complexity than to restrict it after the fact.

Best,

luke

On Sat, 5 Dec 2020, Hugh Parsonage wrote:


I'm surprised by the aversion to

mtcars |> nrow

over

mtcars |> nrow()

and I think the decision to disallow the former should be
reconsidered.  The pipe operator is only going to be used when the rhs
is a function, so there is no ambiguity with omitting the parentheses.
If it's disallowed, it becomes inconsistent with other treatments like
sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
noise.  I'm not sure why this decision was taken

If the only issue is with the double (and triple) colon operator, then
ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
-- in other words, demote the precedence of |>

Obviously (looking at the R-Syntax branch) this decision was
considered, put into place, then dropped, but I can't see why
precisely.

Best,


Hugh.







On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  wrote:


On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  wrote:


On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:

  Error: function '::' not supported in RHS call of a pipe


To me, this error looks much more friendly than magrittr's error.
Some of them got too used to specify functions without (). This
is OK until they use `::`, but when they need to use it, it takes
hours to figure out why

mtcars %>% base::head
#> Error in .::base : unused argument (head)

won't work but

mtcars %>% head

works. I think this is a too harsh lesson for ordinary R users to
learn `::` is a function. I've been wanting for magrittr to drop the
support for a function name without () to avoid this confusion,
so I would very much welcome the new pipe operator's behavior.
Thank you all the developers who implemented this!


I agree, it's an improvement on the corresponding magrittr error.

I think the semantics of not evaluating the RHS, but treating the pipe
as purely syntactical is a good decision.

I'm not sure I like the recommended way to pipe into a particular argument:

   mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)

or

   mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)

both of which are equivalent to

   mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()

It's tempting to suggest it should allow something like

   mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)


Which is really not that far off from

mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)

once you get used to it.

One consequence of the implementation is that it's not clear how
multiple occurrences of the placeholder would be interpreted. With
magrittr,

sort(runif(10)) %>% ecdf(.)(.)
## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

This is probably what you would expect, if you expect it to work at all, and not

ecdf(sort(runif(10)))(sort(runif(10)))

There would be no such ambiguity with anonymous functions

sort(runif(10)) |> \(.) ecdf(.)(.)

-Deepayan
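
The single-evaluation behaviour of the anonymous-function form can be sketched as follows. This assumes R 4.1.0 or later, where (as far as I know) an anonymous function on the RHS has to be wrapped in parentheses and called; the seed is arbitrary and only there to make the run repeatable.

```r
# Requires R >= 4.1.0.
set.seed(1)

# The anonymous function receives the sorted sample once and reuses it,
# so the ECDF is built from and evaluated at the same ten values.
res <- sort(runif(10)) |> (\(.) ecdf(.)(.))()
res
```

Because the LHS is evaluated only once, the result is the sequence 0.1, 0.2, ..., 1.0 for any sample of ten distinct values.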


which would be expanded to something equivalent to the other versions:
but that makes it quite a bit more complicated.  (Maybe _ or \. should
be used instead of ., since those are not legal variable names.)

I don't think there should be an attempt to copy magrittr's special
casing of how . is used in determining whether to also include the
previous value as first argument.

Duncan Murdoch




Best,
Hiroaki Yutani

2020年12月4日(金) 20:51 Duncan Murdoch :


Just saw this on the R-devel news:


R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
‘function(x) x + 1’. The pipe implementation as a syntax transformation
was motivated by suggestions from Jim Hester and Lionel Henry. These
features are experimental and may change prior to release.


This is a good addition; by using "|>" instead of "%>%" there should be
a chance to get operator precedence right.  That said, the ?Syntax help
topic hasn't been updated, so I'm not sure where it fits in.

There are some choices t

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Gabor Grothendieck
Why is that ambiguous?  It works in magrittr.

> library(magrittr)
> 1 %>% `+`()
[1] 1

On Sun, Dec 6, 2020 at 1:09 PM  wrote:
>
> On Sun, 6 Dec 2020, Gabor Grothendieck wrote:
>
> > The following gives an error.
> >
> >   1 |> `+`(2)
> >   ## Error: function '+' is not supported in RHS call of a pipe
> >
> >   1 |> `+`()
> >   ## Error: function '+' is not supported in RHS call of a pipe
> >
> > but this does work:
> >
> >   1 |> (`+`)(2)
> >   ## [1] 3
> >
> >   1 |> (`+`)()
> >   ## [1] 1
> >
> > The error message suggests that this was intentional.
> > It isn't mentioned in ?"|>"
>
> ?"|>" says:
>
>   To avoid ambiguities, functions in ‘rhs’ calls may not
>   be syntactically special, such as ‘+’ or ‘if’.
>
> (used to say lhs; fixed now).
>
> Best,
>
> luke
>
> >
> > On Sat, Dec 5, 2020 at 1:19 PM  wrote:
> >>
> >> We went back and forth on this several times. The key advantage of
> >> requiring parentheses is to keep things simple and consistent.  Let's
> >> get some experience with that. If experience shows requiring
> >> parentheses creates too many issues then we can add the option of
> >> dropping them later (with special handling of :: and :::). It's easier
> >> to add flexibility and complexity than to restrict it after the fact.
> >>
> >> Best,
> >>
> >> luke
> >>
> >> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
> >>
> >>> I'm surprised by the aversion to
> >>>
> >>> mtcars |> nrow
> >>>
> >>> over
> >>>
> >>> mtcars |> nrow()
> >>>
> >>> and I think the decision to disallow the former should be
> >>> reconsidered.  The pipe operator is only going to be used when the rhs
> >>> is a function, so there is no ambiguity with omitting the parentheses.
> >>> If it's disallowed, it becomes inconsistent with other treatments like
> >>> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> >>> noise.  I'm not sure why this decision was taken
> >>>
> >>> If the only issue is with the double (and triple) colon operator, then
> >>> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> >>> -- in other words, demote the precedence of |>
> >>>
> >>> Obviously (looking at the R-Syntax branch) this decision was
> >>> considered, put into place, then dropped, but I can't see why
> >>> precisely.
> >>>
> >>> Best,
> >>>
> >>>
> >>> Hugh.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> >>> wrote:
> 
>  On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
>  wrote:
> >
> > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >>>   Error: function '::' not supported in RHS call of a pipe
> >>
> >> To me, this error looks much more friendly than magrittr's error.
> >> Some of them got too used to specify functions without (). This
> >> is OK until they use `::`, but when they need to use it, it takes
> >> hours to figure out why
> >>
> >> mtcars %>% base::head
> >> #> Error in .::base : unused argument (head)
> >>
> >> won't work but
> >>
> >> mtcars %>% head
> >>
> >> works. I think this is a too harsh lesson for ordinary R users to
> >> learn `::` is a function. I've been wanting for magrittr to drop the
> >> support for a function name without () to avoid this confusion,
> >> so I would very much welcome the new pipe operator's behavior.
> >> Thank you all the developers who implemented this!
> >
> > I agree, it's an improvement on the corresponding magrittr error.
> >
> > I think the semantics of not evaluating the RHS, but treating the pipe
> > as purely syntactical is a good decision.
> >
> > I'm not sure I like the recommended way to pipe into a particular 
> > argument:
> >
> >mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> >
> > or
> >
> >mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> >
> > both of which are equivalent to
> >
> > >mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> >
> > It's tempting to suggest it should allow something like
> >
> >mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> 
>  Which is really not that far off from
> 
>  mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> 
>  once you get used to it.
> 
>  One consequence of the implementation is that it's not clear how
>  multiple occurrences of the placeholder would be interpreted. With
>  magrittr,
> 
>  sort(runif(10)) %>% ecdf(.)(.)
>  ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> 
>  This is probably what you would expect, if you expect it to work at all, 
>  and not
> 
>  ecdf(sort(runif(10)))(sort(runif(10)))
> 
>  There would be no such ambiguity with anonymous functions
> 
>  sort(runif(10)) |> \(.) ecdf(.)(.)
> 
>  -Deepayan
> 
> > which would be expanded to somet

Re: [Rd] New pipe operator

2020-12-06 Thread Avi Gross via R-devel
Topic is more about anonymous functions but also pipes.

Rui thought the proposed syntax was a bit ugly. I assume the \(x) ... form was
what he meant, not the function(x) ... version.

Many current languages have played games on adding some form of anonymous
function that is defined and used in place. Some go to great pains to make
various parts optional, to the point where there are many valid ways to create
a function that takes no arguments, so you can leave out almost everything else
as optional.

I admit having to type "lambda" all the time (in some languages)  is not 
preferable but in English, something shorter like fun(...) or func(...) instead 
of function(...) might be more readable than the weird choice of \(. Yes. You 
can view the combo to bring attention to the fact the "(" is meant not as any 
old paren for other uses but specifically for function invocation/definition 
purposes. But overuse of the backslash to mean other things such as in regular 
expressions and the parentheses for so many things, makes parsing for humans 
harder. So does "|>" for the new pipe symbol as it can also look like "or 
greater than" and since some humans do not insert spaces to make code even 
shorter, it can be a challenge to rapidly see a line of code as tokens.

If programming were being invented today with a larger set of symbols, it might
use more of them and perhaps look more like APL. We might have all of the
tokens built in to the language be single symbols, including real arrows
instead of -> and a not-equals symbol like ≠ instead of != or ~= as some
languages use. In that system, what might the pipe symbol look like?

ǂ

But although making things concise is nice, sometimes there is clarity in using 
enough room, to make things clear or we might as well code in binary.

-----Original Message-----
From: R-devel  On Behalf Of Rui Barradas
Sent: Sunday, December 6, 2020 2:51 AM
To: Gregory Warnes ; Abby Spurdle 
Cc: r-devel 
Subject: Re: [Rd] New pipe operator

Hello,

If Hilbert liked beer, I like "pipe".

More seriously, a new addition like this one is going to cause problems yet 
unknown. But it's a good idea to have a pipe operator available. As someone 
used to magrittr's data pipelines, I will play with this base one before making 
up my mind. I don't expect its behavior to be exactly like magrittr "%>%" (and 
it's not). For the moment all I can say is that it is something R users are 
used to and that it now avoids loading a package.
As for the new way to define anonymous functions, I am less sure. Too much
syntactic sugar? Or am I finding the syntax ugly?

Hope this helps,

Rui Barradas


Às 03:22 de 06/12/20, Gregory Warnes escreveu:
> If we’re being mathematically pedantic, the “pipe” operator is
> actually function composition.
> That being said, pipes are a simple and well-known idiom. While being less
> than mathematically exact, it seems a reasonable label for the (very
> useful) behavior.
> 
> On Sat, Dec 5, 2020 at 9:43 PM Abby Spurdle  wrote:
> 
>>> This is a good addition
>>
>> I can't understand why so many people are calling this a "pipe".
>> Pipes connect processes, via their I/O streams.
>> Arguably, a more general interpretation would include sockets and files.
>>
>> https://en.wikipedia.org/wiki/Pipeline_(Unix)
>> https://en.wikipedia.org/wiki/Named_pipe
>> https://en.wikipedia.org/wiki/Anonymous_pipe
>>
>> As far as I can tell, the magrittr-like operators are functions (not 
>> pipes), with nonstandard syntax.
>> This is not consistent with R's original design philosophy, building 
>> on C, Lisp and S, along with lots of *important* math and stats.
>>
>> It's possible that some parties are interested in creating a kind of 
>> "data pipeline".
>> I'm interested in this myself, and I think we could discuss this more.
>> But I'm not convinced the magrittr-like operators help to achieve 
>> this goal.
>> Which, in my opinion, would require one to model programs as directed 
>> graphs, along with some degree of asynchronous input.
>>
>> Presumably, these operators will be added to R anyway, and (almost) 
>> no one will listen to me.
>>
>> So, I would like to make one suggestion:
>> Is it possible for these operators to *not* be named:
>>  The R Pipe
>>  The S Pipe
>>  Or anything with a similar meaning.
>>
>> Maybe tidy pipe, or something else that links it to its proponents?
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>


Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread luke-tierney

On Sun, 6 Dec 2020, Gabor Grothendieck wrote:


Why is that ambiguous?  It works in magrittr.


For now, all functions marked internally as syntactically special are
disallowed. Not all of these lead to ambiguities.

Best,

luke




library(magrittr)
1 %>% `+`()

[1] 1

On Sun, Dec 6, 2020 at 1:09 PM  wrote:


On Sun, 6 Dec 2020, Gabor Grothendieck wrote:


The following gives an error.

  1 |> `+`(2)
  ## Error: function '+' is not supported in RHS call of a pipe

  1 |> `+`()
  ## Error: function '+' is not supported in RHS call of a pipe

but this does work:

  1 |> (`+`)(2)
  ## [1] 3

  1 |> (`+`)()
  ## [1] 1

The error message suggests that this was intentional.
It isn't mentioned in ?"|>"


?"|>" says:

  To avoid ambiguities, functions in ‘rhs’ calls may not
  be syntactically special, such as ‘+’ or ‘if’.

(used to say lhs; fixed now).

Best,

luke



On Sat, Dec 5, 2020 at 1:19 PM  wrote:


We went back and forth on this several times. The key advantage of
requiring parentheses is to keep things simple and consistent.  Let's
get some experience with that. If experience shows requiring
parentheses creates too many issues then we can add the option of
dropping them later (with special handling of :: and :::). It's easier
to add flexibility and complexity than to restrict it after the fact.

Best,

luke

On Sat, 5 Dec 2020, Hugh Parsonage wrote:


I'm surprised by the aversion to

mtcars |> nrow

over

mtcars |> nrow()

and I think the decision to disallow the former should be
reconsidered.  The pipe operator is only going to be used when the rhs
is a function, so there is no ambiguity with omitting the parentheses.
If it's disallowed, it becomes inconsistent with other treatments like
sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
noise.  I'm not sure why this decision was taken

If the only issue is with the double (and triple) colon operator, then
ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
-- in other words, demote the precedence of |>

Obviously (looking at the R-Syntax branch) this decision was
considered, put into place, then dropped, but I can't see why
precisely.

Best,


Hugh.







On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  wrote:


On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  wrote:


On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:

  Error: function '::' not supported in RHS call of a pipe


To me, this error looks much more friendly than magrittr's error.
Some of them got too used to specify functions without (). This
is OK until they use `::`, but when they need to use it, it takes
hours to figure out why

mtcars %>% base::head
#> Error in .::base : unused argument (head)

won't work but

mtcars %>% head

works. I think this is a too harsh lesson for ordinary R users to
learn `::` is a function. I've been wanting for magrittr to drop the
support for a function name without () to avoid this confusion,
so I would very much welcome the new pipe operator's behavior.
Thank you all the developers who implemented this!


I agree, it's an improvement on the corresponding magrittr error.

I think the semantics of not evaluating the RHS, but treating the pipe
as purely syntactical is a good decision.

I'm not sure I like the recommended way to pipe into a particular argument:

   mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)

or

   mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)

both of which are equivalent to

   mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()

It's tempting to suggest it should allow something like

   mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)


Which is really not that far off from

mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)

once you get used to it.

One consequence of the implementation is that it's not clear how
multiple occurrences of the placeholder would be interpreted. With
magrittr,

sort(runif(10)) %>% ecdf(.)(.)
## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

This is probably what you would expect, if you expect it to work at all, and not

ecdf(sort(runif(10)))(sort(runif(10)))

There would be no such ambiguity with anonymous functions

sort(runif(10)) |> \(.) ecdf(.)(.)

-Deepayan


which would be expanded to something equivalent to the other versions:
but that makes it quite a bit more complicated.  (Maybe _ or \. should
be used instead of ., since those are not legal variable names.)

I don't think there should be an attempt to copy magrittr's special
casing of how . is used in determining whether to also include the
previous value as first argument.

Duncan Murdoch




Best,
Hiroaki Yutani

2020年12月4日(金) 20:51 Duncan Murdoch :


Just saw this on the R-devel news:


R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
‘function(x) x + 1’. The pipe implementation as a syntax transformation
was motivated by suggestions from Jim Hester a

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Gabriel Becker
Hi Denes,

On Sun, Dec 6, 2020 at 6:43 AM Dénes Tóth  wrote:

> Dear Luke,
>
> In the meantime I checked the R-syntax branch and the docs; they are
> very helpful. I would also like to thank you for putting effort into
> this feature. Keeping it at the syntax level is also a very smart
> decision. However, the current API might not exploit the full power of
> the basic idea.
>
> 1) Requiring either an anonymous function or a function call, but not
> allowing for symbols which point to functions is inconsistent and will
> be misleading for non-experts.
>
> foo <- function(x) x
> identical(foo, function(x) x)
>
> mtcars |> foo   #bang!
> mtcars |> function(x) x #fine?
>
> You stated in :
> "
> Another variation supported by the implementation is that a symbol on
> the RHS is interpreted as the name of a function to call with the LHS
> as argument:
>
> ```r
>  > quote(x |> f)
> f(x)
> ```
> "
>
> So clearly this is not an implementation issue but a design decision.
>
> As a remedy, two different pipe operators could be introduced:
>
> LHS |> RHS    -> RHS is treated as a function call
> LHS |>> RHS   -> RHS is treated as a function
>
> If |>> is used, it would not matter which notation is used for the RHS
> expression; the parser would assume it evaluates to a function.
>

I think multiplying the operators would not be a net positive. You'd then
have to remember and mix them when you mix anonymous functions and
non-anonymous functions.  It would result in

LHS |> RHS1() |>> \(x,y) blablabla |> RHS3()

I think that's too much intricacy. Better to be a little more restrictive
in a way that (honestly doesn't really hurt anything afaics, and) guarantees
consistency.

>
> 2) Simplified lambda expression:
> IMHO in the vast majority of use cases, this is used for single-argument
> functions, so parenthesis would not be required. Hence, both forms would
> be valid and equivalent:
>
> \x x + 1
> \(x) x + 1
>
>
Why special-case something here when sometimes you'll want more than one
argument? The parentheses seem really not a big deal. So I don't understand
the motivation here, if I'm being honest.


>
> 3) Function composition:
> Allowing for concise composition of functions would be a great feature.
> E.g., instead of
>
> foo <- function(x) print(mean(sqrt(x), na.rm = TRUE), digits = 2)
>
> or
>
> foo <- \x {x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)}
>
> one could write
>
> foo <- \x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)
>
> So basically if the lambda argument is followed by a pipe operator, the
> pipe chain is transformed to a function body where the first lambda
> argument is inserted into the first position of the pipeline.
>

This one I disagree with very strongly. Reading pipelines would suddenly
require a *much* higher cognitive load than before because you have to
model that complexity just to read it and know what it says. The brackets
there seem like an extremely low price to pay to avoid that. Operator
precedence should be extremely and easily predictable.


>
>
> Best,
> Denes
>
>
> On 12/5/20 7:10 PM, luke-tier...@uiowa.edu wrote:
> > We went back and forth on this several times. The key advantage of
> > requiring parentheses is to keep things simple and consistent.  Let's
> > get some experience with that. If experience shows requiring
> > parentheses creates too many issues then we can add the option of
> > dropping them later (with special handling of :: and :::). It's easier
> > to add flexibility and complexity than to restrict it after the fact.
> >
> > Best,
> >
> > luke
> >
> > On Sat, 5 Dec 2020, Hugh Parsonage wrote:
> >
> >> I'm surprised by the aversion to
> >>
> >> mtcars |> nrow
> >>
> >> over
> >>
> >> mtcars |> nrow()
> >>
> >> and I think the decision to disallow the former should be
> >> reconsidered.  The pipe operator is only going to be used when the rhs
> >> is a function, so there is no ambiguity with omitting the parentheses.
> >> If it's disallowed, it becomes inconsistent with other treatments like
> >> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> >> noise.  I'm not sure why this decision was taken
> >>
> >> If the only issue is with the double (and triple) colon operator, then
> >> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> >> -- in other words, demote the precedence of |>
> >>
> >> Obviously (looking at the R-Syntax branch) this decision was
> >> considered, put into place, then dropped, but I can't see why
> >> precisely.
> >>
> >> Best,
> >>
> >>
> >> Hugh.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar
> >>  wrote:
> >>>
> >>> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch
> >>>  wrote:
> 
>  On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >>   Error: function '::' not supported in RHS call of a pipe
> >
> > To me, this error looks much more friendly than magrittr's error.
> > Some of them got too used to specify functi

Re: [Rd] New pipe operator and gg plotz

2020-12-06 Thread Avi Gross via R-devel
As someone who switches back and forth between using standard R methods and 
those of the tidyverse, depending on the problem, my mood and whether Jupiter 
aligns with Saturn in the new age of Aquarius, I have a question about the 
forthcoming built-in pipe. Will it motivate anyone to eventually change or 
enhance the ggplot functionality to have a version that gets rid of the odd use 
of the addition symbol?

I mean I now sometimes have a pipeline that looks like:

Data %>%
Do_this %>%
Do_that(whatever) %>%
ggplot(...) +
geom_whatever(...) +
...

My understanding is this is a bit of a historical anomaly that might someday be 
modified back.

As I understand it, the call to ggplot() creates a partially filled-in object 
that holds all kinds of useful info. The additional calls to geom_point() and 
so on will add/change that hidden object. Nothing much happens till the object 
is implicitly or explicitly given to print() which switches to the print 
function for objects of that type and creates a graph based on the contents of 
the object at that time. So, in theory, you could have a pipelined version of 
ggplot where the first function accepts something like a  data.frame or tibble 
as the default first argument and at the end returns the object we have been 
describing. All additional functions would then accept such an object as the 
(hidden?) first argument and return the modified object. The final function in 
the pipe would either have the value captured in a variable for later use or 
print implicitly generating a graph.

So the above silly example might become:

Data %>%
Do_this %>%
Do_that(whatever) %>%
ggplot(...) %>%
geom_whatever(...) %>%
...

Or, am I missing something here? 
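
To make the idea concrete, here is a toy sketch in base R (assuming R 4.1.0 or later for `|>`). Everything in it is hypothetical: `new_spec()`, `add_layer()` and the `toy_spec` class are made-up names for illustration and are not ggplot2's API or internals. It only shows the shape described above: each step takes the partially built object as its first argument and returns a modified copy, and nothing is "drawn" until the object is printed.

```r
# Hypothetical constructor: wraps the data in a partially filled-in object.
new_spec <- function(data) {
  structure(list(data = data, layers = list()), class = "toy_spec")
}

# Hypothetical layer function: the spec comes first, so it can be piped into.
add_layer <- function(spec, kind, ...) {
  spec$layers[[length(spec$layers) + 1L]] <- list(kind = kind, args = list(...))
  spec
}

# Work is deferred to the print method, which here just summarizes the spec.
print.toy_spec <- function(x, ...) {
  cat("toy_spec:", nrow(x$data), "rows,", length(x$layers), "layer(s)\n")
  for (l in x$layers) cat("  -", l$kind, "\n")
  invisible(x)
}

p <- mtcars |>
  subset(cyl == 4) |>
  new_spec() |>
  add_layer("point", size = 2) |>
  add_layer("smooth")

print(p)
```

Capturing the result in `p` and printing it later mirrors the two endings of the pipeline mentioned above.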

The language and extensions such as are now in the tidyverse might be more 
streamlined and easier to read when using consistent notation. If we now build 
a reasonable version of the pipeline in, might we encourage other uses to 
gradually migrate back closer to the mainstream?

-----Original Message-----
From: R-devel  On Behalf Of Rui Barradas
Sent: Sunday, December 6, 2020 2:51 AM
To: Gregory Warnes ; Abby Spurdle 
Cc: r-devel 
Subject: Re: [Rd] New pipe operator

Hello,

If Hilbert liked beer, I like "pipe".

More seriously, a new addition like this one is going to cause problems yet 
unknown. But it's a good idea to have a pipe operator available. As someone 
used to magrittr's data pipelines, I will play with this base one before making 
up my mind. I don't expect its behavior to be exactly like magrittr "%>%" (and 
it's not). For the moment all I can say is that it is something R users are 
used to and that it now avoids loading a package.
As for the new way to define anonymous functions, I am less sure. Too much 
syntactic sugar? Or am I finding the syntax ugly?

Hope this helps,

Rui Barradas


Às 03:22 de 06/12/20, Gregory Warnes escreveu:
> If we’re being mathematically pedantic, the “pipe” operator is
> actually function composition. That being said, pipes are a simple
> and well-known idiom. While being less than mathematically exact, it
> seems a reasonable label for the (very useful) behavior.
> 
> On Sat, Dec 5, 2020 at 9:43 PM Abby Spurdle  wrote:
> 
>>> This is a good addition
>>
>> I can't understand why so many people are calling this a "pipe".
>> Pipes connect processes, via their I/O streams.
>> Arguably, a more general interpretation would include sockets and files.
>>
>> https://en.wikipedia.org/wiki/Pipeline_(Unix)
>> https://en.wikipedia.org/wiki/Named_pipe
>> https://en.wikipedia.org/wiki/Anonymous_pipe
>>
>> As far as I can tell, the magrittr-like operators are functions (not 
>> pipes), with nonstandard syntax.
>> This is not consistent with R's original design philosophy, building 
>> on C, Lisp and S, along with lots of *important* math and stats.
>>
>> It's possible that some parties are interested in creating a kind of 
>> "data pipeline".
>> I'm interested in this myself, and I think we could discuss this more.
>> But I'm not convinced the magrittr-like operators help to achieve 
>> this goal.
>> Which, in my opinion, would require one to model programs as directed 
>> graphs, along with some degree of asynchronous input.
>>
>> Presumably, these operators will be added to R anyway, and (almost) 
>> no one will listen to me.
>>
>> So, I would like to make one suggestion:
>> Is it possible for these operators to *not* be named:
>>  The R Pipe
>>  The S Pipe
>>  Or anything with a similar meaning.
>>
>> Maybe tidy pipe, or something else that links it to its proponents?
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>


Re: [Rd] New pipe operator and gg plotz

2020-12-06 Thread Duncan Murdoch
Hadley's answer (#7 here: 
https://community.rstudio.com/t/why-cant-ggplot2-use/4372) makes it 
pretty clear that he thinks it would have been nice now if he had made 
that choice when ggplot2 came out, but it's not worth the effort now to 
change it.


Duncan Murdoch


Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Dénes Tóth

Hi Gabriel,

Thanks for the comments. See inline.

On 12/6/20 8:16 PM, Gabriel Becker wrote:

> Hi Denes,
>
> On Sun, Dec 6, 2020 at 6:43 AM Dénes Tóth wrote:
>
> > Dear Luke,
> >
> > In the meantime I checked the R-syntax branch and the docs; they are
> > very helpful. I would also like to thank you for putting effort into
> > this feature. Keeping it at the syntax level is also a very smart
> > decision. However, the current API might not exploit the full power of
> > the basic idea.
> >
> > 1) Requiring either an anonymous function or a function call, but not
> > allowing for symbols which point to functions is inconsistent and will
> > be misleading for non-experts.
> >
> > foo <- function(x) x
> > identical(foo, function(x) x)
> >
> > mtcars |> foo               #bang!
> > mtcars |> function(x) x     #fine?
> >
> > You stated in :
> > "
> > Another variation supported by the implementation is that a symbol on
> > the RHS is interpreted as the name of a function to call with the LHS
> > as argument:
> >
> > ```r
> >  > quote(x |> f)
> > f(x)
> > ```
> > "
> >
> > So clearly this is not an implementation issue but a design decision.
> >
> > As a remedy, two different pipe operators could be introduced:
> >
> > LHS |> RHS    -> RHS is treated as a function call
> > LHS |>> RHS   -> RHS is treated as a function
> >
> > If |>> is used, it would not matter which notation is used for the RHS
> > expression; the parser would assume it evaluates to a function.
>
> I think multiplying the operators would not be a net positive. You'd
> then have to remember and mix them when you mix anonymous functions and
> non-anonymous functions.  It would result in
>
> LHS |> RHS1() |>> \(x,y) blablabla |> RHS3()
>
> I think that's too much intricacy. Better to be a little more
> restrictive in a way that (honestly doesn't really hurt anything afaics,
> and) guarantees consistency.




That was just a secondary option in case pure symbols are 
disallowed on the RHS. The point is that one cannot avoid inconsistency 
here because of practical considerations; let us admit, R has tons of 
inconsistencies which are usually motivated by making interactive data 
analysis more convenient. To me it seems more inconsistent to allow for 
function calls and functions but not symbols - either allow all of them 
or be strict and enforce function calls.




> > 2) Simplified lambda expression:
> > IMHO in the vast majority of use cases, this is used for single-argument
> > functions, so parentheses would not be required. Hence, both forms would
> > be valid and equivalent:
> >
> > \x x + 1
> > \(x) x + 1
>
> Why special-case something here when sometimes you'll want more than one
> argument? The parentheses seem really not a big deal. So I don't
> understand the motivation here, if I'm being honest.


As I said before: because of practical considerations. On a 
Hungarian keyboard layout, this is how one types the backslash 
character: RightAlt+Q. Parentheses: Shift+8 (left), Shift+9 (right). 
This is how you type 'function' in the R terminal: fu+TAB. I do not 
really see the point of the new notation as it is now.





> > 3) Function composition:
> > Allowing for concise composition of functions would be a great feature.
> > E.g., instead of
> >
> > foo <- function(x) print(mean(sqrt(x), na.rm = TRUE), digits = 2)
> >
> > or
> >
> > foo <- \x {x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)}
> >
> > one could write
> >
> > foo <- \x |> sqrt() |> mean(na.rm = TRUE) |> print(digits = 2)
> >
> > So basically if the lambda argument is followed by a pipe operator, the
> > pipe chain is transformed to a function body where the first lambda
> > argument is inserted into the first position of the pipeline.
>
> This one I disagree with very strongly. Reading pipelines would suddenly
> require a *much* higher cognitive load than before because you have to
> model that complexity just to read it and know what it says. The
> brackets there seem like an extremely low price to pay to avoid that.
> Operator precedence should be extremely and easily predictable.




Unfortunately I could not come up with a better solution to approximate 
a function composition operator (supporting tacit/pointfree-style 
programming) which avoids the introduction of a separate function (like 
e.g. purrr::compose).


In Haskell:
floor . sqrt

In Julia (looks nice but requires \circTAB or custom keybinding):
floor ∘ sqrt

In R: ?
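
For comparison, one possible base-R sketch of a composition helper. It does use a separate function, which is exactly what the question above hopes to avoid, so it is only a baseline. `compose_lr` is a hypothetical name, not an existing base R function and not part of the proposed syntax.

```r
# Hypothetical helper: chains functions left to right, i.e. in pipeline order.
compose_lr <- function(...) {
  fs <- list(...)
  # Feed x through each function in turn; Reduce() starts from init = x.
  function(x) Reduce(function(acc, f) f(acc), fs, x)
}

# Same effect as Haskell's `floor . sqrt`: sqrt is applied first, then floor.
floor_sqrt <- compose_lr(sqrt, floor)
floor_sqrt(10)

# A point-free take on the `foo` example, without the final print() step.
root_mean <- compose_lr(sqrt, function(v) mean(v, na.rm = TRUE))
root_mean(c(1, 4, 9, NA))
```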


Best,
Denes






> > Best,
> > Denes
> >
> >
> > On 12/5/20 7:10 PM, luke-tier...@uiowa.edu wrote:
> > > We went back and forth on this several times. The key advantage of
> > > requiring parentheses is to keep things simple and consistent.  Let's
> > > get some experience with that. If experience shows requiring
> > > parentheses creates too many issues then we can add the option of
> > > dropping them later (with special handling of :: and :::). It's easier
> > > to add flexibility and complexity than to restrict it af

Re: [Rd] New pipe operator

2020-12-06 Thread Gabor Grothendieck
I think the real issue here is that functions are supposed to be
first class objects in R, and |> would break that if it is possible
to write function(x) x + 1 on the RHS but not foo (assuming foo
was defined as that function).

I don't think getting experience with using it can change that
inconsistency, which seems serious to me and needs to
be addressed even if it complicates the implementation,
since it goes to the heart of what R is.

On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
 wrote:
>
> The construct utils::head  is not that common but bare functions are
> very common and to make it harder to use the common case so that
> the uncommon case is slightly easier is not desirable.
>
> Also it is trivial to write this which does work:
>
> mtcars %>% (utils::head)
>
> On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage  
> wrote:
> >
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> > >
> > > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> > > wrote:
> > > >
> > > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> > > > >>   Error: function '::' not supported in RHS call of a pipe
> > > > >
> > > > > To me, this error looks much more friendly than magrittr's error.
> > > > > Some of them got too used to specify functions without (). This
> > > > > is OK until they use `::`, but when they need to use it, it takes
> > > > > hours to figure out why
> > > > >
> > > > > mtcars %>% base::head
> > > > > #> Error in .::base : unused argument (head)
> > > > >
> > > > > won't work but
> > > > >
> > > > > mtcars %>% head
> > > > >
> > > > > works. I think this is a too harsh lesson for ordinary R users to
> > > > > learn `::` is a function. I've been wanting for magrittr to drop the
> > > > > support for a function name without () to avoid this confusion,
> > > > > so I would very much welcome the new pipe operator's behavior.
> > > > > Thank you all the developers who implemented this!
> > > >
> > > > I agree, it's an improvement on the corresponding magrittr error.
> > > >
> > > > I think the semantics of not evaluating the RHS, but treating the pipe
> > > > as purely syntactical is a good decision.
> > > >
> > > > I'm not sure I like the recommended way to pipe into a particular 
> > > > argument:
> > > >
> > > >mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> > > >
> > > > or
> > > >
> > > >mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> > > >
> > > > both of which are equivalent to
> > > >
> > > >mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = 
> > > > d))()
> > > >
> > > > It's tempting to suggest it should allow something like
> > > >
> > > >mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> > >
> > > Which is really not that far off from
> > >
> > > mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> > >
> > > once you get used to it.
> > >
> > > One consequence of the implementation is that it's not clear how
> > > multiple occurrences of the placeholder would be interpreted. With
> > > magrittr,
> > >
> > > sort(runif(10)) %>% ecdf(.)(.)
> > > ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> > >
> > > This is probably what you would expect, if you expect it to work at all, 
> > > and not
> > >
> > > ecdf(sort(runif(10)))(sort(runif(10)))
> > >
> > > There would be no such ambiguity with anonymous functions
> > >
> > > sort(runif(10)) |> \(.) ecdf(.)(.)
> > >
> > > -Deepayan
> > >
> > > > which would be expanded to something equivalent to the other versions:
> > > > but that makes it quite a bit more complicated.  (Maybe _ or \. should
> > > > be used instead of ., since those are not legal variable names.)
> > > >
> > > > I don't think there should be an attempt to copy magrittr's special
> > > > casing of how . is used in determining whether to also include the
> > > > previous value as first argument.
> > > >
> > > > Duncan Murdoch
> > > >
> > > >
> > > > >
> > > > > Best,
> > > > > Hiroaki Yutani
> > > > >
> > > > > On Fri, Dec 4, 2020 at 20:51, Duncan Murdoch wrote:
> > > > >>
> > > > >> 

Re: [Rd] New pipe operator

2020-12-06 Thread Gabriel Becker
Hi Gabor,

On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck 
wrote:

> I think the real issue here is that functions are supposed to be
> first class objects in R, and |> would break that if it is possible
> to write function(x) x + 1 on the RHS but not foo (assuming foo
> was defined as that function).
>
> I don't think getting experience with using it can change that
> inconsistency which seems serious to me and needs to
> be addressed even if it complicates the implementation
> since it drives to the heart of what R is.
>
>
With respect I think this is a misunderstanding of what is happening here.

Functions are first class citizens. |> is, for all intents and purposes, a
*macro*.

LHS |> RHS(arg2=5)

*parses to*

RHS(LHS, arg2 = 5)

There are no functions at the point in time when the pipe transformation
happens, because no code has been evaluated. To know if a symbol is going
to evaluate to a function requires evaluation which is a step entirely
after the one where the |> pipe is implemented.

Another way to think about it is that

LHS |> RHS(arg2 = 5)

is another way of *writing* RHS(LHS, arg2 = 5), NOT R code that is (or even
can be) evaluated.
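
A small check of that claim, assuming R 4.1.0 or later: `quote()` returns the parsed but unevaluated expression, so `LHS` and `RHS` do not need to exist. The variable names `piped` and `nested` are only illustrative.

```r
# Parse both spellings without evaluating anything.
piped  <- quote(LHS |> RHS(arg2 = 5))
nested <- quote(RHS(LHS, arg2 = 5))

# The parser has already rewritten the pipe into an ordinary call.
identical(piped, nested)

# No `|>` survives in the parsed object; the head of the call is the symbol RHS.
class(piped)
piped[[1]]
```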


Now this is a subtle point that only really has implications inasmuch as
it is not the case for magrittr pipes, but it's relevant for discussions
like this, I think.

~G

> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>  wrote:
> >
> > The construct utils::head  is not that common but bare functions are
> > very common and to make it harder to use the common case so that
> > the uncommon case is slightly easier is not desirable.
> >
> > Also it is trivial to write this which does work:
> >
> > mtcars %>% (utils::head)
> >
> > On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage 
> wrote:
> > >
> > > I'm surprised by the aversion to
> > >
> > > mtcars |> nrow
> > >
> > > over
> > >
> > > mtcars |> nrow()
> > >
> > > and I think the decision to disallow the former should be
> > > reconsidered.  The pipe operator is only going to be used when the rhs
> > > is a function, so there is no ambiguity with omitting the parentheses.
> > > If it's disallowed, it becomes inconsistent with other treatments like
> > > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > > noise.  I'm not sure why this decision was taken
> > >
> > > If the only issue is with the double (and triple) colon operator, then
> > > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > > -- in other words, demote the precedence of |>
> > >
> > > Obviously (looking at the R-Syntax branch) this decision was
> > > considered, put into place, then dropped, but I can't see why
> > > precisely.
> > >
> > > Best,
> > >
> > >
> > > Hugh.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar <
> deepayan.sar...@gmail.com> wrote:
> > > >
> > > > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch <
> murdoch.dun...@gmail.com> wrote:
> > > > >
> > > > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> > > > > >>   Error: function '::' not supported in RHS call of a pipe
> > > > > >
> > > > > > To me, this error looks much more friendly than magrittr's error.
> > > > > > Some of them got too used to specify functions without (). This
> > > > > > is OK until they use `::`, but when they need to use it, it takes
> > > > > > hours to figure out why
> > > > > >
> > > > > > mtcars %>% base::head
> > > > > > #> Error in .::base : unused argument (head)
> > > > > >
> > > > > > won't work but
> > > > > >
> > > > > > mtcars %>% head
> > > > > >
> > > > > > works. I think this is a too harsh lesson for ordinary R users to
> > > > > > learn `::` is a function. I've been wanting for magrittr to drop
> the
> > > > > > support for a function name without () to avoid this confusion,
> > > > > > so I would very much welcome the new pipe operator's behavior.
> > > > > > Thank you all the developers who implemented this!
> > > > >
> > > > > I agree, it's an improvement on the corresponding magrittr error.
> > > > >
> > > > > I think the semantics of not evaluating the RHS, but treating the
> pipe
> > > > > as purely syntactical is a good decision.
> > > > >
> > > > > I'm not sure I like the recommended way to pipe into a particular
> argument:
> > > > >
> > > > >mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> > > > >
> > > > > or
> > > > >
> > > > >mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data =
> d)
> > > > >
> > > > > both of which are equivalent to
> > > > >
> > > > >mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data
> = d))()
> > > > >
> > > > > It's tempting to suggest it should allow something like
> > > > >
> > > > >mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> > > >
> > > > Which is really not that far off from
> > > >
> > > > mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> > > >
> > > > once you get used to it.
> > > >
> > > > One consequenc

Re: [Rd] New pipe operator

2020-12-06 Thread Gabor Grothendieck
I understand very well that it is implemented at the syntax level;
however, in any case the implementation is irrelevant to the principles.

Here is a similar example to the one I gave before, but this time written out:

This works:

  3 |> function(x) x + 1

but this does not:

  foo <- function(x) x + 1
  3 |> foo

so it breaks the principle of functions being first class objects.  foo and its
definition are not interchangeable.  You have
to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().

This isn't just a matter of notation, i.e. foo vs foo(), but is a
matter of breaking the way R works as a functional language with
first class functions.
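
A hedged sketch of the forms involved, written so that it runs on released R (4.1.0 or later). There, as far as I know, an inline definition also has to be parenthesized and called, so the bare `3 |> function(x) x + 1` form above reflects the development snapshot under discussion. The name `bare` is only illustrative.

```r
foo <- function(x) x + 1

# A named function is piped to by calling it.
3 |> foo()

# An inline definition can be wrapped in parentheses and called.
3 |> (function(x) x + 1)()

# A bare symbol on the RHS is rejected by the parser, before `foo` is ever
# looked up; capture the parse error message instead of stopping.
bare <- tryCatch(parse(text = "3 |> foo"),
                 error = function(e) conditionMessage(e))
bare
```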

On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker  wrote:
>
> Hi Gabor,
>
> On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck  
> wrote:
>>
>> I think the real issue here is that functions are supposed to be
>> first class objects in R, and |> would break that if it is possible
>> to write function(x) x + 1 on the RHS but not foo (assuming foo
>> was defined as that function).
>>
>> I don't think getting experience with using it can change that
>> inconsistency which seems serious to me and needs to
>> be addressed even if it complicates the implementation
>> since it drives to the heart of what R is.
>>
>
> With respect I think this is a misunderstanding of what is happening here.
>
> Functions are first class citizens. |> is, for all intents and purposes, a 
> macro.
>
> LHS |> RHS(arg2=5)
>
> parses to
>
> RHS(LHS, arg2 = 5)
>
> There are no functions at the point in time when the pipe transformation 
> happens, because no code has been evaluated. To know if a symbol is going to 
> evaluate to a function requires evaluation which is a step entirely after the 
> one where the |> pipe is implemented.
>
> Another way to think about it is that
>
> LHS |> RHS(arg2 = 5)
>
> is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or even can 
> be) evaluated.
>
>
> Now this is a subtle point that only really has implications in as much as it 
> is not the case for magrittr pipes, but its relevant for discussions like 
> this, I think.
>
> ~G
>
>> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>>  wrote:
>> >
>> > The construct utils::head  is not that common but bare functions are
>> > very common and to make it harder to use the common case so that
>> > the uncommon case is slightly easier is not desirable.
>> >
>> > Also it is trivial to write this which does work:
>> >
>> > mtcars %>% (utils::head)
>> >
>> > On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage  
>> > wrote:
>> > >
>> > > I'm surprised by the aversion to
>> > >
>> > > mtcars |> nrow
>> > >
>> > > over
>> > >
>> > > mtcars |> nrow()
>> > >
>> > > and I think the decision to disallow the former should be
>> > > reconsidered.  The pipe operator is only going to be used when the rhs
>> > > is a function, so there is no ambiguity with omitting the parentheses.
>> > > If it's disallowed, it becomes inconsistent with other treatments like
>> > > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
>> > > noise.  I'm not sure why this decision was taken
>> > >
>> > > If the only issue is with the double (and triple) colon operator, then
>> > > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
>> > > -- in other words, demote the precedence of |>
>> > >
>> > > Obviously (looking at the R-Syntax branch) this decision was
>> > > considered, put into place, then dropped, but I can't see why
>> > > precisely.
>> > >
>> > > Best,
>> > >
>> > >
>> > > Hugh.
>> > >
>> > > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
>> > > wrote:
>> > > >
>> > > > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch 
>> > > >  wrote:
>> > > > >
>> > > > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
>> > > > > >>   Error: function '::' not supported in RHS call of a pipe
>> > > > > >
>> > > > > > To me, this error looks much more friendly than magrittr's error.
>> > > > > > Some of them got too used to specify functions without (). This
>> > > > > > is OK until they use `::`, but when they need to use it, it takes
>> > > > > > hours to figure out why
>> > > > > >
>> > > > > > mtcars %>% base::head
>> > > > > > #> Error in .::base : unused argument (head)
>> > > > > >
>> > > > > > won't work but
>> > > > > >
>> > > > > > mtcars %>% head
>> > > > > >
>> > > > > > works. I think this is a too harsh lesson for ordinary R users to
>> > > > > > learn `::` is a function. I've been wanting for magrittr to drop 
>> > > > > > the
>> > > > > > support for a function name without () to avoid this confusion,
>> > > > > > so I would very much welcome the new pipe operator's behavior.
>> > > > > > Thank you all the developers who implemented this!
>> > > > >
>> > > > > I agree, it's an improvement on the corresponding magrittr error.
>> > > > >
>> > > > > I think the semantics of not evaluating the RHS, but treating the 
>> > > > > 

Re: [Rd] New pipe operator

2020-12-06 Thread Bravington, Mark (Data61, Hobart)
Seems like this *could* be a good thing, and thanks to R core for considering 
it. But, FWIW:

 - I agree with Gabor G that consistency of "syntax" should be paramount here. 
Enough problems have been caused by earlier superficially-convenient 
non-standard features in R.  In particular:

 -- there should not be any discrepancy between an in-place 
function-definition, and a predefined function attached to a symbol (as per 
Gabor's point). 
 
 -- Hence, the ability to say x |> foo, i.e. without parentheses, seems bound to 
lead to inconsistency, because x |> foo is allowed, x |> base::foo isn't 
allowed without tricks, but x |> function( y) foo( y) isn't... So, x |> foo is 
not worth keeping. Parentheses are a price well worth paying.
 
 -- it is still inconsistent and confusing to (apparently) invoke a function in 
some places--- normally--- via 'foo(x)', yet in others--- pipily--- via 
'foo()'. Especially if 'foo' already has a default value for its first argument.

 - I don't see the problem with a placeholder--- doesn't it remove all 
ambiguity? Sure there needs to be a standard unclashable name and people can 
argue about what that should be, but the following seems clear and flexible... 
to me, anyway:
 
 thing |> 
   foo( _PIPE_) |>   # standard
   bah( arg1, _PIPE_) |>   # multi-arg function
   _ANON_({ x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ })   # anon function
  
where '_PIPE_' is the ordained name of the placeholder, and '_ANON_' 
constructs-and-calls a function with single argument '_PIPE_'. There is just 
one rule (I think...): each pipe-stage must be a *call* involving the argument 
'_PIPE_'.


 - The proposed anonymous-function syntax looks quite ugly to me, diminishing 
readability and inviting errors. The new pipe symbol |> already looks scarily 
like quantum mechanics; adding \( just puts fishbones into the symbolic soup.

 - IMO it's not worth going too far to try to lure magrittr-etc fans to swap 
to the new; my experience is that many people keep using older inferior R 
syntax for years after better replacements become available (even if they are 
aware of replacements), for various reasons. Just provide a good framework, and 
let nature take its course.
 
 - Disclaimer: personally I'm not much of a pipehead anyway, so maybe I'm not 
the audience. But if I was to consider piping, I wouldn't be very tempted by 
the current proposal. OTOH, I might even be tempted to write--- and use!--- my 
own version of '%|>%' as above (maybe someone already has). And if R did it for 
me, that'd be great :)
 
[*] Definition of _ANON_ could be something like this--- almost certainly won't 
work as-is, this is just to point out that it could be done in standard R.

`_ANON_` <- function( expr) { 
  #1. Construct a function with arg '_PIPE_' and body 'expr'
  #2. Construct a call() to that function
  #3. Do the call

  f <- function( `_PIPE_`) NULL
  body( f) <- substitute( expr) # capture the expression unevaluated
  # or something... yes these details are almost certainly wrong
  environment( f) <- parent.frame()
  expr2 <- substitute( f( `_PIPE_`)) # or something...
  eval.parent( expr2) # or something... 
}
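
For what it's worth, the placeholder idea can be tried in standard R without any parser changes. The following is only a sketch under assumptions: the operator name `%|>%` is the one floated above, `_PIPE_` has to be written in backticks because it is not a syntactic R name, and the evaluation strategy (bind the placeholder in a child of the caller's environment) is one hypothetical choice, not anything R core has proposed.

```r
# Hypothetical placeholder pipe (a sketch, not part of base R or any package):
# evaluate the right-hand side with `_PIPE_` bound to the left-hand value.
`%|>%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)               # capture the RHS unevaluated
  env <- new.env(parent = parent.frame())   # child of the caller's environment
  assign("_PIPE_", lhs, envir = env)        # bind the placeholder
  eval(rhs_expr, env)
}

thing <- c(1, 2, 3)

# A standard stage, then a stage with the placeholder in second position.
label <- thing %|>% sum(`_PIPE_`) %|>% paste("total:", `_PIPE_`)

# An "anonymous function" stage: a braced block that mentions the placeholder.
shares <- thing %|>% { s <- sum(`_PIPE_`); `_PIPE_` / s }

print(label)
print(shares)
```

Because each stage is evaluated in its own child environment, the temporary `s` in the braced block does not leak into the caller, which is roughly what the _ANON_ wrapper is meant to achieve.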

cheers
Mark

Mark Bravington
CSIRO Marine Lab
Hobart
Australia



From: R-devel  on behalf of Gabor Grothendieck 

Sent: Monday, 7 December 2020 10:21
To: Gabriel Becker
Cc: r-devel@r-project.org
Subject: Re: [Rd] New pipe operator

I understand very well that it is implemented at the syntax level;
however, in any case the implementation is irrelevant to the principles.

Here a similar example to the one I gave before but this time written out:

This works:

  3 |> function(x) x + 1

but this does not:

  foo <- function(x) x + 1
  3 |> foo

so it breaks the principle of functions being first class objects.  foo and its
definition are not interchangeable.  You have
to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().

This isn't just a matter of notation, i.e. foo vs foo(), but is a
matter of breaking
the way R works as a functional language with first class functions.

On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker  wrote:
>
> Hi Gabor,
>
> On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck  
> wrote:
>>
>> I think the real issue here is that functions are supposed to be
>> first class objects in R,
>> and |> would break that if it is possible
>> to write function(x) x + 1 on the RHS but not foo (assuming foo
>> was defined as that function).
>>
>> I don't think getting experience with using it can change that
>> inconsistency which seems serious to me and needs to
>> be addressed even if it complicates the implementation
>> since it drives to the heart of what R is.
>>
>
> With respect I think this is a misunderstanding of what is happening here.
>
> Functions are first class citizens. |> is, for all intents and purposes, a 
> macro.
>
> LHS |> RHS(arg2=5)
>
> parses to
>
> RHS(LHS, arg2 = 5)
>
> There are no functions at the point in time when the pipe transfor

Re: [Rd] New pipe operator and gg plotz

2020-12-06 Thread Avi Gross via R-devel
Thanks, Duncan. That answers my question fairly definitively.

Although it can be DONE it likely won't be for the reasons Hadley mentioned 
until we get some other product that replaces it entirely. There are some 
interesting work-arounds mentioned. 

I was thinking of one that has overhead but might be a pain. Hadley mentioned a 
slight variant. The first argument to a function now is expected to be the data 
argument. The second might be the mapping. Now if the function is called with a 
new first argument that is a ggplot object, it could be possible to test the 
type and if it is a ggplot object then slide over carefully any additional 
matched arguments that were not explicitly named. Not sure that is at all easy 
to do.

Alternately, you can ask that when used in such a pipeline that the user call 
all other arguments using names like data=whatever, mapping=aes(whatever) so no 
other args need to be adjusted by position.
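
To make the type-test idea concrete, here is a minimal sketch using toy stand-ins. Everything in it is hypothetical: `new_plot`, `add_layer`, `toy_geom` and the `toy_plot` class are invented for illustration and say nothing about how ggplot2 is actually implemented.

```r
# Toy stand-ins only: none of this is ggplot2 code or its API.
new_plot <- function(data) {
  structure(list(data = data, layers = list()), class = "toy_plot")
}

add_layer <- function(plot, layer) {
  plot$layers <- c(plot$layers, list(layer))
  plot
}

# A layer constructor that checks the type of its first argument.
toy_geom <- function(x, ...) {
  if (!missing(x) && inherits(x, "toy_plot")) {
    # Piped usage: the plot arrived in the first slot, so append and return it.
    return(add_layer(x, list(type = "geom", args = list(...))))
  }
  # Plain usage: build a standalone layer from whatever was supplied.
  args <- if (missing(x)) list(...) else list(x, ...)
  list(type = "geom", args = args)
}

piped_plot <- toy_geom(new_plot(data.frame(a = 1:3)), size = 2)
standalone <- toy_geom(size = 2)

print(length(piped_plot$layers))
print(names(standalone$args))
```

With a pipe, the first call would be written as new_plot(...) followed by the pipe and toy_geom(size = 2). Passing the remaining arguments by name, as suggested above, is what keeps the positional shuffle out of the picture.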

But all this is academic and I concede will likely not be done. I can live with 
the plus signs.


-----Original Message-----
From: Duncan Murdoch  
Sent: Sunday, December 6, 2020 2:50 PM
To: Avi Gross ; 'r-devel' 
Subject: Re: [Rd] New pipe operator and gg plotz

Hadley's answer (#7 here: 
https://community.rstudio.com/t/why-cant-ggplot2-use/4372) makes it pretty 
clear that he thinks it would have been nice now if he had made that choice 
when ggplot2 came out, but it's not worth the effort now to change it.

Duncan Murdoch

On 06/12/2020 2:34 p.m., Avi Gross via R-devel wrote:
> As someone who switches back and forth between using standard R methods and 
> those of the tidyverse, depending on the problem, my mood and whether Jupiter 
> aligns with Saturn in the new age of Aquarius, I have a question about the 
> forthcoming built-in pipe. Will it motivate anyone to eventually change or 
> enhance the ggplot functionality to have a version that gets rid of the odd 
> use of the addition symbol?
> 
> I mean I now sometimes have a pipeline that looks like:
> 
> Data %>%
>   Do_this %>%
>   Do_that(whatever) %>%
>   ggplot(...) +
>   geom_whatever(...) +
>   ...
> 
> My understanding is this is a bit of a historical anomaly that might someday 
> be modified back.
> 
> As I understand it, the call to ggplot() creates a partially filled-in object 
> that holds all kinds of useful info. The additional calls to geom_point() and 
> so on will add/change that hidden object. Nothing much happens till the 
> object is implicitly or explicitly given to print() which switches to the 
> print function for objects of that type and creates a graph based on the 
> contents of the object at that time. So, in theory, you could have a 
> pipelined version of ggplot where the first function accepts something like a 
>  data.frame or tibble as the default first argument and at the end returns 
> the object we have been describing. All additional functions would then 
> accept such an object as the (hidden?) first argument and return the modified 
> object. The final function in the pipe would either have the value captured 
> in a variable for later use or print implicitly generating a graph.
> 
> So the above silly example might become:
> 
> Data %>%
>   Do_this %>%
>   Do_that(whatever) %>%
>   ggplot(...) %>%
>   geom_whatever(...) %>%
>   ...
> 
> Or, am I missing something here?
> 
> The language and extensions such as are now in the tidyverse might be more 
> streamlined and easier to read when using consistent notation. If we now 
> build a reasonable version of the pipeline in, might we encourage other uses 
> to gradually migrate back closer to the mainstream?
> 
> -----Original Message-----
> From: R-devel  On Behalf Of Rui 
> Barradas
> Sent: Sunday, December 6, 2020 2:51 AM
> To: Gregory Warnes ; Abby Spurdle 
> 
> Cc: r-devel 
> Subject: Re: [Rd] New pipe operator
> 
> Hello,
> 
> If Hilbert liked beer, I like "pipe".
> 
> More seriously, a new addition like this one is going to cause problems yet 
> unknown. But it's a good idea to have a pipe operator available. As someone 
> used to magrittr's data pipelines, I will play with this base one before 
> making up my mind. I don't expect its behavior to be exactly like magrittr 
> "%>%" (and it's not). For the moment all I can say is that it is something R 
> users are used to and that it now avoids loading a package.
> As for the new way to define anonymous functions, I am less sure. Too much 
> syntactic sugar? Or am I finding the syntax ugly?
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> Às 03:22 de 06/12/20, Gregory Warnes escreveu:
>> If we’re being mathematically pedantic, the “pipe” operator is 
>> actually function composition. That being said, pipes are a simple 
>> and well-known idiom. While being less
>> than mathematically exact, it seems a reasonable   label for the (very
>> useful) behavior.
>>
>> On Sat, Dec 5, 2020 at 9:43 PM Abby Spurdle  wrote:
>>
 This is a go

Re: [Rd] New pipe operator

2020-12-06 Thread Gabriel Becker
Hi Gabor,

On Sun, Dec 6, 2020 at 3:22 PM Gabor Grothendieck 
wrote:

> I understand very well that it is implemented at the syntax level;
> however, in any case the implementation is irrelevant to the principles.
>
> Here a similar example to the one I gave before but this time written out:
>
> This works:
>
>   3 |> function(x) x + 1
>
> but this does not:
>
>   foo <- function(x) x + 1
>   3 |> foo
>
> so it breaks the principle of functions being first class objects.  foo
> and its
> definition are not interchangeable.


I understood what you meant as well.

The issue is that neither foo nor its definition are being operated on, or
even exist within the scope of what |> is defined to do. You are used to
magrittr's %>% where arguably what you are saying would be true. But it's
not here, in my view.

Again, I think the issue is that |>, in as much as it "operates" on
anything at all (it not being a function, regardless of appearances),
operates on call expression objects, NOT on functions, ever.

function(x) x *parses to a call expression*, as does RHSfun(), while RHSfun does
not; it parses to a name, *regardless of whether that symbol will
eventually evaluate to a closure or not.*
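
A quick way to check this distinction is to parse the three forms without evaluating them. The sketch below only uses quote(), class() and parse(); the names `RHSfun`, `LHS` and `RHS` are never defined, which is the point. The last part assumes an R version that has the native pipe (R 4.1.0 or later) and is skipped otherwise.

```r
# Parse (without evaluating) the three kinds of right-hand side.
rhs_function_expr <- quote(function(x) x)
rhs_call          <- quote(RHSfun())
rhs_name          <- quote(RHSfun)

kinds <- c(
  function_expr = class(rhs_function_expr),
  call          = class(rhs_call),
  name          = class(rhs_name)
)
print(kinds)

# Assumption: the native pipe is available (R >= 4.1.0). The text is parsed,
# never evaluated, so LHS and RHS do not need to exist.
if (getRversion() >= "4.1.0") {
  piped <- parse(text = "LHS |> RHS(arg2 = 5)")[[1]]
  pipe_is_plain_call <- identical(piped, quote(RHS(LHS, arg2 = 5)))
  print(pipe_is_plain_call)
}
```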

So in fact, it seems to me that, technically, all name symbols are being
treated exactly the same (none are allowed, including those which will
lookup to functions during evaluation), while all* call expressions are
also being treated the same. And again, there are no functions anywhere in
either case.

* except those that the parser flags as syntactically special.


> You have
> to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().
>

I think you should probably be careful what you wish for here. I'm not
involved with this work and do not speak for any of those who were, but the
principled way to make that consistent while remaining entirely in the
parser seems very likely to be to require the latter, rather than not
require the former.


> This isn't just a matter of notation, i.e. foo vs foo(), but is a
> matter of breaking
> the way R works as a functional language with first class functions.
>

I don't agree. Consider `+`

Having

foo <- get("+") ## note no `` here
foo(x,y)

parse and work correctly while

+(x,y)

does not, does not mean + isn't a function or that it is a "second class
citizen"; it simply means that the parser has constraints on the syntax for
writing code that calls it that calls to other functions are not subject to.
The fact that such *syntactic* constraints can exist proves that there is
not some overarching inviolable principle being violated here, I think. Now
you may say "well that's just the parser, it has to parse + specially
because it's an operator with specific precedence etc". Well, the same exact
thing is true of |> I think.
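
The `+` comparison can be reproduced directly. This is a small sketch using only base R; the values of x and y are arbitrary, and the rejected form is passed to parse() as text so that the rest of the snippet still runs.

```r
x <- 2
y <- 3

# Fetching the function object and calling it under another name works.
foo <- get("+")
sum_via_foo <- foo(x, y)

# So does calling it through its backquoted name.
sum_via_backticks <- `+`(x, y)

# The bare prefix form is rejected by the parser before anything is evaluated.
bare_form <- try(parse(text = "+(x, y)"), silent = TRUE)
bare_form_fails <- inherits(bare_form, "try-error")

print(c(sum_via_foo, sum_via_backticks))
print(bare_form_fails)
```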

Best,
~G

>
> On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker 
> wrote:
> >
> > Hi Gabor,
> >
> > On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck <
> ggrothendi...@gmail.com> wrote:
> >>
> >> I think the real issue here is that functions are supposed to be
> >> first class objects in R,
> >> and |> would break that if it is possible
> >> to write function(x) x + 1 on the RHS but not foo (assuming foo
> >> was defined as that function).
> >>
> >> I don't think getting experience with using it can change that
> >> inconsistency which seems serious to me and needs to
> >> be addressed even if it complicates the implementation
> >> since it drives to the heart of what R is.
> >>
> >
> > With respect I think this is a misunderstanding of what is happening
> here.
> >
> > Functions are first class citizens. |> is, for all intents and purposes,
> a macro.
> >
> > LHS |> RHS(arg2=5)
> >
> > parses to
> >
> > RHS(LHS, arg2 = 5)
> >
> > There are no functions at the point in time when the pipe transformation
> happens, because no code has been evaluated. To know if a symbol is going
> to evaluate to a function requires evaluation which is a step entirely
> after the one where the |> pipe is implemented.
> >
> > Another way to think about it is that
> >
> > LHS |> RHS(arg2 = 5)
> >
> > is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or
> even can be) evaluated.
> >
> >
> > Now this is a subtle point that only really has implications in as much
> as it is not the case for magrittr pipes, but its relevant for discussions
> like this, I think.
> >
> > ~G
> >
> >> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
> >>  wrote:
> >> >
> >> > The construct utils::head  is not that common but bare functions are
> >> > very common and to make it harder to use the common case so that
> >> > the uncommon case is slightly easier is not desirable.
> >> >
> >> > Also it is trivial to write this which does work:
> >> >
> >> > mtcars %>% (utils::head)
> >> >
> >> > On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage <
> hugh.parson...@gmail.com> wrote:
> >> > >
> >> > > I'm surprised by the aversion to
> >> > >
> >> > > mtcars |> nrow

Re: [Rd] New pipe operator

2020-12-06 Thread Gabor Grothendieck
This is really irrelevant.

On Sun, Dec 6, 2020 at 9:23 PM Gabriel Becker  wrote:
>
> Hi Gabor,
>
> On Sun, Dec 6, 2020 at 3:22 PM Gabor Grothendieck  
> wrote:
>>
>> I understand very well that it is implemented at the syntax level;
>> however, in any case the implementation is irrelevant to the principles.
>>
>> Here a similar example to the one I gave before but this time written out:
>>
>> This works:
>>
>>   3 |> function(x) x + 1
>>
>> but this does not:
>>
>>   foo <- function(x) x + 1
>>   3 |> foo
>>
>> so it breaks the principle of functions being first class objects.  foo and 
>> its
>> definition are not interchangeable.
>
>
> I understood what you meant as well.
>
> The issue is that neither foo nor its definition are being operated on, or 
> even exist within the scope of what |> is defined to do. You are used to 
> magrittr's %>% where arguably what you are saying would be true. But its not 
> here, in my view.
>
> Again, I think the issue is that |>, in as much as it "operates" on anything 
> at all (it not being a function, regardless of appearances), operates on call 
> expression objects, NOT on functions, ever.
>
> function(x) x parses to a call expression as does RHSfun(), while RHSfun does 
> not, it parses to a name, regardless of whether that symbol will eventually 
> evaluate to a closure or not.
>
> So in fact, it seems to me that, technically, all name symbols are being 
> treated exactly the same (none are allowed, including those which will lookup 
> to functions during evaluation), while all* call expressions are also being 
> treated the same. And again, there are no functions anywhere in either case.
>
> * except those that include that the parser flags as syntactically special.
>
>>
>> You have
>> to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().
>
>
> I think you should probably be careful what you wish for here. I'm not 
> involved with this work and do not speak for any of those who were, but the 
> principled way to make that consistent while remaining entirely in the parser 
> seems very likely to be to require the latter, rather than not require the 
> former.
>
>>
>> This isn't just a matter of notation, i.e. foo vs foo(), but is a
>> matter of breaking
>> the way R works as a functional language with first class functions.
>
>
> I don't agree. Consider `+`
>
> Having
>
> foo <- get("+") ## note no `` here
> foo(x,y)
>
> parse and work correctly while
>
> +(x,y)
>
>  does not does not mean + isn't a function or that it is a "second class 
> citizen", it simply means that the parser has constraints on the syntax for 
> writing code that calls it that calling other functions are not subject to. 
> The fact that such syntactic constraints can exist proves that there is not 
> some overarching inviolable principle being violated here, I think. Now you 
> may say "well thats just the parser, it has to parse + specially because its 
> an operator with specific precedence etc". Well, the same exact thing is true 
> of |> I think.
>
> Best,
> ~G
>>
>>
>> On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker  wrote:
>> >
>> > Hi Gabor,
>> >
>> > On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck 
>> >  wrote:
>> >>
>> >> I think the real issue here is that functions are supposed to be
>> >> first class objects in R,
>> >> and |> would break that if it is possible
>> >> to write function(x) x + 1 on the RHS but not foo (assuming foo
>> >> was defined as that function).
>> >>
>> >> I don't think getting experience with using it can change that
>> >> inconsistency which seems serious to me and needs to
>> >> be addressed even if it complicates the implementation
>> >> since it drives to the heart of what R is.
>> >>
>> >
>> > With respect I think this is a misunderstanding of what is happening here.
>> >
>> > Functions are first class citizens. |> is, for all intents and purposes, a 
>> > macro.
>> >
>> > LHS |> RHS(arg2=5)
>> >
>> > parses to
>> >
>> > RHS(LHS, arg2 = 5)
>> >
>> > There are no functions at the point in time when the pipe transformation 
>> > happens, because no code has been evaluated. To know if a symbol is going 
>> > to evaluate to a function requires evaluation which is a step entirely 
>> > after the one where the |> pipe is implemented.
>> >
>> > Another way to think about it is that
>> >
>> > LHS |> RHS(arg2 = 5)
>> >
>> > is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or even 
>> > can be) evaluated.
>> >
>> >
>> > Now this is a subtle point that only really has implications in as much as 
>> > it is not the case for magrittr pipes, but its relevant for discussions 
>> > like this, I think.
>> >
>> > ~G
>> >
>> >> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>> >>  wrote:
>> >> >
>> >> > The construct utils::head  is not that common but bare functions are
>> >> > very common and to make it harder to use the common case so that
>> >> > the uncommon case is slightly easier is not desirable.
>> >>