On Wed, Mar 13, 2019, at 6:30 PM, Rowan Collins wrote:
> On 13/03/2019 21:10, Dik Takken wrote:
> > So in practice, I expect that
> > using comprehensions as proposed in the new RFC will also require doing
> > a lot of iterator_to_array(). A dual comprehension syntax could fix that.
> 
> 
> At risk of complicating things further, might the solution to that be to 
> have a shorter syntax for iterator_to_array in general?
> 
> It's a shame array-casts are defined for arbitrary objects, else we 
> could have (array)$iterator - and therefore (array)[foreach ($users as 
> $user) yield $user->firstName]

I am again going to reply to a bunch of people at once here...


If I can summarize the responses so far, they seem to fall into one of two 
categories:

1) Love the idea, but wouldn't short-closures be close enough?

2) Love the idea, but hate the particular syntax proposed.

On the plus side, it seems almost everyone is on board in concept, so yay.  
That of course just leaves the syntax bikeshedding, which is always the fun 
part.

As an aside, someone up-thread said that comprehensions were "an easier way to 
write foreach loops", which is only true by accident.  Comprehensions are more 
formally a way of defining one set in relation to another set.  That is, they 
are a declarative relationship between one set and another.  While in PHP that 
ends up effectively being a short-hand for foreach loops, that's more an 
accidental implementation detail.  The syntax used by many other languages to 
achieve the same thing doesn't look at all like loop syntax.

To the question of having both a generator and an array version, I would have 
to say no.  As noted in the RFC, most cases where you'd want to use a 
comprehension are not places where you'd be feeding the result into an array 
function.  On the off chance that you are, converting the iterable into an 
array is trivial enough that supporting, documenting, and learning two slightly 
different syntaxes seems a net negative.
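
To illustrate how small that conversion is today, here is a minimal sketch 
using a plain generator function in place of a comprehension.  The oddKeys() 
helper and the sample data are made up for the example; only 
iterator_to_array() is an existing PHP function.

```php
<?php
// Hypothetical sample data, made up for illustration.
$arr = ['a', 'b', 'c', 'd', 'e'];

// Hypothetical helper: yields the values stored at odd keys, preserving keys.
function oddKeys(iterable $items): Generator
{
    foreach ($items as $k => $v) {
        if ($k % 2) {
            yield $k => $v;
        }
    }
}

// Default behaviour keeps the yielded keys.
$withKeys = iterator_to_array(oddKeys($arr));

// Passing false as the second argument discards keys and re-indexes from zero.
$list = iterator_to_array(oddKeys($arr), false);
```

The second form is usually the safer one when keys may repeat, since 
duplicate keys would otherwise overwrite earlier values.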

To Rowan's point, I would be fully in favor of an easier syntax alternative to 
iterator_to_array().  I think that's rather similar (although not identical) to 
the "run out an iterator" add-on mentioned in the RFC.  I would support that, 
but I think it's a bit orthogonal and should not be a blocker for 
short-closures or for comprehensions.

As for the specific syntax, I see a couple of options.

1) Assuming that short-lambdas get adopted and they can transparently support 
generators, the following syntax becomes automatically possible:

$gen = (fn() => foreach($arr as $k => $v) if ($k % 2) yield $v;)();

While that does work, there's an awful lot of symbol salad there: (fn() => and 
;)(); are both just gross and hard to type.  I would consider that not a full 
solution for comprehensions because of how clumsy it is.
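
For comparison, this is roughly what the same thing looks like with the 
long-form closure syntax that PHP already supports: an anonymous function 
containing yield, invoked immediately.  The sample data is an assumption for 
the sake of a runnable example.

```php
<?php
// Hypothetical sample data, made up for illustration.
$arr = [10, 11, 12, 13, 14];

// A closure whose body contains yield is a generator function, so calling it
// immediately returns a Generator without running the loop yet.
$gen = (function () use ($arr) {
    foreach ($arr as $k => $v) {
        if ($k % 2) {
            yield $v;
        }
    }
})();

// Run the generator out into a plain list (keys discarded).
$result = iterator_to_array($gen, false);
```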

2) We could include an even-shorter-lambda syntax, potentially, or perhaps a 
short-lambda-based comprehension syntax.  For example (and this may not be 
parser friendly but it's just to demonstrate the idea):

$gen = fn{ foreach($arr as $k => $v) if ($k % 2) yield $v };

That would be a short-hand for a short-closure that has no parameters, and we 
could detect the yield and self-execute.  The language inside the function body 
would still be a bit verbose, but it would technically be any legal single 
statement, which would offer some potentially interesting (scary?) options.  I 
would consider this an acceptable solution for comprehensions-ish in PHP.

3) The specific syntax proposed in the RFC is Python-inspired and PHP-ified, 
but there's no reason we need to stick to that.  There are a myriad of other 
syntaxes for comprehensions in other languages that we could steal if they fit 
better, some of which wouldn't at all resemble foreach loops and thus avoid the 
for/foreach confusion.

Wikipedia of course has a large index of them we can mine:

https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(list_comprehension)

It appears that the most common syntax involves [] of some variety, which poses 
parsing problems for PHP, but a few other options jump out at me as possible 
syntaxes to pilfer:

C# has this SQL-esque syntax (which may involve too many additional language 
keywords):

var ns = from x in Enumerable.Range(0,100)
         where x*x > 3
         select x*2;

Elixir, Erlang, and Haskell use the <- symbol, which... I don't think we use 
anywhere else currently?  In Elixir:

for x <- 0..100, x * x > 3, do: x * 2

Java 8, Ruby, Rust, and Swift are very similar to one another, and use a 
fluent syntax.  The Rust example:

let ns: Vec<i32> = (0..100).filter(|x| x * x > 3).map(|x| 2 * x).collect();

While that could not be taken as-is, of course, it does propose an interesting 
alternative approach, if we limit comprehensions to Traversable objects rather 
than any iterable (that is, exclude arrays):

$t->filter(fn($v) => expression)->filter(fn($k, $v) => expression)->map(fn($v) 
=> expression);

Each call would, in turn, produce a generator that either reduces the set or, 
in the final step, yields the mapped values.  I am not sure I fully like this 
one, to be honest, as the multiple 
inline short closures make it rather verbose and harder to follow with the 
proposed short-closure syntax (and it would involve more function calls 
internally), but it's an option.  (collect() in these languages seems like it's 
the equivalent of iterator_to_array(); maybe that's another alternative there 
as well?)
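
As a rough userland sketch of that fluent approach (not a proposal for the 
actual API), the wrapper below chains generators lazily.  The Pipeline class 
and its method names are hypothetical, it uses long-form closures since 
short closures are not adopted yet, and it passes the value first and the 
key second to each callback, which is just one possible choice.

```php
<?php
// Hypothetical wrapper; none of these names exist in PHP itself.
final class Pipeline implements IteratorAggregate
{
    private $source;

    public function __construct(iterable $source)
    {
        $this->source = $source;
    }

    // Wraps the current source in a generator that skips rejected items.
    public function filter(callable $test): self
    {
        $source = $this->source;
        return new self((function () use ($source, $test) {
            foreach ($source as $k => $v) {
                if ($test($v, $k)) {
                    yield $k => $v;
                }
            }
        })());
    }

    // Wraps the current source in a generator that transforms each value.
    public function map(callable $fn): self
    {
        $source = $this->source;
        return new self((function () use ($source, $fn) {
            foreach ($source as $k => $v) {
                yield $k => $fn($v, $k);
            }
        })());
    }

    public function getIterator(): Traversable
    {
        yield from $this->source;
    }

    // Plays the role of collect(): runs the chain out into a list.
    public function collect(): array
    {
        return iterator_to_array($this, false);
    }
}

$result = (new Pipeline(range(0, 10)))
    ->filter(function ($x) { return $x * $x > 3; })
    ->map(function ($x) { return $x * 2; })
    ->collect();
```

Nothing is evaluated until collect() (or a foreach) pulls values through the 
chain, which is where the extra internal function calls mentioned above come 
from.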

Nemerle, which I've never heard of before, has this:

$[x*2 | x in [0 .. 100], x*x > 3]

Which, while $ is obviously already used, does suggest using one of the other 
not-yet-used sigils that Nikita identified, which would let us reorder the 
parameters to put the expression first if we wanted.  For example:

^[$v * 2 | $k => $v in $arr if $k % 2]


In general, I see two alternatives:

1) Pass short closures and then include a special case of that special case 
that effectively gives us comprehensions over foreach, if, and yield, but with 
fewer seemingly-stray characters.

2) Steal a completely different syntax from some other language that is still 
terse but less confusing.  The main alternatives to "square brackets around a 
for loop" syntax seem to be:

A) Chained filter() and map() methods
B) SQL-like keywords
C) Use <- somehow.
D) Use a different starting character before the [] so that the parser knows 
some new funky order of stuff is coming.

I am open to both options, contingent of course on someone being willing and 
able to code it.

--Larry Garfield

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
