On Thu, Apr 3, 2025, at 4:06 PM, Rowan Tommins [IMSoP] wrote:
> On 03/04/2025 18:06, Larry Garfield wrote:
>> So if we expect higher order functions to be common (and I would probably
>> mainly use them myself), then it would be wise to figure out some way to
>> make them more efficient. Auto-first-arg is one way.
>
> From this angle, auto-first-arg is a very limited compiler optimisation
> for partial application.

I'd say it has the dual benefit of optimization and ergonomics. (Though see
discussion below.)

> With PFA and one-arg-callable pipes, you could add a parser rule that
> matches this, with the same output:
>
> $foo |> bar(?, $baz);
>
> But you'd also be able to do this:
>
> $baz |> bar($foo, ?);
>
> And maybe the compiler could optimise that case too.

From what Arnaud has told me, any PFA that has a single, fixed-position-number
argument remaining should be optimizable. (Though that's a task for whenever
PFA is next worked on, if it is next worked on.)
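For concreteness, here is roughly what those two forms would boil down to,
written out by hand since PFA doesn't exist yet. This is only a sketch: bar()
is a made-up function, and pipe() is a hypothetical stand-in for |> so that the
snippet runs without the operator.

```php
<?php
// Hypothetical two-argument function, for illustration only.
function bar(string $a, string $b): string
{
    return "$a-$b";
}

// Hypothetical stand-in for the |> operator: hand $value to a unary callable.
function pipe(mixed $value, callable $fn): mixed
{
    return $fn($value);
}

$foo = 'foo';
$baz = 'baz';

// $foo |> bar(?, $baz): the first slot is left open.
$first = pipe($foo, fn(string $x): string => bar($x, $baz));

// $baz |> bar($foo, ?): the second slot is left open.
$second = pipe($baz, fn(string $x): string => bar($foo, $x));
```

Both calls end up as bar('foo', 'baz'); the only difference is which slot the
piped value fills.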
> Neither helps with the performance of higher order functions which are
> doing more than partial application, like map and filter themselves. I
> understand there's a high cost to context-switching between C and PHP;
> presumably if there was an easy solution for that someone would have
> done it already.

> On 03/04/2025 18:39, Ilija Tovilo wrote:
>> To me, pipes improve readability when they behave like methods, i.e.
>> they perform some operation on a subject. This resembles Swift's
>> protocol extensions or Rust's trait default implementations, except
>> using a different "method" call operator.
>> [...]
>> If we decide not to add an iterator API that works well with
>> first-arg, then I agree that this is not the right approach. But if we
>> do, then neither of your examples are problematic.
>
>
> I guess those two things go together quite well as a mental model:
> pipes as a way to implement extension methods, and new functions
> designed for use as extension methods.
>
> I think I'd be more welcoming of it if we actually implemented
> extension methods instead of pipes, and then the new iterator API was
> extension-method-only. It feels less like "one of the arguments is
> missing" if that argument is *always* expressed as the left-hand side
> of an arrow of some sort.

As I've noted, classic pipes (current RFC, unary function only) and extension
functions are not mutually exclusive, and I see no reason we couldn't add both.
Auto-partialing first-arg pipes and dedicated extension functions step on each
other's toes a bit more, however.

To address both this and Ilija's email, I was toying with extension functions
as a concept a while back. I also did extensive research into "collections" in
other languages last year with Derick. (See discussion in a previous PHP
Foundation report[1]). That led me to a number of conclusions that I still
hold to:

* A new iterable API is absolutely a good thing and we should do it.
* That said, we *need* to split Sequence, Set, and Dictionary into separate
types. We are the only language I reviewed that didn't have them as separate
constructs with their own APIs.
* The use of the same construct (arrays and iterables) for all three types is a
fundamental and core flaw in PHP's design that we should not double-down on.
It's ergonomically awful, it's bad for performance, and it invites major
security holes. (The "Drupageddon" remote exploit was caused by using an array
and assuming it was sequential when it was actually a map.)

So while I want a new iterable API, the more I think on it, the more I think a
bunch of map(iterable $it, callable $fn) style functions would not be the right
way to do it. That would be easy, but also ineffective.

The behavior of even basic operations like map and filter is subtly different
depending on which type you're dealing with. Whether the input is lazy or not
is the least of the concerns. The bigger issue is when to pass keys to the
$fn: probably always in Dict, probably never in Seq, and certainly never in Set
(as there are no meaningful keys). Similarly, when filtering a Dict, you would
want keys preserved. When filtering a Seq, you'd want the indexes re-zeroed.
(Or at least seem to be, give or take implementation details.) And then, yes,
there's the laziness question.

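A rough sketch of that difference, using plain arrays as stand-ins for the real
types. seqFilter() and dictFilter() are hypothetical names for illustration,
not a proposed API:

```php
<?php
// Hypothetical: filter a sequence, then re-zero the indexes.
function seqFilter(array $seq, callable $fn): array
{
    return array_values(array_filter($seq, $fn));
}

// Hypothetical: filter a dictionary, preserving keys and passing both the
// value and the key to the callback.
function dictFilter(array $dict, callable $fn): array
{
    return array_filter($dict, $fn, ARRAY_FILTER_USE_BOTH);
}

$isEven = fn(int $v): bool => $v % 2 === 0;

// Plain array_filter() keeps the original keys, even for a list.
$raw = array_filter([1, 2, 3, 4], $isEven);

// Seq semantics: the result is a list again.
$seq = seqFilter([1, 2, 3, 4], $isEven);

// Dict semantics: keys survive, and the callback can inspect them.
$dict = dictFilter(
    ['a' => 1, 'b' => 2, 'c' => 3, 'd' => 4],
    fn(int $v, string $k): bool => $v % 2 === 0 && $k !== 'd',
);
```

Note that $raw comes back with keys 1 and 3, so it is no longer a list. That is
the "same thing if you squint" behavior leaking through.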
So we'd effectively want three different versions of map(), filter(), etc. if
we didn't want to perpetuate and further entrench the design flaw and security
hole that is "sequences and hashes are the same thing if you squint." And...
frankly I'd probably vote against an iterable/collections API that didn't
address that issue.

However, a simple "first arg" pipe wouldn't allow for that. Or rather, we'd
need to implement seqMap(iterable $it, callable $fn), setMap(iterable $it,
callable $fn), and dictMap(iterable $it, callable $fn). And the same split for
filter, and probably a few other things. That seems ergonomically suspect, at
best, and still wouldn't really address the issue since you would have no way
to ensure you're using the "right" version of each function. Similarly, a dict
version of implode() would likely need to take 2 separators, whereas the other
types would take only one.
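To illustrate the implode() point, a dict-flavored version might look something
like the following. dictImplode() is a hypothetical name and signature, purely
for illustration:

```php
<?php
// Hypothetical: join a dictionary using one separator between key and value
// and a second separator between pairs.
function dictImplode(array $dict, string $kvSep, string $pairSep): string
{
    $pairs = [];
    foreach ($dict as $key => $value) {
        $pairs[] = $key . $kvSep . $value;
    }
    return implode($pairSep, $pairs);
}

// Intended use: an actual dictionary.
$query = dictImplode(['a' => 1, 'b' => 2], '=', '&');

// Nothing stops a caller from handing it a list; it just "works" on the
// numeric indexes, which is rarely what anyone meant.
$oops = dictImplode(['x', 'y'], '=', '&');
```

The second call is the problem in miniature: with everything typed as array or
iterable, the function has no way to know it was given the wrong kind of
collection.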
So the more I think on it, the more I think the sort of iterable API that
first-arg pipes would make easy is... probably not the iterable API we want
anyway. There may well be other cases for Elixir-style first-arg pipes, but a
new iterable API isn't one of them, at least not in this form.

Which brings us then to extension functions. Pipes and higher order functions,
or first-arg pipes, can act as a sort of "junior" extension functions, but for
the reasons listed above fall short of being real extension functions.

For comparison, extension functions in Kotlin look like this:

    fun SomeType.foo(a: Int) {
        // a is a variable. "this" is the SomeType the function was called on.
        // However, this is still "external" scope, so only public members
        // are usable.
    }

    val s = SomeType()
    s.foo(5)

(Kotlin doesn't have a "new" keyword; the above is how you instantiate an
object.)

Arguably, Go is entirely built as extension functions. It looks like this:

    func (st SomeType) foo(a int) {
        // st and a are both variables here. Do as you will.
    }

Notably for us, the same function can be defined multiple times against
different types. That allows the system to differentiate between A.foo() and
B.foo(). You can also attach extension functions to interfaces. In fact, most
of Kotlin's collections (list, set, map) API is implemented as extension
functions on interfaces, of which they have many.

However, both Go and Kotlin are compiled languages, which means the compiler
has a complete view of the code at compile time, and can sort out which
extension function to use in a given situation statically. That is, of course,
not the case in PHP.

That means even if we figure out a way to define multiple foo() functions that
apply to different types, and can agree that doing so is not evil (some have
argued it's too close to function/method overloading, which they claim is evil;
I disagree with both points), there is still a very non-trivial task of
figuring out how to resolve the function to call at runtime, probably somehow
leveraging autoloading, which also then runs us up against function
autoloading, etc. I hope that is a solvable problem, but I don't currently
know how to solve it.
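Just to show the shape of the problem, here is a deliberately naive userland
sketch of runtime resolution. Everything in it (the Extensions class,
register(), call()) is hypothetical, and it sidesteps the hard part entirely:
the extensions have to be registered eagerly, so there is no autoloading story
at all.

```php
<?php
// Hypothetical registry: type name => function name => callable.
final class Extensions
{
    private static array $table = [];

    public static function register(string $type, string $name, callable $fn): void
    {
        self::$table[$type][$name] = $fn;
    }

    // Resolve at call time: the object's own class first, then its parent
    // classes, then the interfaces it implements.
    public static function call(object $subject, string $name, mixed ...$args): mixed
    {
        $types = array_merge(
            [get_class($subject)],
            array_values(class_parents($subject)),
            array_values(class_implements($subject)),
        );
        foreach ($types as $type) {
            if (isset(self::$table[$type][$name])) {
                return (self::$table[$type][$name])($subject, ...$args);
            }
        }
        throw new BadFunctionCallException(
            "No extension $name() for " . get_class($subject)
        );
    }
}

final class A {}
final class B {}

// The same function name, defined separately against two types.
Extensions::register(A::class, 'foo', fn(A $a, int $n): string => "A.foo($n)");
Extensions::register(B::class, 'foo', fn(B $b, int $n): string => "B.foo($n)");

$fromA = Extensions::call(new A(), 'foo', 5);
$fromB = Extensions::call(new B(), 'foo', 5);
```

Even this toy version has to walk the type hierarchy on every call, which hints
at why doing it properly in the engine is a non-trivial task.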
So "real" extension functions are an epic unto themselves, even though I really
really want them. (They are fantastically ergonomic for converting from one
representation to another, like from an ORM entity to a minimal struct to
serialize as JSON, and vice versa. I quite miss them from Kotlin.)

It would be really nice if we could follow Kotlin's example and build 3
different collection types (likely via objects), and then build most of the API
for them in extension functions rather than as methods. However, that sounds
harder every time I dig into it.

As a side note to Yakov[2], a Uniform Function Call Syntax in PHP would have
all the same problems as extension functions, even before we get into the issue
that Rowan, Tim, and others have brought up that PHP is wildly inconsistent in
having the "subject" first in a function call. Without that, UFCS doesn't make
much sense. While I appreciate the elegance of it, in practice, figuring out
extension functions as a dedicated syntax (akin to Kotlin or Go above) is
probably the best we could do, if we can even do that.

All of which is to say... I think I may have talked myself back around to just
using basic unary function pipes and "sucking it up" on the extra call for higher
order functions for now, unless someone can show a fair number of non-iterable
use cases where it would be helpful. That then would unblock the other
incremental improvements listed in the RFC (compose, PFA, and $$->foo()). True
extension functions could then be explored later (likely by people with way
more engine knowledge than me) as their own thing, whether using ->, +>, or
something else entirely. We just need to agree that the existence of pipes
does not render extension functions moot.
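For reference, the "extra call" style looks roughly like this. mapWith(),
filterWith(), and pipeline() are hypothetical helpers; pipeline() just stands
in for a chain of |> so the snippet runs without the operator.

```php
<?php
// Hypothetical higher-order helpers: each one returns a unary closure, which
// is the "extra call" paid per pipeline stage.
function mapWith(callable $fn): Closure
{
    return fn(array $items): array => array_map($fn, $items);
}

function filterWith(callable $fn): Closure
{
    return fn(array $items): array => array_values(array_filter($items, $fn));
}

// Hypothetical stand-in for chained |> operators: feed the value through
// each unary stage in order.
function pipeline(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

$isEven = fn(int $n): bool => $n % 2 === 0;
$timesTen = fn(int $n): int => $n * 10;

// With the operator this would read:
//   [1, 2, 3, 4] |> filterWith($isEven) |> mapWith($timesTen)
$result = pipeline([1, 2, 3, 4], filterWith($isEven), mapWith($timesTen));
```

Each stage costs one call to build the closure and one to invoke it, which is
the overhead being accepted here.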
Thoughts?

--Larry Garfield

[1] https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/
[2] https://externals.io/message/127037