On Thu, Jul 25, 2019 at 4:41 PM Rowan Collins <rowan.coll...@gmail.com> wrote:
> On Thu, 25 Jul 2019 at 14:48, Nikita Popov <nikita....@gmail.com> wrote:
>
> > I think nowadays it is well known that by-reference passing is to be
> > avoided and I don't see it particularly commonly in user code.
> > By-reference passing is mainly used when it is needed to interact with
> > existing by-reference functions such as preg_match(). We can hardly
> > switch these functions to use out/inout if we require the corresponding
> > keyword on the call-site.
>
> I guess the call-site syntax would still need to be opt-in for
> compatibility reasons, but we could definitely mark the parameters as "out"
> in internal functions, even if that was mainly a documentation / reflection
> change.
>
> That would stop people having to write `$matches = []; preg_match($foo,
> $bar, $matches);` to ensure that the output parameter is initialised. I
> have been annoyed by that more often than I've encountered a function where
> I wasn't sure if the parameter was by-reference or not.

Eww, please don't write code like that...

> > This proposal (in conjunction with the option to make it required) would
> > solve the main issues I have with the by-reference passing implementation
>
> If this remains optional, I wouldn't have much appetite for using it,
> because the benefit feels very slight. The fact that it wouldn't always be
> mandatory makes the benefit even slighter, since you still couldn't look at
> foo($bar) and know whether it was by-reference without also knowing what
> declare options were in scope.

For a drive-by contribution to an open-source project? Maybe not. For
anything that you want to work on seriously (say your own code or your
employer's code), you'll want to check the declares and then have guarantees
on how the language behaves.

It's somewhat off-topic, but as George mentioned in his email, this doesn't
have to be an agglomeration of individual declares that are randomly flipped
on and off: It could also be something like a "language level" (like editions
in Rust). To stick with the analogy of Rust editions, think of it as switching
your project to PHP 2020 -- where passing by-reference requires a call-site
annotation, use of dynamic properties throws, operators have stricter type
requirements, etc... So if you see foo($x) in your code, that's definitely a
by-value pass!

> > the out/inout approach is a refinement over that, but I'm not convinced
> > that it a worthwhile refinement relative to the language and engine
> > complexity it will introduce.
>
> Would it really be that complex? The only real difference between "out" and
> "&" would be automatically setting the variable to null when it was passed
> to the function.

That depends on how the feature is supposed to work. For me, the main point
of having out/inout would be a move away from references, so this would
require the implementation of an entirely new calling convention for out and
inout parameters. I would expect that

    $z = foo($x, out $y)

would translate (in terms of behavior, not actual implementation) to
something like

    [$z, $y] = foo($x)

and

    $z = foo($x, inout $y)

to

    [$z, $y] = foo($x, $y)

The behavior of type annotations should also change: "out T $x" should check
that the value assigned to $x on function exit (or possibly on every write to
the variable?) is T. "inout T $x" should check that $x is T on entry, and
also on exit (or on every assignment).

This would be a pretty non-trivial change. The technically hardest parts
would be the changes to type-hint behavior (depending on semantic details)
and making this work if call-site annotations are missing (very hard). I
think we should only do this if the call-site annotations are required (still
leaving the problem of old functions).
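
To make the intended behavior concrete, here is a rough sketch in today's PHP
of what the translation described above could look like when written out by
hand. The function names parseLeadingInt() and appendItem() are made up for
illustration, and the out/inout signatures shown in the comments are
hypothetical syntax, not something PHP supports. No references are involved:
the "out" and "inout" values simply travel back in the returned array.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for:
//   function parseLeadingInt(string $s, out int $length): ?int
// The "out" value is returned next to the normal return value instead of
// being written through a reference.
function parseLeadingInt(string $s): array
{
    $length = strspn($s, '0123456789');   // number of leading digits in $s
    $value  = $length > 0 ? (int) substr($s, 0, $length) : null;
    return [$value, $length];             // [return value, out value]
}

// Hypothetical stand-in for:
//   function appendItem(string $item, inout array $list): int
// The "inout" value is passed by value and handed back in the result.
function appendItem(string $item, array $list): array
{
    $list[] = $item;                      // changes the local copy only
    return [count($list), $list];         // [return value, inout value]
}

// $value = parseLeadingInt('42abc', out $length) would behave like:
[$value, $length] = parseLeadingInt('42abc');

// $count = appendItem('b', inout $items) would behave like:
$items = ['a'];
[$count, $items] = appendItem('b', $items);
```

In this form the caller's variables only change through the assignment at the
call site, which is the property that a by-reference parameter cannot give
you.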

Of course, that's just what I have in mind ... the alternative (and likely
what you have in mind) is to make out/inout parameters basically normal
by-reference parameters, with the only difference that inout uses an RW fetch
instead of a W fetch and thus throws a notice if the referenced variable does
not exist. That's certainly a possibility (and technically much simpler), but
I also think that it squanders most of the potential behind out/inout
parameters.

Nikita