Great catch Ilija! Do you mind sharing how did you stumble upon it?
Thank you! On Wed, Jul 5, 2023, 06:31 Dmitry Stogov <dmitrysto...@gmail.com> wrote: > On Wed, Jul 5, 2023 at 1:15 AM Ilija Tovilo <tovilo.il...@gmail.com> > wrote: > > > Hi everyone > > > > I recently discovered some unfortunate behavior of the coalesce > > assignment operator (??=) in combination with function calls. Here's > > the TL;DR: > > > > foo()['bar'] ??= 42; > > > > Currently, this code calls foo() twice. This seems rather unexpected. > > The technical reason as to why this happens is not straight-forward, > > but I will attempt to explain below. The behavior was not specified in > > the RFC (https://wiki.php.net/rfc/null_coalesce_equal_operator) and is > > completely untested, and as such I don't believe it is by design. My > > proposal is to change it so that foo() is only called once. > > > > This is what is happening in detail. > > > > ??= is special in that it needs to evaluate the lhs (left hand side) > > twice. At first, we need to check if the offset exists, then > > conditionally execute the rhs (right hand side), re-fetch the offset > > and assign the rhs value to it. The reason for the re-fetching of the > > offset is that the evaluation of the rhs may invalidate the offset. > > This is explained in the following blog post: > > > > > https://www.npopov.com/2017/04/14/PHP-7-Virtual-machine.html#writes-and-memory-safety > > Essentially, the offset may be a pointer into an array element or > > object property. If the rhs frees the array or object, or grows the > > array causing a reallocation (meaning it is moved to some other place > > in memory), the pointer is no longer valid. For this reason, PHP makes > > sure no user code may execute between the fetching of an offset and > > the assignment to it. Normally, that just means evaluating the rhs > > before fetching the offset. In this case, we need to evaluate the lhs > > first to know if we even should evaluate the rhs. > > > > Naively evaluating the lhs again poses a problem for expressions with > > side-effects. For example: > > > > $array[$x++] ??= 42; > > > > We do not want to re-evaluate the entire expression because $x++ will > > lead to a different array offset the second time around. The way this > > is solved is by "memoizing" any compiled expression in the lhs that is > > *not* a variable, meaning not part of the offset that may be > > invalidated. Internally, a variable is considered anything that may be > > written to, i.e. local variables ($foo), properties ($foo->bar, > > Foo::$bar), array offsets ($foo['bar']), and function calls (foo(), > > $foo->bar(), Foo::bar(), $foo(), as they may return a modifiable > > reference). The fact that function calls are included in that list > > leads to the problem presented above. It is not actually necessary to > > exclude them from memoization because their result may not be > > invalidated. > > > > Another inconsistency is that function call arguments will be > > re-evaluated, but only if they are not part of some other expression. > > > > a. foo(bar())['baz'] ??= 42; > > b. foo(bar() + 0)['baz'] ??= 42; > > > > a calls both foo() and bar() twice. b however calls foo() twice but > > bar() only once. That is because the expression bar() + 0 is *not* > > considered a variable and as such gets memoized. > > > > > This is definitely a bug in the original implementation. > In case a function is evaluated twice and returns different values, we > check one value, but assign to another. > > > > I propose to unconditionally memoize calls (in all forms) when they > > appear in the lhs of a coalesce expression. This will ensure that > > calls are only executed once, including function arguments and the lhs > > of method calls. Consequently, the assignment will be performed on the > > same offset that was previously tested, even if the expression > > contains a function call with side-effects. > > > > The implementation for this change is simple: > > https://github.com/php/php-src/pull/11592 > > > > Let me know if you have any concerns. I'm planning on merging this for > > master if there is consensus on the semantics. > > > > +1 > > Thanks. Dmitry. > > > > Ilija > > > > -- > > PHP Internals - PHP Runtime Development Mailing List > > To unsubscribe, visit: https://www.php.net/unsub.php > > > > >