Great catch Ilija!

Do you mind sharing how did you stumble upon it?

Thank you!

On Wed, Jul 5, 2023, 06:31 Dmitry Stogov <dmitrysto...@gmail.com> wrote:

> On Wed, Jul 5, 2023 at 1:15 AM Ilija Tovilo <tovilo.il...@gmail.com>
> wrote:
>
> > Hi everyone
> >
> > I recently discovered some unfortunate behavior of the coalesce
> > assignment operator (??=) in combination with function calls. Here's
> > the TL;DR:
> >
> > foo()['bar'] ??= 42;
> >
> > Currently, this code calls foo() twice. This seems rather unexpected.
> > The technical reason as to why this happens is not straight-forward,
> > but I will attempt to explain below. The behavior was not specified in
> > the RFC (https://wiki.php.net/rfc/null_coalesce_equal_operator) and is
> > completely untested, and as such I don't believe it is by design. My
> > proposal is to change it so that foo() is only called once.
> >
> > This is what is happening in detail.
> >
> > ??= is special in that it needs to evaluate the lhs (left hand side)
> > twice. At first, we need to check if the offset exists, then
> > conditionally execute the rhs (right hand side), re-fetch the offset
> > and assign the rhs value to it. The reason for the re-fetching of the
> > offset is that the evaluation of the rhs may invalidate the offset.
> > This is explained in the following blog post:
> >
> >
> https://www.npopov.com/2017/04/14/PHP-7-Virtual-machine.html#writes-and-memory-safety
> > Essentially, the offset may be a pointer into an array element or
> > object property. If the rhs frees the array or object, or grows the
> > array causing a reallocation (meaning it is moved to some other place
> > in memory), the pointer is no longer valid. For this reason, PHP makes
> > sure no user code may execute between the fetching of an offset and
> > the assignment to it. Normally, that just means evaluating the rhs
> > before fetching the offset. In this case, we need to evaluate the lhs
> > first to know if we even should evaluate the rhs.
> >
> > Naively evaluating the lhs again poses a problem for expressions with
> > side-effects. For example:
> >
> > $array[$x++] ??= 42;
> >
> > We do not want to re-evaluate the entire expression because $x++ will
> > lead to a different array offset the second time around. The way this
> > is solved is by "memoizing" any compiled expression in the lhs that is
> > *not* a variable, meaning not part of the offset that may be
> > invalidated. Internally, a variable is considered anything that may be
> > written to, i.e. local variables ($foo), properties ($foo->bar,
> > Foo::$bar), array offsets ($foo['bar']), and function calls (foo(),
> > $foo->bar(), Foo::bar(), $foo(), as they may return a modifiable
> > reference). The fact that function calls are included in that list
> > leads to the problem presented above. It is not actually necessary to
> > exclude them from memoization because their result may not be
> > invalidated.
> >
> > Another inconsistency is that function call arguments will be
> > re-evaluated, but only if they are not part of some other expression.
> >
> > a. foo(bar())['baz'] ??= 42;
> > b. foo(bar() + 0)['baz'] ??= 42;
> >
> > a calls both foo() and bar() twice. b however calls foo() twice but
> > bar() only once. That is because the expression bar() + 0 is *not*
> > considered a variable and as such gets memoized.
> >
> >
> This is definitely a bug in the original implementation.
> In case a function is evaluated twice and returns different values, we
> check one value, but assign to another.
>
>
> > I propose to unconditionally memoize calls (in all forms) when they
> > appear in the lhs of a coalesce expression. This will ensure that
> > calls are only executed once, including function arguments and the lhs
> > of method calls. Consequently, the assignment will be performed on the
> > same offset that was previously tested, even if the expression
> > contains a function call with side-effects.
> >
> > The implementation for this change is simple:
> > https://github.com/php/php-src/pull/11592
> >
> > Let me know if you have any concerns. I'm planning on merging this for
> > master if there is consensus on the semantics.
> >
>
> +1
>
> Thanks. Dmitry.
>
>
> > Ilija
> >
> > --
> > PHP Internals - PHP Runtime Development Mailing List
> > To unsubscribe, visit: https://www.php.net/unsub.php
> >
> >
>

Reply via email to