> On 25 Aug 2024, at 00:01, Ilija Tovilo <tovilo.il...@gmail.com> wrote:
>
> Hi Stephen
>
> On Sat, Aug 24, 2024 at 1:54 PM Stephen Reay <php-li...@koalephant.com> wrote:
>>
>> Thanks for clarifying. Out of curiosity, how much optimisation do you
>> imagine would be possible if the lookups were done the same was as classes
>> (ie no fallback, names must be local, qualified or imported with `use`)?
>
> I haven't measured this case specifically, but if unqualified calls to
> local functions are indeed rare (which the last analysis seems to
> indicate), then it should make barely any difference. Of course, if
> your code makes lots of use of them, then the story might be
> different. That said, the penalty of an ambiguous internal call is
> much higher than that of a user, local call, given that internal calls
> sometimes have special optimizations or can even be entirely executed
> at compile time. For local calls, it will simply lead to a double
> lookup on first execution.
>
>> I am aware this is a BC break. But if it's kosher to discuss introducing a
>> never ending BC break I don't see why this isn't a valid discussion either.
>> It would give *everyone* that elusive 2-4% performance boost, would resolve
>> any ambiguity about which function a person intended to call (the claimed
>> security issue) and would bring consistency with the way classes/etc are
>> referenced.
>
>> From my analysis, there were 2 967 unqualified calls to local
> functions in the top 1 000 repositories. (Disclaimer: There might be a
> "use function" at the top for some of these, the analysis isn't that
> sophisticated.)
>
> I also ran the script to check for unqualified calls to global
> functions (or at least functions that weren't statically visible in
> that scope in any of the repositories files), and there were ~139 000
> of them. It seems like this is quite a different beast. To summarize:
>
> 1. Flipping lookup order: ~a few dozens of changes
> 2. Global only: ~3 000 changes
> 3. Local only: ~139 000 changes
>
> While much of this can be automated, huge diffs still require
> reviewing time, and can lead to many merge conflicts which also take
> time to resolve. I would definitely prefer to go with 1. or
> potentially 2.
>
> Ilija
>
Hi Ilija,
I understand that a change like (3) is a huge BC break, and as I said earlier,
I wasn't actually suggesting that is the action to take, because I don't think
there is sufficient reason to take *any* action. But given that some people in
this thread seem convinced that *a* change to functionality is apparently
required, I do think every potential change, and it's pros and cons, should be
discussed.
As I've said numerous times, and been either outright dismissed or ignored:
there has been a consistent push from a non-trivial number of internals members
that userland developers should make better use of regular functions, rather
than using classes as fancy namespaces. There was a recent RFC vote that
implicitly endorsed this opinion.
Right now, the lookup rules make namespaced regular functions a consistent
experience for developers, but the lack of autoload makes it unpopular, and the
lack of visibility for such symbols can be problematic.
With the change you're proposing, there will be *another* hurdle that makes the
use of regular namespaced functions harder/less intuitive, or potentially (with
option 1) unpredictable over PHP versions, due to the constant threat of BC
breaks due to new builtin functions - right when we have not one but two RFCs
for function autoloading (arguably the biggest barrier to their increased usage
in userland).
So the whole reason I asked about (3) is because it would
- (a) bring consistency with class/interface/trait symbols;
- (b) inherently bring the much desired 2% performance boost for function
calls, because people would be forced to qualify the names;
- (c) have zero risk of of future WTF BC break when a new global function
interrupting local function lookups;
- (d) have no need for a new "simpler" qualifying syntax (you can't get shorter
than 1 character);
- (e) presumably simplify function autoloading, because there's no longer any
"fallback" step to worry about before triggering an autoloader;
- (e) even solve the "security" concerns John raised, because the developer
would be forced to qualify their usage if they wanted to use the builtin
function - their intent is always explicit, never guessed.
Yes, it is a huge BC break in terms of the amount of code that's affected. But
it's almost certainly one of the simplest BC break to "fix" in the history of
PHP BC breaks.
How much code was affected when register globals was removed? Or when magic
quotes was removed? Or when short codes were removed?
Surely any change being proposed here would mean a deprecation notice in the
next release after 8.4, and then whatever actual change is proposed, in the
next major version after that. So possibly 8.5 and then 9.0, but potentially
9.0 and then 10.0.
If either of (1) or (2) is chosen, and the "acceptability" of such a choice
depends on something less verbose than "namespace\" to qualify a local
function, projects literally can't future proof (or
no-deprecation-notice-proof, if you prefer) their code against the eventual
flip until that change is implemented - in a scenario where a deprecation (and
new local qualifier) goes out in 2025 as part of 8.5, and a flip happens in
2026 as part of 9.0, that would cuts the time projects have to effectively
adapt, in half, and it means any code that's updated for it, can't make use of
the new "less verbose" local qualifier if they also need to support versions
prior to it being available.
If it happened to be 9.0 and 10.0 being the deprecation and "change" versions,
obviously people have longer to make the required change - but that argument
cuts both ways. If you have 5 years to change every `strlen` to `\strlen` it's
hardly going to cause a huge and sudden swath of noise in revision history. I
would imagine most projects would just adopt a new code style, and prefixing
with `\` would occur automatically whenever a file is otherwise modified by a
developer.
There's also an impact on internals development/RFC with either (1) or (2):
*any* proposed new global function in the standard library now has a BC barrier
to pass if it *might* conflict with one defined by anyone in userland, in any
namespace. JS is a living embodiment of of this problem: see String#includes,
Array#includes, and Array#flat - and that's with people doing the "wrong thing"
(extending builtin JS prototypes is arguably the same as using the `\php`
namespace)
Multiple people have lamented the way function fallbacks were originally
implemented. If you're going to insist on making a change, let's at least aim
for a change that brings *MORE* consistency, and fixes a previous mistake,
rather than adding a brand new inconsistency and who knows how many years of
unexpected BC breaks for unsuspecting userland developers - who apparently
*already* stuggle to understand the way symbol lookup happens - into the
future, and adding yet *another* reason for people to not use namespaced
functions.
Cheers
Stephen