> On 25 Aug 2024, at 00:01, Ilija Tovilo <tovilo.il...@gmail.com> wrote:
> 
> Hi Stephen
> 
> On Sat, Aug 24, 2024 at 1:54 PM Stephen Reay <php-li...@koalephant.com> wrote:
>> 
>> Thanks for clarifying. Out of curiosity, how much optimisation do you 
>> imagine would be possible if the lookups were done the same was as classes 
>> (ie no fallback, names must be local, qualified or imported with `use`)?
> 
> I haven't measured this case specifically, but if unqualified calls to
> local functions are indeed rare (which the last analysis seems to
> indicate), then it should make barely any difference. Of course, if
> your code makes lots of use of them, then the story might be
> different. That said, the penalty of an ambiguous internal call is
> much higher than that of a user, local call, given that internal calls
> sometimes have special optimizations or can even be entirely executed
> at compile time. For local calls, it will simply lead to a double
> lookup on first execution.
> 
>> I am aware this is a BC break. But if it's kosher to discuss introducing a 
>> never ending BC break I don't see why this isn't a valid discussion either. 
>> It would give *everyone* that elusive 2-4% performance boost, would resolve 
>> any ambiguity about which function a person intended to call (the claimed 
>> security issue) and would bring consistency with the way classes/etc are 
>> referenced.
> 
>> From my analysis, there were 2 967 unqualified calls to local
> functions in the top 1 000 repositories. (Disclaimer: There might be a
> "use function" at the top for some of these, the analysis isn't that
> sophisticated.)
> 
> I also ran the script to check for unqualified calls to global
> functions (or at least functions that weren't statically visible in
> that scope in any of the repositories files), and there were ~139 000
> of them. It seems like this is quite a different beast. To summarize:
> 
> 1. Flipping lookup order: ~a few dozens of changes
> 2. Global only: ~3 000 changes
> 3. Local only: ~139 000 changes
> 
> While much of this can be automated, huge diffs still require
> reviewing time, and can lead to many merge conflicts which also take
> time to resolve. I would definitely prefer to go with 1. or
> potentially 2.
> 
> Ilija
> 


Hi Ilija,

I understand that a change like (3) is a huge BC break, and as I said earlier, 
I wasn't actually suggesting that is the action to take, because I don't think 
there is sufficient reason to take *any* action. But given that some people in 
this thread seem convinced that *a* change to functionality is apparently 
required, I do think every potential change, and it's pros and cons, should be 
discussed.


As I've said numerous times, and been either outright dismissed or ignored: 
there has been a consistent push from a non-trivial number of internals members 
that userland developers should make better use of regular functions, rather 
than using classes as fancy namespaces. There was a recent RFC vote that 
implicitly endorsed this opinion.

Right now, the lookup rules make namespaced regular functions a consistent 
experience for developers, but the lack of autoload makes it unpopular, and the 
lack of visibility for such symbols can be problematic. 

With the change you're proposing, there will be *another* hurdle that makes the 
use of regular namespaced functions harder/less intuitive, or potentially (with 
option 1) unpredictable over PHP versions, due to the constant threat of BC 
breaks due to new builtin functions - right when we have not one but two RFCs 
for function autoloading (arguably the biggest barrier to their increased usage 
in userland).



So the whole reason I asked about (3) is because it would
- (a) bring consistency with class/interface/trait symbols;
- (b) inherently bring the much desired 2% performance boost for function 
calls, because people would be forced to qualify the names;
- (c) have zero risk of of future WTF BC break when a new global function 
interrupting local function lookups;
- (d) have no need for a new "simpler" qualifying syntax (you can't get shorter 
than 1 character);
- (e) presumably simplify function autoloading, because there's no longer any 
"fallback" step to worry about before triggering an autoloader;
- (e) even solve the "security" concerns John raised, because the developer 
would be forced to qualify their usage if they wanted to use the builtin 
function - their intent is always explicit, never guessed.



Yes, it is a huge BC break in terms of the amount of code that's affected. But 
it's almost certainly one of the simplest BC break to "fix" in the history of 
PHP BC breaks.


How much code was affected when register globals was removed? Or when magic 
quotes was removed? Or when short codes were removed? 

Surely any change being proposed here would mean a deprecation notice in the 
next release after 8.4, and then whatever actual change is proposed, in the 
next major version after that. So possibly 8.5 and then 9.0, but potentially 
9.0 and then 10.0.


If either of (1) or (2) is chosen, and the "acceptability" of such a choice 
depends on something less verbose than "namespace\" to qualify a local 
function, projects literally can't future proof (or 
no-deprecation-notice-proof, if you prefer) their code against the eventual 
flip until that change is implemented - in a scenario where a deprecation (and 
new local qualifier) goes out in 2025 as part of 8.5, and a flip happens in 
2026 as part of 9.0, that would cuts the time projects have to effectively 
adapt, in half, and it means any code that's updated for it, can't make use of 
the new "less verbose" local qualifier if they also need to support versions 
prior to it being available.

If it happened to be 9.0 and 10.0 being the deprecation and "change" versions, 
obviously people have longer to make the required change - but that argument 
cuts both ways. If you have 5 years to change every `strlen` to `\strlen` it's 
hardly going to cause a huge and sudden swath of noise in revision history. I 
would imagine most projects would just adopt a new code style, and prefixing 
with `\` would occur automatically whenever a file is otherwise modified by a 
developer.


There's also an impact on internals development/RFC with either (1) or (2): 
*any* proposed new global function in the standard library now has a BC barrier 
to pass if it *might* conflict with one defined by anyone in userland, in any 
namespace. JS is a living embodiment of of this problem: see String#includes, 
Array#includes, and Array#flat - and that's with people doing the "wrong thing" 
(extending builtin JS prototypes is arguably the same as using the `\php` 
namespace)



Multiple people have lamented the way function fallbacks were originally 
implemented. If you're going to insist on making a change, let's at least aim 
for a change that brings *MORE* consistency, and fixes a previous mistake, 
rather than adding a brand new inconsistency and who knows how many years of 
unexpected BC breaks for unsuspecting userland developers - who apparently 
*already* stuggle to understand the way symbol lookup happens -  into the 
future, and adding yet *another* reason for people to not use namespaced 
functions.



Cheers


Stephen 

Reply via email to