On Sat, Aug 24, 2024, at 20:16, Stephen Reay wrote: > > > > On 25 Aug 2024, at 00:01, Ilija Tovilo <tovilo.il...@gmail.com> wrote: > > > > Hi Stephen > > > > On Sat, Aug 24, 2024 at 1:54 PM Stephen Reay <php-li...@koalephant.com> > > wrote: > >> > >> Thanks for clarifying. Out of curiosity, how much optimisation do you > >> imagine would be possible if the lookups were done the same was as classes > >> (ie no fallback, names must be local, qualified or imported with `use`)? > > > > I haven't measured this case specifically, but if unqualified calls to > > local functions are indeed rare (which the last analysis seems to > > indicate), then it should make barely any difference. Of course, if > > your code makes lots of use of them, then the story might be > > different. That said, the penalty of an ambiguous internal call is > > much higher than that of a user, local call, given that internal calls > > sometimes have special optimizations or can even be entirely executed > > at compile time. For local calls, it will simply lead to a double > > lookup on first execution. > > > >> I am aware this is a BC break. But if it's kosher to discuss introducing a > >> never ending BC break I don't see why this isn't a valid discussion > >> either. It would give *everyone* that elusive 2-4% performance boost, > >> would resolve any ambiguity about which function a person intended to call > >> (the claimed security issue) and would bring consistency with the way > >> classes/etc are referenced. > > > >> From my analysis, there were 2 967 unqualified calls to local > > functions in the top 1 000 repositories. (Disclaimer: There might be a > > "use function" at the top for some of these, the analysis isn't that > > sophisticated.) > > > > I also ran the script to check for unqualified calls to global > > functions (or at least functions that weren't statically visible in > > that scope in any of the repositories files), and there were ~139 000 > > of them. It seems like this is quite a different beast. To summarize: > > > > 1. Flipping lookup order: ~a few dozens of changes > > 2. Global only: ~3 000 changes > > 3. Local only: ~139 000 changes > > > > While much of this can be automated, huge diffs still require > > reviewing time, and can lead to many merge conflicts which also take > > time to resolve. I would definitely prefer to go with 1. or > > potentially 2. > > > > Ilija > > > > > Hi Ilija, > > I understand that a change like (3) is a huge BC break, and as I said > earlier, I wasn't actually suggesting that is the action to take, because I > don't think there is sufficient reason to take *any* action. But given that > some people in this thread seem convinced that *a* change to functionality is > apparently required, I do think every potential change, and it's pros and > cons, should be discussed. > > > As I've said numerous times, and been either outright dismissed or ignored: > there has been a consistent push from a non-trivial number of internals > members that userland developers should make better use of regular functions, > rather than using classes as fancy namespaces. There was a recent RFC vote > that implicitly endorsed this opinion. > > Right now, the lookup rules make namespaced regular functions a consistent > experience for developers, but the lack of autoload makes it unpopular, and > the lack of visibility for such symbols can be problematic. > > With the change you're proposing, there will be *another* hurdle that makes > the use of regular namespaced functions harder/less intuitive, or potentially > (with option 1) unpredictable over PHP versions, due to the constant threat > of BC breaks due to new builtin functions - right when we have not one but > two RFCs for function autoloading (arguably the biggest barrier to their > increased usage in userland). > > > > So the whole reason I asked about (3) is because it would > - (a) bring consistency with class/interface/trait symbols; > - (b) inherently bring the much desired 2% performance boost for function > calls, because people would be forced to qualify the names; > - (c) have zero risk of of future WTF BC break when a new global function > interrupting local function lookups; > - (d) have no need for a new "simpler" qualifying syntax (you can't get > shorter than 1 character); > - (e) presumably simplify function autoloading, because there's no longer any > "fallback" step to worry about before triggering an autoloader; > - (e) even solve the "security" concerns John raised, because the developer > would be forced to qualify their usage if they wanted to use the builtin > function - their intent is always explicit, never guessed. > > > > Yes, it is a huge BC break in terms of the amount of code that's affected. > But it's almost certainly one of the simplest BC break to "fix" in the > history of PHP BC breaks. > > > How much code was affected when register globals was removed? Or when magic > quotes was removed? Or when short codes were removed? > > Surely any change being proposed here would mean a deprecation notice in the > next release after 8.4, and then whatever actual change is proposed, in the > next major version after that. So possibly 8.5 and then 9.0, but potentially > 9.0 and then 10.0. > > > If either of (1) or (2) is chosen, and the "acceptability" of such a choice > depends on something less verbose than "namespace\" to qualify a local > function, projects literally can't future proof (or > no-deprecation-notice-proof, if you prefer) their code against the eventual > flip until that change is implemented - in a scenario where a deprecation > (and new local qualifier) goes out in 2025 as part of 8.5, and a flip happens > in 2026 as part of 9.0, that would cuts the time projects have to effectively > adapt, in half, and it means any code that's updated for it, can't make use > of the new "less verbose" local qualifier if they also need to support > versions prior to it being available. > > If it happened to be 9.0 and 10.0 being the deprecation and "change" > versions, obviously people have longer to make the required change - but that > argument cuts both ways. If you have 5 years to change every `strlen` to > `\strlen` it's hardly going to cause a huge and sudden swath of noise in > revision history. I would imagine most projects would just adopt a new code > style, and prefixing with `\` would occur automatically whenever a file is > otherwise modified by a developer. > > > There's also an impact on internals development/RFC with either (1) or (2): > *any* proposed new global function in the standard library now has a BC > barrier to pass if it *might* conflict with one defined by anyone in > userland, in any namespace. JS is a living embodiment of of this problem: see > String#includes, Array#includes, and Array#flat - and that's with people > doing the "wrong thing" (extending builtin JS prototypes is arguably the same > as using the `\php` namespace) > > > > Multiple people have lamented the way function fallbacks were originally > implemented. If you're going to insist on making a change, let's at least aim > for a change that brings *MORE* consistency, and fixes a previous mistake, > rather than adding a brand new inconsistency and who knows how many years of > unexpected BC breaks for unsuspecting userland developers - who apparently > *already* stuggle to understand the way symbol lookup happens - into the > future, and adding yet *another* reason for people to not use namespaced > functions. > > > > Cheers > > > Stephen
It may be worth waiting for function autoloading, to be honest. One of the nice things about it is that you get called when using non-qualified globals. This makes it very easy for an autoloader to start forcing qualified globals and emitting warnings/exceptions. I have a feeling that, eventually, if function autoloading gets more use and accepted into php, we will see people using more and more qualified globals. Ergo, I suspect option (3) will become the default, eventually. Unless (2) is chosen, of course. — Rob