On Mon, Jun 10, 2024 at 9:40 PM Ben Ramsey <ram...@php.net> wrote:
>
> > On Jun 10, 2024, at 20:35, Valentin Udaltsov <udaltsov.valen...@gmail.com>
> > wrote:
> >
> > Hi, internals!
> >
> > 9 years have passed since the last discussions of case sensitive PHP:
> > https://externals.io/message/79824 and https://externals.io/message/83640.
> > Here I would like to revisit this topic.
> >
> > What is case-sensitive in PHP 8.3:
> > - variables
> > - constants (all since
> > https://wiki.php.net/rfc/case_insensitive_constant_deprecation)
> > - class constants
> > - properties
> >
> > What is case-insensitive in PHP 8.3:
> > - namespaces
> > - functions
> > - classes (including self, parent and static relative class types)
> > - methods (including the magic ones)
> >
> > Pros:
> > 1. no need to convert strings to lowercase inside the engine for name
> > lookups (a small performance and memory gain)
> > 2. better fit for case sensitive platforms that PHP code is mostly run on
> > (Linux)
> > 3. uniform handling of ASCII and non-ASCII symbols (currently non-ASCII
> > symbols in names are case sensitive: https://3v4l.org/PWkvG)
> > 4. PSR-4 compatibility
> > (https://www.php-fig.org/psr/psr-4/#:~:text=All%20class%20names%20MUST%20be%20referenced%20in%20a%20case%2Dsensitive%20fashion)
> >
> > Cons:
> > 1. pain for users, obviously
> > 2. a backward compatibility layer might be difficult to implement and/or
> > have a performance penalty
> >
> > On con 1. I think today PHP users are much more prepared for the change:
> > - more and more projects adopted namespaces and PSR-4 autoloading via
> > Composer that never supported case-insensitivity
> > (https://github.com/composer/composer/issues/1803,
> > https://github.com/composer/composer/issues/8906) which forced to mind
> > casing
> > - static analyzers became more popular and they do complain about the wrong
> > casing (see https://psalm.dev/r/fbdeee2f38 and
> > https://phpstan.org/r/1789a32d-d928-4311-b02e-155dd98afbd4)
> > - Rector appeared (it can be used to automatically prepare the codebase for
> > the next PHP version)
> >
> > On con 2. While considering different transition options proposed in prior
> > discussions (compilation flag, ini option, deprecation notice) I stumbled
> > upon Nikita's comment (https://externals.io/message/79824#79939):
> > May I recommend to only target class and class-like names for an initial
> > RFC? Those have the strongest argument in favor of case-sensitivity given
> > how current autoloader implementations work - essentially the
> > case-insensitivity doesn't properly work anyway in modern code....I'd also
> > appreciate having a voting option for removing case-insensitivity right
> > away, as opposed to throwing E_STRICT/E_DEPRECATED. If we want to change
> > this, I personally would rather drop it right away than start throwing
> > E_STRICT warnings that would make the case-insensitive usage impossible
> > anyway.
> > It makes a lot of sense to me: a fairly simple change in the core and no
> > performance penalty. At the same time, a gradual approach will reduce the
> > stress.
> >
> > So the plan for 8.4 might be to just drop case insensitivity for class
> > names and that's it... Let's discuss that!
> >
> I’m not saying I agree with or support this, but I think your proposal has a
> better chance of being accepted if you target PHP 9.0 instead of 8.4.
>
> Cheers,
> Ben
>
In fact, it's definitely a BC break, and I would not personally vote for it in 8.4. This isn't some minor thing squirreled away in a library; this is the core language, with wide impact. For this reason, I believe it should target 9.0.

I will happily vote for this feature, as long as the patch is reasonable. The most obvious implementation is not very good, though. The engine uses lowercased names to implement case insensitivity, and namespaces are embedded in the type names. To lowercase the namespace but not the type name, one could scan the type name in reverse for the last namespace separator, and then lowercase everything from the start up to the index of that separator. For example, "Psr\Log\LoggerInterface" needs to become "psr\log\LoggerInterface". The problem is that this won't really save CPU or memory, because the namespace still has to be lowercased.

We could refactor the engine to store the namespace separately from the type name. This is a lot more work and will increase the size of some types, which might be difficult at a technical level.

I can't think of other implementations right now. If nobody can come up with a better one, I think we should consider split sensitivity for namespaces, where a namespace's sensitivity matches that of the thing it is attached to: a namespaced class would have a case-sensitive namespace, but a namespaced function would still have a case-insensitive one.
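
To make the two options concrete, here is a rough sketch in Python. It is not engine code, and every name in it (`ascii_lower`, `namespace_lowered_key`, `split_sensitivity_key`) is made up for illustration. It assumes lowercasing only touches ASCII letters, which matches the observation quoted above that non-ASCII symbols in names are currently case sensitive.

```python
import string

# ASCII-only lowercasing table (assumption: non-ASCII bytes are left untouched).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s: str) -> str:
    """Lowercase A-Z only; leave every other character as written."""
    return s.translate(_ASCII_LOWER)


def namespace_lowered_key(name: str) -> str:
    """Reverse-scan option: lowercase the namespace, keep the type name's case."""
    sep = name.rfind("\\")  # index of the last namespace separator, or -1
    if sep == -1:
        return name  # no namespace, so nothing to lowercase
    return ascii_lower(name[:sep]) + name[sep:]


def split_sensitivity_key(name: str, is_class: bool) -> str:
    """Split-sensitivity option: the namespace follows the thing it is attached to."""
    if is_class:
        return name  # class: namespace and class name are both case sensitive
    return ascii_lower(name)  # function: namespace and function name stay insensitive


print(namespace_lowered_key("Psr\\Log\\LoggerInterface"))  # psr\log\LoggerInterface
```

Note that the first function still walks and lowercases the whole namespace prefix on every lookup, which is why it doesn't buy much. The second one does no work at all for classes, at the cost of namespaces behaving differently depending on what they contain.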