On 16 December 2022 13:55:02 GMT, Derick Rethans <der...@php.net> wrote:
>I do not want a polyfill. These already exist for intl and friends.

I think you misunderstood what I meant by "polyfill"; I meant in the sense that once the real implementation gets included in, say, PHP 8.3, users needing to support, say, PHP 8.0, will have a drop-in implementation with exactly the same interface.

Anyway, that was just an aside; my main point is that a single-page RFC, and a single mailing list thread, are probably not sufficient to iterate on this design. A prototype, or even just a repo with stubs for the methods, would give us better ways to track all the different details and ideas.

>I disagree. Users should not care what is used in the implementation.
>It's only UTF-16 because that is what ICU's API uses. I do not want the
>complexity of having different in/ex encodings. Perhaps 15 years ago
>that was useful to have, but right now, everything should be UTF-8 on
>the interface layer, that is, if you care about internationalisation.

UTF-8 should definitely be the default, but I disagree that all other encodings can simply be ignored, and that users should be punished for using them with extra CPU time spent converting to UTF-8 and back again. All it would need is an optional argument on a couple of methods to specify that you want some other encoding.

>A locale/collator is an inherent property of Text (we're dealing with
>Text here, not strings).

Is it though? It makes some sense to say "this is a Turkish Text, so treat 'i' specially whenever upper-casing". But is there such a thing as a "case insensitive piece of text"?

If locale is an "inherent property", does it make sense to discard it when joining Texts together? At the moment, Text::join([$a, $b])->toUpper() can give a different result from Text::join([$a->toUpper(), $b->toUpper()]). An implementation that truly treated locale as inherent would have to track segments within a larger Text, subject to separate locales. (Similar to how HTML allows a lang attribute on individual elements.)
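
To make the join() discrepancy above concrete, here is a toy sketch. It is not the RFC's implementation: ToyText is a hypothetical stand-in, I am assuming that join() drops the per-element locales and falls back to a default of 'en', and the Turkish rule is reduced to the single 'i' to 'İ' mapping.

```php
<?php
// Hypothetical stand-in for the proposed Text class (requires PHP 8.1+).
final class ToyText
{
    public function __construct(
        public readonly string $value,
        public readonly string $locale = 'en',
    ) {}

    // Assumption: joining discards each part's locale and uses the default.
    public static function join(array $parts): self
    {
        return new self(implode('', array_map(fn (self $t) => $t->value, $parts)));
    }

    public function toUpper(): self
    {
        $s = $this->value;
        if ($this->locale === 'tr') {
            // Simplified Turkish rule: dotted 'i' upper-cases to dotted 'İ'.
            $s = str_replace('i', 'İ', $s);
        }
        // strtoupper() only has to deal with the remaining ASCII letters.
        return new self(strtoupper($s), $this->locale);
    }
}

$a = new ToyText('izmir ', 'tr');
$b = new ToyText('ireland', 'en');

$joinedThenUpper = ToyText::join([$a, $b])->toUpper()->value;
$upperThenJoined = ToyText::join([$a->toUpper(), $b->toUpper()])->value;

// The two orders of operation disagree on the first letter.
var_dump($joinedThenUpper === $upperThenJoined); // bool(false)
```

Upper-casing each part first applies the Turkish rule to $a, while upper-casing after the join never sees that locale at all.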

For comparisons, I don't see the value at all - if I'm sorting a list of Texts, the sort order is a property of the sort operation, not of the individual items. If I have a French Text, a Spanish Text, and an English Text, there's no meaningful way to use all three sort orders at once, and no particular reason to choose one over the others.

In the current proposal, using compareWith in a usort callback without specifying the collation would give unstable results, because it's not symmetrical - $a->compareWith($b) can use a different collation than $b->compareWith($a).

>> the worrying sentence "This will require extensive documentation".
>
>This phrase is meant to mean that the *format of the locale/collator
>name* needs extensive documentation.

I know, and I think that's a bad sign - why are we exposing this complexity to users in a class that otherwise holds their hand at every step of the way?

I think the parameters should always be a user-friendly collation/locale object, with the ICU strings an optional way for experts to create such an object.

Regards,

--
Rowan Tommins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php