Hi Rowan

On 5/20/23 17:13, Rowan Tommins wrote:
> On 20 May 2023 13:53:20 BST, Niels Dossche <dossche.ni...@gmail.com> wrote:
>> RFC: https://wiki.php.net/rfc/mb_str_pad
>
> Hi Niels,
>
> This seems like a reasonable addition. My only hesitation is that it will
> share with other mbstring functions the slightly dubious definition of
> "character" as "code point", rather than "grapheme", when dealing with
> Unicode strings.
>
> This is most easily demonstrated using combining diacritics, e.g.
> "Franc\u{0327}ais" is 9 code points long, but visually identical to the 8
> code point "Fran\u{00E7}ais" used in your examples. Unicode defines
> "graphemes" or "grapheme clusters" to better match the common intuition of
> what a "character" means.
>

Thanks for your insight. This is a good point.
I've added a clarification in the RFC text to make clear the definition of character is code point in this case, consistent with mbstring.

> Perhaps we should instead, or also, add a "grapheme_strpad" function to
> ext/intl?
>
> Regards,
>

I've added this suggestion to the future scope section.

Kind regards
Niels
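
For illustration, the code point versus grapheme distinction raised in the quoted message can be sketched as follows. This is a minimal sketch, not text from the RFC: it assumes ext/mbstring and ext/intl are both loaded, and the variable names are made up for the example. It only counts the two spellings of "Français" from the quoted message with mb_strlen() and grapheme_strlen().

```php
<?php
// Sketch only: assumes ext/mbstring and ext/intl are available.
// Variable names are illustrative and do not come from the RFC.

// "c" followed by U+0327 COMBINING CEDILLA (two code points, one grapheme)
$decomposed = "Franc\u{0327}ais";
// U+00E7 LATIN SMALL LETTER C WITH CEDILLA (one code point, one grapheme)
$precomposed = "Fran\u{00E7}ais";

// mbstring counts code points, so the two spellings differ in length.
$codePointsDecomposed  = mb_strlen($decomposed, 'UTF-8');
$codePointsPrecomposed = mb_strlen($precomposed, 'UTF-8');

// intl counts grapheme clusters, so both spellings have the same length.
$graphemesDecomposed  = grapheme_strlen($decomposed);
$graphemesPrecomposed = grapheme_strlen($precomposed);

printf("code points: %d vs %d\n", $codePointsDecomposed, $codePointsPrecomposed);
printf("graphemes:   %d vs %d\n", $graphemesDecomposed, $graphemesPrecomposed);
```

Because a code-point-based pad function computes the missing width from the code point count, the decomposed spelling would receive one fewer pad character than the precomposed one for the same target length, even though the two strings look identical.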