--- Andrei Zmievski <[EMAIL PROTECTED]> wrote:
>
> >> 4) The string can be truncated to the user's requested character
> >> length. The string will be trimmed from the right one unicode
> >> utf-8 character (not grapheme, not byte) at a time until the length
> >> limit is met. (So a combining character is one character for this
> >> purpose.)
> >
> > Shouldn't characters/codepoints be trimmed at both ends, rather
> > than just at the right end ?
>
> Why would you trim it from the left?

OK, my question wasn't really clear. Assuming pad string == "abcdefg",
the untrimmed result of padding "XXXX" on both sides would be something
like:

    abcdefgXXXXabcdefg

But if the result string is being trimmed because of length
constraints, I understand Tex to be saying that the end result could be
something like:

    abcdefgXXXXabc

The current non-Unicode implementation would return something like:

    abcdeXXXXabcde

Shouldn't this "symmetry" be retained in the Unicode implementation
too? Unless Tex is talking about trimming just the pad string, rather
than the result string, in which case it's all right.

Another question regarding Tex's proposal: why deal with UTF-8 code
points when trimming, when the inputs will be UTF-16?

-- Rolland

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
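
P.S. To make the two outcomes above concrete, here is a minimal sketch
in Python (chosen only because its strings index by code point, which
matches the "one character = one code point" rule in the proposal). The
function names are hypothetical, and the second function is only my
reading of Tex's proposal, not a confirmed description of it.

```python
def pad_both_symmetric(s, length, pad):
    """Split the padding between both sides; cut each side's pad separately.

    Assumption: this mirrors what the current non-Unicode code appears to do.
    """
    total = length - len(s)
    if total <= 0 or not pad:
        return s
    left = total // 2
    right = total - left

    def fill(n):
        # Repeat the pad string, then cut it to exactly n code points.
        return (pad * (n // len(pad) + 1))[:n]

    return fill(left) + s + fill(right)


def pad_both_trim_right(s, length, pad):
    """Add whole copies of the pad on both sides, then trim the result
    from the right, one code point at a time, down to `length`.

    Assumption: this is one interpretation of the proposal, nothing more.
    """
    total = length - len(s)
    if total <= 0 or not pad:
        return s
    # Smallest number of whole pad copies per side that covers `total`.
    copies = -(-total // (2 * len(pad)))
    result = pad * copies + s + pad * copies
    # Slicing drops code points (not bytes, not graphemes) from the right.
    return result[:length]


print(pad_both_symmetric("XXXX", 14, "abcdefg"))   # abcdeXXXXabcde
print(pad_both_trim_right("XXXX", 14, "abcdefg"))  # abcdefgXXXXabc
```

Both results are 14 code points long; only the second loses the
left/right symmetry, which is the point of my question.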