--- Andrei Zmievski <[EMAIL PROTECTED]> wrote:
> 
> >> 4) The string can be truncated to the user's requested character
> >> length. The string will be trimmed from the right one unicode
> >> utf-8 character (not grapheme, not byte) at a time until the length
> >> limit is met. (So a combining character is one character for this
> >> purpose.)
> >
> > Shouldn't characters/codepoints be trimmed at both ends, rather
> > than just at the right end ?
> 
> Why would you trim it from the left?

OK, my Q wasn't really clear.

Assuming the pad string is "abcdefg" and the input is "XXXX", padding
on both sides with no length limit would give something like:
abcdefgXXXXabcdefg. But if the result string is trimmed because of
length constraints, I understand Tex to be saying that the end result
could be something like: abcdefgXXXXabc. The current non-Unicode
implementation would return something like: abcdeXXXXabcde. Shouldn't
this "symmetry" be retained in the Unicode implementation too?
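
To make the two readings concrete, here is a rough sketch in Python
(used only because its strings index by code point; the function names
are made up for illustration and neither one is the actual PHP code).
The first mimics the symmetric result I described for the current
implementation; the second is my guess at what "trim the result from
the right" would mean:

```python
def pad_both_symmetric(s, length, pad):
    """Split the padding between both sides; each side restarts the pad string."""
    total = length - len(s)
    if total <= 0 or not pad:
        return s
    left = total // 2
    right = total - left

    def fill(n):
        # Repeat the pad string enough times, then cut to exactly n code points.
        return (pad * (n // len(pad) + 1))[:n]

    return fill(left) + s + fill(right)


def pad_both_trim_right(s, length, pad):
    """Assumed reading: add whole pad copies on both sides, then trim on the right."""
    total = length - len(s)
    if total <= 0 or not pad:
        return s
    per_side = -(-total // 2)            # ceiling of total / 2
    copies = -(-per_side // len(pad))    # whole pad strings needed per side
    result = pad * copies + s + pad * copies
    return result[:length]               # drop code points from the right end only


print(pad_both_symmetric("XXXX", 14, "abcdefg"))   # abcdeXXXXabcde
print(pad_both_trim_right("XXXX", 14, "abcdefg"))  # abcdefgXXXXabc
```

Note that under the second reading a small enough length would start
cutting into the input itself, which is part of why I am asking.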

Unless Tex is talking about just trimming the pad string, rather than
the result string, in which case it's all right.

Another question regarding Tex's proposal: why deal with UTF-8 code
points when trimming, when the inputs will be UTF-16?
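
A small sketch of why the unit matters, again in Python and only as an
illustration (the helper names are hypothetical): the code point count
is the same whatever the encoding, but the UTF-8 byte count and the
UTF-16 code unit count differ, and a naive cut on UTF-16 code units can
land in the middle of a surrogate pair.

```python
def utf16_units(s):
    """Number of 16-bit code units in the UTF-16 encoding of s."""
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s):
    """Number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


def ends_with_lone_high_surrogate(units):
    """True if the last UTF-16 (little-endian) code unit is a high surrogate."""
    last = int.from_bytes(units[-2:], "little")
    return 0xD800 <= last <= 0xDBFF


# 'a', COMBINING ACUTE ACCENT (U+0301), MUSICAL SYMBOL G CLEF (U+1D11E)
s = "a\u0301\U0001D11E"

code_points = len(s)                    # 3 code points
units = utf16_units(s)                  # 4 code units: the clef needs a surrogate pair
nbytes = utf8_bytes(s)                  # 7 bytes: 1 + 2 + 4

# Cutting after 3 UTF-16 code units splits the surrogate pair.
cut = s.encode("utf-16-le")[: 3 * 2]
broken = ends_with_lone_high_surrogate(cut)
```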

-- 
Rolland

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
