RE: [PHP-DEV] Re: strlen() under unicode.semantics

Andi Gutmans Thu, 22 Jun 2006 23:45:16 -0700

Hmm, I was thinking we might have some binary write function which would do
that automagically.  I think it'd be worth it.


> -----Original Message-----
> From: Andrei Zmievski [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, June 22, 2006 11:38 PM
> To: Andi Gutmans
> Cc: 'Sara Golemon'; '"Ron Korving"'; internals@lists.php.net
> Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics
> 
> The only way they can get at the internal UTF-16 
> representation is via unicode_encode($uni, 'UTF-16') which 
> will return a binary UTF-16 string. In that case, strlen() 
> will work just as well.
> 
> -Andrei
> 
> 
> On Jun 22, 2006, at 11:30 PM, Andi Gutmans wrote:
> 
> > I don't quite agree. I think there's a good chance people will want to
> > save Unicode strings in a binary format for performance reasons. Save
> > it the way it looks in memory, and put it back... Why convert to UTF-8
> > or any other encoding if it's just about storage?
> >
> > Andi
> >
> >> -----Original Message-----
> >> From: Sara Golemon [mailto:[EMAIL PROTECTED]
> >> Sent: Thursday, June 22, 2006 9:15 PM
> >> To: "Ron Korving"
> >> Cc: internals@lists.php.net
> >> Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics
> >>
> >>> Still, it's gotta be useful to know how many bytes it occupies.
> >>> Perhaps for Content-length headers or something. There are plenty of
> >>> low-level concepts to think of where one might need this. And even if
> >>> you can't think of any reason now, you don't wanna get hit in the face
> >>> by it and have to implement such a function for PHP 6.0.1.
> >>>
> >> For this type of usage, I'd think it'd be relevant to know how many
> >> bytes the string will occupy in a given output encoding more so than
> >> what it happens to occupy in the underlying implementation.  In the
> >> example you cited, string contents will more typically be sent as
> >> UTF-8 rather than the UTF-16 of PHP's internal encoding.
> >>
> >> $utf8str = unicode_encode($unistr, 'utf8');
> >>
> >> header('Content-type: text/html; charset=utf-8');
> >> header('Content-length: ' . strlen($utf8str));
> >> echo $utf8str;
> >>
> >> I'm not saying it's impossible that a legitimate need will come up to
> >> know the internal byte usage of a Unicode string, and there's certainly
> >> no harm in adding such a function (apart from the tired shot-foot
> >> argument).  I just doubt you (or anyone) will come up with such a case
> >> anytime soon.
> >>
> >> -Sara
> >>
> >> --
> >> PHP Internals - PHP Runtime Development Mailing List
> >> To unsubscribe, visit: http://www.php.net/unsub.php
> >>
> >
> 
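
The byte-count question running through this thread can be tried out in userland. The following is only a rough sketch under stated assumptions: unicode_encode() is the function proposed in the messages above, so the mbstring function mb_convert_encoding() stands in for it here, and the sample string and variable names are made up for illustration. It shows that strlen() on an encoded binary string reports the byte count for that particular encoding, and that the UTF-16 bytes can be stored as-is and decoded again later.

```php
<?php
// Assumption: the mbstring extension is loaded. mb_convert_encoding() is a
// stand-in for the proposed unicode_encode(); it is not the same function.

// Sample text written with \u{...} escapes (PHP 7+), which yield UTF-8 bytes.
$text = "na\u{EF}ve caf\u{E9}";

// Encode the same text into two output encodings.
$utf8  = $text;
$utf16 = mb_convert_encoding($text, 'UTF-16LE', 'UTF-8');

// Character count versus byte counts per encoding.
$chars      = mb_strlen($text, 'UTF-8');
$utf8Bytes  = strlen($utf8);
$utf16Bytes = strlen($utf16);

// Round trip: keep the UTF-16 bytes as a binary blob, decode when reading back.
$restored = mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE');

printf(
    "%d chars, %d bytes as UTF-8, %d bytes as UTF-16, round trip %s\n",
    $chars,
    $utf8Bytes,
    $utf16Bytes,
    $restored === $text ? 'ok' : 'failed'
);
```

For a Content-length header, the relevant number is the byte count of whichever encoding is actually sent, which is the point made in the quoted UTF-8 example.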

