Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-24 Thread Sara Golemon
On Thu, Jun 22, 2006 at 09:15:23PM -0700, Sara Golemon wrote: utf16 of php's internal encoding. Big or Little Endian? Yes. By that I of course mean that the endianness of U16 data points used internally by PHP is dependent on the architecture's endianness. For example: if it's x86, then

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Daniel Convissor
On Thu, Jun 22, 2006 at 09:15:23PM -0700, Sara Golemon wrote: > utf16 of php's internal encoding. Big or Little Endian? Thanks, --Dan -- T H E A N A L Y S I S A N D S O L U T I O N S C O M P A N Y data intensive web and database programming http://www.Analy

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Andrei Zmievski
Especially since the UTF-16 internal representation may be little- or big-endian, depending on the platform. -Andrei On Jun 23, 2006, at 11:31 AM, Andi Gutmans wrote: Nah I didn't mean to get back to that discussion. I was thinking more of a binary dump of info (e.g. session-like stuff) or s
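The platform dependence mentioned above can be illustrated with a small sketch. It is not taken from the thread: it uses only core PHP functions (pack() and bin2hex()) to show how a single 16-bit code unit, U+20AC, is laid out in big-endian, little-endian, and machine byte order. The variable names are made up for illustration.

```php
<?php
// U+20AC (EURO SIGN) fits in a single UTF-16 code unit: 0x20AC.
$codeUnit = 0x20AC;

$bigEndian    = pack('n', $codeUnit); // unsigned 16-bit, big-endian
$littleEndian = pack('v', $codeUnit); // unsigned 16-bit, little-endian
$native       = pack('S', $codeUnit); // unsigned 16-bit, machine byte order

echo bin2hex($bigEndian), "\n";    // 20ac
echo bin2hex($littleEndian), "\n"; // ac20

// The native layout matches one of the two, depending on the host CPU.
$hostIsLittleEndian = ($native === $littleEndian);
echo $hostIsLittleEndian ? "little-endian host\n" : "big-endian host\n";
```

A raw dump of such an internal buffer would therefore not be portable between machines unless the byte order is recorded or fixed explicitly.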

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Andi Gutmans
om: Sara Golemon [mailto:[EMAIL PROTECTED] > Sent: Friday, June 23, 2006 1:16 AM > To: Andi Gutmans; 'Andrei Zmievski' > Cc: '"Ron Korving"'; internals@lists.php.net > Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics > > >> The on

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Sara Golemon
The only way they can get at the internal UTF-16 representation is via unicode_encode($uni, 'UTF-16') which will return a binary UTF-16 string. In that case, strlen() will work just as well. Hmm, I was thinking we might have some binary write function which would do that automagically. I think

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Andrei Zmievski
'd be worth it. -Original Message- From: Andrei Zmievski [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 11:38 PM To: Andi Gutmans Cc: 'Sara Golemon'; '"Ron Korving"'; internals@lists.php.net Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics The

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Andrei Zmievski
's just about storage? Andi -Original Message- From: Sara Golemon [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 9:15 PM To: "Ron Korving" Cc: internals@lists.php.net Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics Still, it's gotta be useful to be

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Ron Korving
>> it looks in memory, and put it back... Why convert to UTF-8 or any other >> encoding if it's just about storage? >> >> Andi >> >>> -Original Message- >>> From: Sara Golemon [mailto:[EMAIL PROTECTED] >>> Sent: Thursday, June

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-23 Thread Johannes Schlueter
Hi, in my opinion that name is bad since most of the time the string won't be stored using the internal encoding but stored using some implicitly converted encoding like the encoding of the stream being used or the one from the database. So the size needed to store the string would most likely be

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andi Gutmans
'Sara Golemon'; '"Ron Korving"'; internals@lists.php.net > Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics > > The only way they can get at the internal UTF-16 > representation is via unicode_encode($uni, 'UTF-16') which > will retu

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andrei Zmievski
ra Golemon [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 9:15 PM To: "Ron Korving" Cc: internals@lists.php.net Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics Still, it's gotta be useful to know how many bytes it occupies. Perhaps for Content-length headers

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andi Gutmans
--Original Message- > From: Sara Golemon [mailto:[EMAIL PROTECTED] > Sent: Thursday, June 22, 2006 9:15 PM > To: "Ron Korving" > Cc: internals@lists.php.net > Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics > > > Still, it's gotta be useful to be

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andi Gutmans
Fine with me. > -Original Message- > From: Andrei Zmievski [mailto:[EMAIL PROTECTED] > Sent: Thursday, June 22, 2006 11:08 PM > To: Andi Gutmans > Cc: 'Johannes Schlueter'; internals@lists.php.net; 'Ron Korving' > Subject: Re: [PHP-DEV] Re: str

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andrei Zmievski
How about str_storage_size()? It is explicit enough that people will be wary of using it. -Andrei On Jun 22, 2006, at 10:56 PM, Andi Gutmans wrote: Oops, senile me :) How about str_size()?

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andi Gutmans
rating on > > Unicode...? > > > > Andi > > > > > -Original Message- > > > From: Andrei Zmievski [mailto:[EMAIL PROTECTED] > > > Sent: Thursday, June 22, 2006 3:14 PM > > > To: Ron Korving > > > Cc: internals@lists.php.n

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Johannes Schlueter
ki [mailto:[EMAIL PROTECTED] > > Sent: Thursday, June 22, 2006 3:14 PM > > To: Ron Korving > > Cc: internals@lists.php.net > > Subject: Re: [PHP-DEV] Re: strlen() under unicode.semantics > > > > It'll be there. strlen_bytes() perhaps? > > > > -

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andi Gutmans
Maybe sizeof() should not be an alias for strlen() when operating on Unicode...? Andi > -Original Message- > From: Andrei Zmievski [mailto:[EMAIL PROTECTED] > Sent: Thursday, June 22, 2006 3:14 PM > To: Ron Korving > Cc: internals@lists.php.net > Subject: Re: [P

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Sara Golemon
Still, it's gotta be useful to know how many bytes it occupies. Perhaps for Content-length headers or something. There are plenty of low level concepts to think of where one might need this. And even if you can't think of any reason now, you don't wanna get hit in the face by it and have to i

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Andrei Zmievski
It'll be there. strlen_bytes() perhaps? -Andrei On Jun 22, 2006, at 2:55 PM, Ron Korving wrote: Still, it's gotta be useful to know how many bytes it occupies. Perhaps for Content-length headers or something. There are plenty of low level concepts to think of where one might need this. And

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-22 Thread Ron Korving
Still, it's gotta be useful to know how many bytes it occupies. Perhaps for Content-length headers or something. There are plenty of low level concepts to think of where one might need this. And even if you can't think of any reason now, you don't wanna get hit in the face by it and have to impl
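The Content-Length use case raised above might look like the following sketch in current PHP, where strings are byte strings and strlen() counts bytes. This is an illustration only: content_length_header() is a hypothetical helper, not an existing PHP function, and the body is assumed to be already encoded as UTF-8.

```php
<?php
// Hypothetical helper: build a Content-Length header line for a body
// that is already encoded as bytes (UTF-8 here). The header must carry
// the byte count of the encoded body, not its character count.
function content_length_header(string $encodedBody): string
{
    return 'Content-Length: ' . strlen($encodedBody);
}

$body = "Gr\u{00FC}\u{00DF}e"; // "Grüße": 5 characters, 7 bytes in UTF-8

echo content_length_header($body), "\n"; // Content-Length: 7
```

The point of the sketch is that the value depends on the encoding actually sent over the wire, which is why a character count cannot stand in for it.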

Re: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-21 Thread Sara Golemon
What happens with

    $fp = fopen('foo.bin', 'wb');
    $written = fwrite($fp, $str);
    if (strlen($str) != $written) {
        echo 'Not written', "\n";
    }

Assuming $str is a binary string. The above code works just fine. If it's a unicode string: Short version: Don't do that. Writing a unicode string to a

RE: [PHP-DEV] Re: strlen() under unicode.semantics

2006-06-21 Thread Jared Williams
> > Enjoyed Andrei's talk at the NYPHP Conference last week about unicode in PHP 6. He mentioned that when unicode.semantics is on, strlen() will return the number of characters rather than the number of bytes, like mb_strlen() does or strlen() if mbstring.func_overload is
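The difference between counting characters and counting bytes can be sketched in current PHP, where strlen() always counts bytes. This is not code from the thread: count_code_points() is a hypothetical helper that relies only on core PCRE with the /u modifier, and the input is assumed to be valid UTF-8. With the mbstring extension loaded, mb_strlen($str, 'UTF-8') is the usual way to get the same count.

```php
<?php
// Hypothetical helper: count Unicode code points in a UTF-8 byte string.
// The /u modifier makes PCRE treat the subject as UTF-8, and /s lets "."
// match newlines too, so every code point produces exactly one match.
function count_code_points(string $utf8): int
{
    $count = preg_match_all('/./su', $utf8);
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $count;
}

$str = "na\u{00EF}ve \u{20AC}5"; // "naïve €5"

echo count_code_points($str), "\n"; // 8  (code points)
echo strlen($str), "\n";            // 11 (bytes in UTF-8)
```

Note that this counts code points, which is only one possible meaning of "characters"; combining sequences would need grapheme-aware counting.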