Yes, serialization is a problem. I would actually advocate putting a marker in the serialized file that records the value of the unicode_semantics switch at serialization time; if the value is different during deserialization, refuse to load the file or start a new session. One really should not be changing that switch on a whim between sessions.
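
A minimal sketch of that marker idea, written in userland PHP purely for illustration. The function names, the marker format, and passing the switch as a plain boolean argument are all assumptions made for this example; a real implementation would live in the engine and read the actual setting.

```php
<?php
// Hypothetical helpers; none of these names exist in PHP itself.
const MARKER_PREFIX = 'unicode_semantics=';

// Prepend a one-line marker recording the switch value (assumed format).
function serialize_with_marker($value, bool $unicodeSemantics): string
{
    return MARKER_PREFIX . ($unicodeSemantics ? '1' : '0') . "\n" . serialize($value);
}

// Refuse to load data written under a different switch value.
function unserialize_with_marker(string $data, bool $unicodeSemantics)
{
    $prefixLen = strlen(MARKER_PREFIX);
    $pos = strpos($data, "\n");
    if ($pos === false || strncmp($data, MARKER_PREFIX, $prefixLen) !== 0) {
        throw new UnexpectedValueException('missing unicode_semantics marker');
    }

    // Compare the stored switch value with the current one.
    $stored = substr($data, $prefixLen, $pos - $prefixLen);
    if ($stored !== ($unicodeSemantics ? '1' : '0')) {
        throw new RuntimeException('unicode_semantics changed since serialization');
    }

    // Everything after the first newline is the ordinary serialize() payload.
    return unserialize(substr($data, $pos + 1));
}
```

A caller would catch the exception and either reject the file or start a fresh session, rather than trying to reinterpret the payload.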

-Andrei


On Sep 9, 2005, at 2:49 AM, Antony Dovgal wrote:

Hello all.

I'm currently working on unicode support in serialize()/unserialize() and am stuck on some issues.
Here they are:

1) What should be done when unserializing serialized unicode strings while unicode_semantics is Off?
I presume it's safe to create and return IS_UNICODE in this case?

2) Class names are serialized without a U: or s: prefix, but I can detect a unicode string by its leading "\". It looks kind of tricky, but on the other hand a backslash can't appear there if it's not unicode. Or should I change it to use U:/s: prefixes? (I haven't tried it yet, so I can't say how difficult it would be.)

The other problem here is that we can't use unicode class names when unicode_semantics is Off, because in this case class_table stores them as IS_STRING and we won't be able to find a class entry by its unicode name (thanks to Val for noticing this).

3) Currently serialize() produces valid \u0000 sequences, which can be parsed and restored perfectly well when they are read from a file or returned from serialize(). But specifying them in a constant string won't work, because these sequences get parsed at compile time.

Short example:
<?php
var_dump(unserialize('U:2:"\u0061\u0061";')); // won't work
var_dump(unserialize(serialize("aa"))); // works
var_dump('U:2:"\u0061\u0061";'); // produces unicode(9) "U:2:"aa";"
?>
IMO the best approach here is to change serialize() to emit a different escape (for example \pu0000 instead of \u0000); in that case it works just fine.

Comments?

--
Wbr, Antony Dovgal

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

