Thanks Nikita,

Thank you for explanation no I get it.
I only don't know what means serialization "O" and "C" but don't bother.
I'll try to google it.

Cheers
--
Michał

21.04.2017 15:51 "Nikita Popov" <nikita....@gmail.com> napisał(a):

On Fri, Apr 21, 2017 at 2:47 PM, Michał Brzuchalski <
michal.brzuchal...@gmail.com> wrote:

> I know my voice is doesn't mean anything but IMHO interface with magic
> methods could bring more inconsistency.
>
> I know PHP is consistently inconsistent but I would prefer if it is
> posdible to fix an issue with present method naming.
>
> Cheers
>
Magic methods have a distinct backwards compatibility advantage. They allow
you to add __serialize/__unserialize to an existing class that currently
uses Serializable. Older PHP versions will then use the Serializable
implementation, while never versions will use __serialize/__unserialize. An
interface makes this a lot more complicated, because you either have to
bump your PHP version requirement (unlikely), or you have to provide a shim
interface for older PHP versions. This shim interface would then be part of
any library currently implementing Serializable, which seems sub-optimal to
me. That's why I think magic methods are better for this case (though I
don't strongly care).

Also, to answer an OTR question: We cannot simply reuse the Serializable
interface by allowing an array return value from
Serializable::unserialize(). The array return value is only a means to an
end: the important part is that the new serialization mechanism does not
share serialization state -- using arrays instead of strings is just a
convenient way to achieve this. However, Serializable::unserialize()
currently shares the state and we cannot change this without breaking BC --
so we cannot reuse this interface.

Nikita



> 21.04.2017 11:39 "Nikita Popov" <nikita....@gmail.com> napisał(a):
>
>> Hi internals,
>>
>> As you are surely aware, serialization in PHP is a big mess. Said mess is
>> caused by some fundamental issues in the serialization format, and
>> exacerbated by the existence of the Serializable interface. Fixing the
>> serialization format is likely not possible at this point, but we can
>> replace Serializable with a better alternative and I'd like to start a
>> discussion on that.
>>
>> The problem is essentially that Serializable::serialize() is expected to
>> return a string, which is generally obtained by recursively calling
>> serialize() in the Serializable::serialize() implementation. This
>> serialize() call shares state information with the outer serialize(), to
>> ensure that two references to the same object (or the same reference) will
>> continue referring to a single object/reference after serialization.
>>
>> This causes two big issues:
>>
>> First, the implementation is highly order-dependent. If
>> Serializable::serialize() contains multiple calls to serialize(), then
>> calls to unserialize() have to be repeated **in the same order** in
>> Serializable::unserialize(), otherwise unserialization may fail or be
>> corrupted. In particular this means that using parent::serialize() and
>> parent::unserialize() is unsafe. (See also
>> https://bugs.php.net/bug.php?id=66052 and linked bugs.)
>>
>> Second, the existence of Serializable introduces security issues that we
>> cannot fix. Allowing the execution of PHP code during unserialization is
>> unsafe, and even innocuous looking code is easily exploited. We have
>> recently mitigated __wakeup() based attacks by delaying __wakeup() calls
>> until the end of the unserialization. We cannot do the same for
>> Serializable::unserialize() calls, as their design strictly requires the
>> unserialization context to still be active during the call. Similarly,
>> Serializable prevents an up-front validation pass of the serialized
>> string,
>> as the format used for Serializable objects is user-defined.
>>
>> The delayed __wakeup() mitigation mentioned in the previous point also
>> interacts badly with Serializable, because we have to delay __wakeup()
>> calls to the end of the unserialization, which in particular also implies
>> that Serializable::unserialize() sees objects prior to wakeup. (See also
>> https://bugs.php.net/bug.php?id=74436.)
>>
>> In the end, everything comes down to the fact that Serializable requires
>> nested serialization calls with context sharing.
>>
>> The alternative mechanism (__sleep + __wakeup) does not have these issues
>> (anymore), but it is not sufficiently flexible for general use: Notably,
>> __sleep() allows you to limit which properties are serialized, but the
>> properties still have to actually exist on the object.
>>
>> I'd like to propose the addition of a new mechanism which essentially
>> works
>> the same way as Serializable, but uses arrays instead of strings and does
>> not share context. I'm not sure about the naming (RealSerializable,
>> anyone?), so I'll just go with magic methods __serialize() and
>> __unserialize() for now:
>>
>>     public function __serialize() : array;
>>     public function __unserialize(array $data) : void;
>>
>> From a userland perspective the implementation should be the same as for
>> Serializable methods, but with interior serialize()/unserialize() calls
>> stripped out. Right now Serializable implementations already usually work
>> by doing something like "return serialize([ ... ])", this would change it
>> to just "return [ ... ]" and move the serialize()/unserialize() call into
>> the engine, where we can perform it safely and robustly.
>>
>> The new methods should reuse the "O" serialization format, rather than
>> introducing a new one. This allows a measure of interoperability with
>> previous PHP versions, which can still decode serialized strings from
>> newer
>> versions using __wakeup().
>>
>> If an object has both __wakeup() and __unserialize(), then __unserialize()
>> should be called. If an object implements both Serializable::unserialize()
>> and __unserialize(), then we should invoke one or the other based on
>> whether "C" or "O" serialization is used.
>>
>> Thoughts?
>>
>> Nikita
>>
>

Reply via email to