On 2015-06-10 13:08, Marko Rauhamaa wrote:
> Robert Kern <robert.k...@gmail.com>:
>
>> By the very nature of the stated problem: serializing all language
>> objects. Being able to construct any object, including instances of
>> arbitrary classes, means that arbitrary code can be executed. All I
>> have to do is make a pickle file for an object that claims that its
>> constructor is shutil.rmtree().
>
> You can't serialize/migrate arbitrary objects. Consider open TCP
> connections, open files and other objects that extend outside the Python
> VM.

Yes, yes, but that's really beside the point. Yes, there are some objects that it doesn't even make sense to serialize. But my point is that even within this slightly smaller set of objects that *can* be serialized (and that pickle currently does serialize), being able to serialize all of them entails arbitrary code execution when deserializing them. To allow people to write their own types that can be serialized, you have to let them specify arbitrary callables that will do the reconstruction. If you whitelist the possible reconstruction callables, you greatly restrict the types that can participate in the serialization system.
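
To make the point above concrete, here is a minimal Python 3 sketch. The names Innocent, AllowlistUnpickler and restricted_loads are made up for illustration, and eval("6 * 7") is a harmless stand-in for a destructive callable such as shutil.rmtree(). The allowlist follows the find_class() override pattern from the pickle documentation; it is one possible way to restrict reconstruction callables, not a complete defense.

```python
import io
import pickle


class Innocent:
    """Hypothetical class whose pickle names eval() as its reconstructor."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) on load.
        # eval("6 * 7") is a benign stand-in for a destructive call.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())

# Plain loads() calls eval() and hands back whatever it returns.
result = pickle.loads(payload)


class AllowlistUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that only resolves globals listed in ALLOWED."""

    ALLOWED = {("builtins", "range")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"{module}.{name} is not allowed")


def restricted_loads(data):
    """Like pickle.loads(), but goes through the allowlist above."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# A type on the allowlist still round-trips.
safe = restricted_loads(pickle.dumps(range(3)))
```

Note the trade-off: once the allowlist is in place, any user-defined type whose reconstructor is not listed can no longer be loaded, which is exactly the restriction described above.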

> Also objects hold references to each other, leading to a huge
> reference mesh.
>
> For example:
>
>     a.buddy = b
>     b.buddy = a
>     with open("a", "wb") as f: f.write(serialize(a))
>     with open("b", "wb") as f: f.write(serialize(b))
>
>     with open("a", "rb") as f: aa = deserialize(f.read())
>     with open("b", "rb") as f: bb = deserialize(f.read())
>     assert aa.buddy is bb

Yeah, no one expects that to work. For example, if I deserialize the same string twice, I can't expect to get the identical object back each time (as in, "deserialize(pickle) is deserialize(pickle)"). However, pickle does correctly handle fairly arbitrary reference graphs within the context of a single serialization, which is the most that can be asked of a serialization system. That isn't really a concern here.

>>> class A(object):
...     pass
...
>>> a = A()
>>> b = A()
>>> a.buddy = b
>>> b.buddy = a
>>> data = [a, b]
>>> data[0].buddy is data[1]
True
>>> data[1].buddy is data[0]
True
>>> import cPickle
>>> unpickled = cPickle.loads(cPickle.dumps(data))
>>> unpickled[0].buddy is unpickled[1]
True
>>> unpickled[1].buddy is unpickled[0]
True
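
For readers on Python 3, where cPickle has been folded into pickle, here is a rough sketch of both cases side by side: one dumps() call covering both objects, versus the two separate serializations from the quoted example. types.SimpleNamespace is used here only as a stand-in for the class A above, so that the snippet does not depend on where the class is defined.

```python
import pickle
from types import SimpleNamespace

# Two objects that refer to each other, as in the quoted example.
a = SimpleNamespace(name="a")
b = SimpleNamespace(name="b")
a.buddy = b
b.buddy = a

# Case 1: both objects go through a single dumps() call.
# pickle's memo keeps the shared references intact.
aa, bb = pickle.loads(pickle.dumps([a, b]))
same_call_ok = aa.buddy is bb and bb.buddy is aa

# Case 2: each object is serialized on its own.
# Each pickle carries a private copy of the whole a <-> b cycle.
aa2 = pickle.loads(pickle.dumps(a))
bb2 = pickle.loads(pickle.dumps(b))
separate_ok = aa2.buddy is bb2

# Within one pickle the cycle is still consistent.
internal_ok = aa2.buddy.buddy is aa2

# Loading the same bytes twice yields two distinct objects.
blob = pickle.dumps(a)
twice_same = pickle.loads(blob) is pickle.loads(blob)

print(same_call_ok, separate_ok, internal_ok, twice_same)
```

Identity is preserved only among objects that travel in the same pickle, which is why case 2 and the load-twice check come out False while the others come out True.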

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
https://mail.python.org/mailman/listinfo/python-list
