New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>: plistlib creates a new objects when read references instead of using already read object.
As result it doesn't preserve identity: >>> import plistlib >>> a = [['spam']]*2 >>> a[0] is a[1] True >>> b = plistlib.loads(plistlib.dumps(a, fmt=plistlib.FMT_BINARY)) >>> b == a True >>> b[0] is b[1] False And plistlib.loads() is vulnerable to plists containing cyclic references (as was exposed in issue31897). For example, plistlib.loads(b'bplist00\xa1\x00\x08\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0a') could return a list containing itself, but it is failed with RecursionError. plistlib.dumps() preserves reference in the output, but it saves redundant copies. For example plistlib.dumps([[]]*5, fmt=plistlib.FMT_BINARY) saves a list containing 5 identical empty lists, but it saves an empty list 5 times, and only the last copy is used. The other 4 copies are not referenced and just spent the file volume and the space of reference numbers. Saving [[[[['spam']*100]*100]*100]*100]*100 will result in a multigigabyte, while less than a kilobyte would be enough for saving it. Loading properly saved [[[[['spam']*100]*100]*100]*100]*100 withe the current plistlib.loads() will cause consuming many gigabytes of memory. 1. The issues with plistlib.dumps() are: 1a) Inefficient saving data with references. This is minor resource usage issue. 1b) Impossibility to save a data with cyclic references. This is a lack of a feature. 2. The issues with plistlib.loads() are: 2a) Inefficient loading data with references. This can be not just a resource usage issue, but a security issue. Loading an malicious input data smaller than 100 byte ([[[...]*2]*2]*2) can cause consuming many gigabytes of memory. 2b) Impossibility to load a data with cyclic references. This is a lack of a feature, but can be lesser security issue. Small malicious input can cause RecursionError. If the recursion limit is set high and you are unlucky it can cause a stack overflow. Security issues affect you only when you load plists from untrusted sources. Adding the proper support of references could be considered a new feature, but taking to account security issues it should be backported up to 3.4 when the support of binary plists was added. ---------- assignee: serhiy.storchaka components: Library (Lib) messages: 306493 nosy: ned.deily, ronaldoussoren, serhiy.storchaka priority: normal severity: normal status: open title: Issues with binary plists type: security versions: Python 3.4, Python 3.5, Python 3.6, Python 3.7 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32072> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com