On Thu, Jun 15, 2017 at 9:47 PM, Rhodri James <rho...@kynesim.co.uk> wrote:
>> 1) It is not secure. Check this out:
>> https://stackoverflow.com/questions/1906927/xml-vulnerabilities#1907500
>
> XML and JSON share the vulnerabilities that come from having to parse
> untrusted external input. XML then has some extra since it has extra
> flexibility, like being able to specify external resources (potential attack
> vectors) or entity substitution. If you don't need the extra flexibility,
> feel free to use JSON, but don't for one moment think that makes you
> inherently safe.

Not sure what you mean about parsing untrusted external input. Suppose
you build a web server that receives POST data formatted as either JSON
or XML. You take a puddle of bytes, and then proceed to decode them.
Let's say you also decree that it can't be more than 1MB of content
(if it's more than that, you reject it without parsing).

Okay. What vulnerabilities are there in JSON?

You could have half a million open brackets followed by half a million
close brackets, but that quickly terminates Python's parser with a
recursion trap. You can segfault Python if you set
sys.setrecursionlimit() too high, but that's your fault, not JSON's.

Within the 1MB limit, this is the most memory I can imagine using:

>>> data = b"[" + b"{}," * 349524 + b"{}]"

That expands to 349525 empty objects, represented in Python with
dictionaries, at 288 bytes apiece (using the Python 3.5 size, before
the new compact representation cut that to 240 bytes). Add in the
surrounding list, all 3012904 bytes of it, and the original 1MB input
has expanded to 103,676,104 bytes. That's a hundred-to-one expansion -
significant, but hardly the worst thing an attacker can do. In the SO
link above, a demo is given where a 200KB XML payload expands to >2GB,
for a more than 10K-to-one expansion. "Inherently safe"? At the very
least, far FAR safer.

Then there are two XML attacks involving external resource access.
JSON fundamentally cannot do that, ergo you are inherently safe
against them.

And the final attack involves recursion. JSON also fundamentally
cannot represent any form of recursion.

Winner: JSON, with 3.5 points to XML's 0.5, and that's being generous
enough to give each of them half a point for the payload expansion
attack.

Got any other attacks against JSON? Bear in mind, you have to attack
the format itself, not a buggy parser implementation (which can be
corrected in a bugfix release without hassles).

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
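
For anyone who wants to reproduce the expansion arithmetic and the recursion trap described above, here is a minimal sketch rather than a definitive benchmark. The helper names (build_payload, measure, nesting_is_rejected) are made up for illustration and are not part of any library. It assumes CPython 3.6 or later. sys.getsizeof() counts only each container's own footprint, which is adequate here because every dict is empty, and the byte counts it reports differ between Python versions, so a newer interpreter will not show the 3.5 figures quoted above.

```python
import json
import sys

LIMIT = 1024 * 1024  # the 1MB request cap assumed in the post


def build_payload(limit=LIMIT):
    """Pack as many empty objects as fit into `limit` bytes."""
    # 1 byte for "[" plus 3 bytes for the final "{}]", then 3 bytes per "{},".
    n = (limit - 4) // 3
    return b"[" + b"{}," * n + b"{}]"


def measure(payload):
    """Parse the payload; return (object count, approximate bytes used)."""
    parsed = json.loads(payload.decode("ascii"))
    # Shallow sizes only: the outer list plus each (empty) dict inside it.
    total = sys.getsizeof(parsed) + sum(sys.getsizeof(obj) for obj in parsed)
    return len(parsed), total


def nesting_is_rejected(depth=500000):
    """Return True if `depth` nested arrays trip the parser's recursion check."""
    try:
        json.loads("[" * depth + "]" * depth)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    payload = build_payload()
    count, total = measure(payload)
    print("input bytes:   ", len(payload))
    print("objects parsed:", count)
    print("approx. memory:", total)
    print("expansion:      %.1f to one" % (total / len(payload)))
    print("deep nesting rejected:", nesting_is_rejected())
```

Running it as a script prints the input size, the number of objects parsed, the approximate memory taken by the decoded structure, and whether half a million nested brackets were rejected with a RecursionError instead of being parsed.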