On 30 July 2015 21:35:01 BST, Rob Richards <rricha...@cdatazone.org> wrote:
>On 7/30/15 10:30 AM, Rowan Collins wrote:
>> Rob Richards wrote on 30/07/2015 14:12:
>>> If you are already working with a trusted document then you should
>>> safely be able to disable the entity loader. If you aren't then
>>> wouldn't you want to do some sort of checking (especially if you dont
>>> have an XML gateway fronting the system) for other malicious things
>>> before even opening the document regardless if it has external
>>> entities or not.
>>
>> Can you give any pointers to what kind of checking this would be, and
>> how it would be carried out without parsing the XML document in the
>> first place?
>>
>> According to the bug report, one of the affected uses is the
>> SoapClient, which by definition is dealing with remote data. I can see
>> how that could be considered "untrusted", but I can't think of any
>> particular action that would make it more trusted (quite apart from
>> the lack of an obvious point to intercept the data before it is parsed).
>>
>> Would it not make more sense for the parser to operate in an
>> "untrusted" mode - disabling external entities, maybe different limits
>> on stack depth, etc?
>>
>> Regards,
>
>All depends upon what you are trying to accomplish as this covers tree,
>streaming, different types of schemas, xsl, etc...
>For example, you can easily check if there is a DTD, imports/includes,
>specific xslt functionality, list goes on and on without ever having to
>load the document. There really is no one size fit all imo so what one
>considers untrusted someone else would consider trusted.
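
For illustration only, the "check if there is a DTD" step from the quoted
message could be sketched roughly as below. This is an assumption-laden
example, not anything proposed in the thread: it is written in Python and
uses the standard library's expat bindings as a stand-in for libxml, and
the names has_doctype and _DoctypeFound are hypothetical. The check is
itself a streaming parse of the document prolog; it simply stops before
any tree is built.

```python
import xml.parsers.expat


class _DoctypeFound(Exception):
    """Raised from the handler to abort parsing at the DOCTYPE (hypothetical helper)."""


def has_doctype(xml_bytes: bytes) -> bool:
    """Return True if the document contains a DOCTYPE declaration.

    No tree is built; parsing is aborted as soon as a DOCTYPE is seen,
    so entity references in the body are never reached in that case.
    Malformed input raises xml.parsers.expat.ExpatError.
    """

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Exceptions raised in expat handlers propagate out of Parse().
        raise _DoctypeFound

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(xml_bytes, True)
    except _DoctypeFound:
        return True
    return False


if __name__ == "__main__":
    suspicious = (
        b'<!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]>'
        b"<foo>&x;</foo>"
    )
    print(has_doctype(suspicious))
    print(has_doctype(b"<foo>bar</foo>"))
```

A caller would reject or quarantine any payload for which has_doctype
returns True before handing it to the real parser.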

So effectively we should all write partial XML parsers to determine the
contents of the file, in order to decide if it's the data we expected?

Would it not make more sense to leave that to the XML library, with a
whitelist of features we actually need, URLs we trust for includes, etc?

I never want an XML file to execute system commands on my behalf; do I
have to write a regex to make sure they don't?

Regards,
-- 
Rowan Collins
[IMSoP]

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php