anon <[EMAIL PROTECTED]> writes:
> So I've encountered some strange behavior that I'm hoping someone can
> fill me in on. I've written a simple handler that works with one small
> exception: when the parser encounters a line with '&' in it, it
> only returns the portion that follows the occurrence.
>
> For example, parsing a file with the line:
> <key>mykey</key><value>some%20&%20value</value>
>
> results in getting "%20value" back from the characters method, rather
> than "some%20&%20value".
>
> After looking into this a bit, I found that SAX supports entities and
> that it is probably treating the & as an entity and processing
> it in some way that I'm unaware of. I'm using the default
> EntityResolver.

Are you sure you're not actually getting three chunks: "some%20", "&", and "%20value"? The xml.sax.handler.ContentHandler.characters method (which I presume you're using for SAX, though you don't say) is not guaranteed to deliver all contiguous character data in one call, so a handler that keeps only the most recent call would end up with just "%20value". Also check whether your handler's skippedEntity() method is firing.

--
David M. Cooke
cookedm(at)physics(dot)mcmaster(dot)ca
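
To illustrate the point about chunked character data, here is a minimal sketch of a handler that buffers every characters() call and joins the pieces when the element closes. It rests on assumptions not stated in the thread: that the file really contains the escaped form `&amp;` (a bare `&` would not be well-formed XML), and that the key/value pairs sit inside some root element. The names `KeyValueHandler`, `parse_values`, and `SAMPLE`, and the `<entry>` wrapper, are hypothetical and exist only for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical sample document; the <entry> root and the &amp; escape
# are assumptions, not taken from the original file.
SAMPLE = b"<entry><key>mykey</key><value>some%20&amp;%20value</value></entry>"


class KeyValueHandler(ContentHandler):
    """Collects the complete text of each <value> element."""

    def __init__(self):
        super().__init__()
        self.values = []    # joined text of each finished <value>
        self.chunks = []    # every raw characters() call, for inspection
        self.skipped = []   # names passed to skippedEntity(), if any
        self._buffer = None  # None while outside a <value> element

    def startElement(self, name, attrs):
        if name == "value":
            self._buffer = []

    def characters(self, content):
        # The parser may split one run of text across several calls,
        # so append each piece instead of overwriting the previous one.
        self.chunks.append(content)
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "value" and self._buffer is not None:
            self.values.append("".join(self._buffer))
            self._buffer = None

    def skippedEntity(self, name):
        self.skipped.append(name)


def parse_values(xml_bytes):
    """Parse xml_bytes and return the populated handler."""
    handler = KeyValueHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler


if __name__ == "__main__":
    result = parse_values(SAMPLE)
    print("raw chunks:", result.chunks)
    print("joined values:", result.values)
    print("skipped entities:", result.skipped)
```

Printing `chunks` shows how the parser actually split the text on a given run, while `values` holds the reassembled string regardless of how many calls it took. How the text is split is up to the parser, so the handler should not depend on any particular number of chunks.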