On Dec 11, 4:39 pm, Rami Chowdhury <rami.chowdh...@gmail.com> wrote:
> On Fri, Dec 11, 2009 at 13:23, nnguyen <nguy...@gmail.com> wrote:
>
> > Any ideas on any expat tricks I'm missing out on? I'm also inclined to
> > try another parser that can keep the string together when there are
> > entities, or at least ampersands.
>
> IIRC expat explicitly does not guarantee that character data will be
> handed to the CharacterDataHandler in complete blocks. If you're
> certain you want to stay at such a low level, I would just modify your
> char_data method to append character data to self.current_data rather
> than replacing it. Personally, if I had the option (e.g. Python 2.5+)
> I'd use ElementTree...
>
Well, the appending trick worked. From some logging I figured out that the parser was handing over those bits of character data one piece at a time before it got to the subfield end element (which is kind of obvious when you think about it). So I just used a += in char_data and made sure to clear out current_data when it hits a subfield end element. Thanks!
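
For anyone finding this thread later, here is a rough sketch of the append-and-clear approach. It is not my actual code: the class name SubfieldHandler, the helper parse_subfields, and the sample XML are made up for illustration, and it is written for Python 3. Only char_data, current_data, and the "subfield" element come from the discussion above.

```python
import xml.parsers.expat


class SubfieldHandler:
    """Hypothetical handler that collects the full text of each <subfield>."""

    def __init__(self):
        self.current_data = ""
        self.subfields = []

    def start_element(self, name, attrs):
        # Start each subfield with an empty buffer so stray whitespace
        # between elements does not leak into its text.
        if name == "subfield":
            self.current_data = ""

    def char_data(self, data):
        # expat may deliver one text node in several pieces (for example
        # around an entity such as &amp;), so append instead of assigning.
        self.current_data += data

    def end_element(self, name):
        # The text is only complete once the end element arrives.
        if name == "subfield":
            self.subfields.append(self.current_data)
            self.current_data = ""


def parse_subfields(xml_text):
    """Parse xml_text and return the text of every <subfield> element."""
    handler = SubfieldHandler()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = handler.start_element
    parser.EndElementHandler = handler.end_element
    parser.CharacterDataHandler = handler.char_data
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return handler.subfields


if __name__ == "__main__":
    sample = "<record><subfield>Smith &amp; Sons</subfield></record>"
    print(parse_subfields(sample))
```

The point is that the value is only read in end_element, so it does not matter how many times char_data gets called for one subfield.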