I have this code:

    import xml.parsers.expat

    def start_element(name, attrs):
        print 'Start element:', name, attrs

    def end_element(name):
        print 'End element:', name

    def char_data(data):
        print 'Character data:', repr(data)

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    fh = open("/home/sbassi/bioinfo/smallUniprot.xml", "r")
    p.ParseFile(fh)

And I get this output:

    ...
    Start element: sequence {u'checksum': u'E0C0CC2E1F189B8A', u'length': u'393'}
    Character data: u'\n'
    Character data: u'MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKLEELELDEQQRKRL'
    Character data: u'\n'
    Character data: u'EAFLTQKQKVGELKDDDFEKISELGAGNGGVVFKVSHKPSGLVMARKLIH'
    ...
    End element: sequence
    ...

Is there a way to get the character data together in one string? I guess it should not be difficult, but I can't work out how to do it. Each time the parser reads a line of the sequence, it calls the handler with just that line, and I want to have the whole sequence in one variable.

(The file is here: http://sbassi.googlepages.com/smallUniprot.xml)
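
To make the goal concrete, here is a minimal sketch of one possible approach: collect every chunk passed to the character data handler while inside a sequence element, and join the chunks when the element ends. The class name SequenceCollector, the function collect_sequences, and the tiny sample document are hypothetical and made up for illustration (the sample is not taken from smallUniprot.xml). The sketch is written for Python 3, so print is a function, unlike in the code above. Expat also has a buffer_text attribute that merges adjacent chunks, but the newlines between lines would still need to be removed, so the sketch joins and strips the text itself.

```python
import xml.parsers.expat


class SequenceCollector(object):
    """Hypothetical helper: gathers the text of each <sequence> element."""

    def __init__(self):
        self.in_sequence = False
        self.chunks = []      # pieces of the sequence currently being read
        self.sequences = []   # one finished string per <sequence> element

    def start_element(self, name, attrs):
        if name == 'sequence':
            self.in_sequence = True
            self.chunks = []

    def char_data(self, data):
        # Expat may call this several times per element, so only store the piece.
        if self.in_sequence:
            self.chunks.append(data)

    def end_element(self, name):
        if name == 'sequence':
            text = ''.join(self.chunks)
            # split() with no arguments drops the newlines between the lines.
            self.sequences.append(''.join(text.split()))
            self.in_sequence = False


def collect_sequences(xml_bytes):
    """Parse xml_bytes and return a list with one string per <sequence>."""
    collector = SequenceCollector()
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = collector.start_element
    p.EndElementHandler = collector.end_element
    p.CharacterDataHandler = collector.char_data
    p.Parse(xml_bytes, True)  # True marks this as the final piece of input
    return collector.sequences


# Made-up sample input, only to show the shape of the result.
sample = b"""<entry><sequence length="10" checksum="X">
MPKKK
PTPIQ
</sequence></entry>"""

print(collect_sequences(sample))
```

For a real file, the same handlers could be attached to a parser that is then given a file opened in binary mode through p.ParseFile(fh).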