Sebastian Bassi wrote:
> I have this code:
>
> import xml.parsers.expat
> def start_element(name, attrs):
>     print 'Start element:', name, attrs
> def end_element(name):
>     print 'End element:', name
> def char_data(data):
>     print 'Character data:', repr(data)
> p = xml.parsers.expat.ParserCreate()
> p.StartElementHandler = start_element
> p.EndElementHandler = end_element
> p.CharacterDataHandler = char_data
> fh=open("/home/sbassi/bioinfo/smallUniprot.xml","r")
> p.ParseFile(fh)
>
> And I get this on the output:
>
> ...
> Start element: sequence {u'checksum': u'E0C0CC2E1F189B8A', u'length':
> u'393'}
> Character data: u'\n'
> Character data: u'MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKLEELELDEQQRKRL'
> Character data: u'\n'
> Character data: u'EAFLTQKQKVGELKDDDFEKISELGAGNGGVVFKVSHKPSGLVMARKLIH'
> ...
> End element: sequence
> ...
>
> Is there a way to have the character data together in one string? I
> guess it should not be difficult, but I can't do it. Each time the
> parse reads a line, return a line, and I want to have it in one
> variable.
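
expat may hand the text of a single element to the CharacterDataHandler in several pieces, so one common approach is to collect the pieces in a list and join them when the end tag arrives. Below is a minimal sketch of that idea, written for Python 3 (the quoted code uses Python 2 print statements). The inline sample document, its <entry> wrapper, the attribute values, and the SequenceCollector class are all made up for illustration; only the two sequence lines come from the output quoted above.

```python
import xml.parsers.expat

# Hypothetical stand-in for smallUniprot.xml; the real file will differ.
SAMPLE = b"""<entry>
<sequence length="100" checksum="PLACEHOLDER">
MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKLEELELDEQQRKRL
EAFLTQKQKVGELKDDDFEKISELGAGNGGVVFKVSHKPSGLVMARKLIH
</sequence>
</entry>"""


class SequenceCollector:
    """Buffers character data seen between <sequence> and </sequence>."""

    def __init__(self):
        self.chunks = None    # None means "currently outside <sequence>"
        self.sequences = []   # one joined string per <sequence> element

    def start_element(self, name, attrs):
        if name == "sequence":
            self.chunks = []  # start a fresh buffer

    def char_data(self, data):
        if self.chunks is not None:
            self.chunks.append(data)

    def end_element(self, name):
        if name == "sequence":
            # Join the pieces, then drop the newlines between the lines.
            text = "".join(self.chunks)
            self.sequences.append("".join(text.split()))
            self.chunks = None


collector = SequenceCollector()
p = xml.parsers.expat.ParserCreate()
p.StartElementHandler = collector.start_element
p.EndElementHandler = collector.end_element
p.CharacterDataHandler = collector.char_data
p.Parse(SAMPLE, True)  # True marks this as the final chunk of input

print(collector.sequences)
```

To read from a file instead, open it in binary mode and call p.ParseFile(fh) as in the quoted code. Setting p.buffer_text = True before parsing also asks expat to merge adjacent text where it can, but the newlines would still need stripping, so the explicit buffer above is probably the safer route.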

Any reason you are using expat and not cElementTree's iterparse?

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
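
For comparison, here is a rough sketch of the iterparse route. It assumes Python 3, where the cElementTree module no longer exists and xml.etree.ElementTree provides the same iterparse function. The sample document and the iter_sequences helper are hypothetical, not taken from the original post.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for smallUniprot.xml; the real file will differ.
SAMPLE = b"""<entry>
<sequence length="100" checksum="PLACEHOLDER">
MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKLEELELDEQQRKRL
EAFLTQKQKVGELKDDDFEKISELGAGNGGVVFKVSHKPSGLVMARKLIH
</sequence>
</entry>"""


def iter_sequences(source):
    """Yield the text of each <sequence> element with whitespace removed."""
    # "end" events fire once an element is complete, so elem.text is whole.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "sequence":
            yield "".join((elem.text or "").split())
            elem.clear()  # free the element's contents after reading it


sequences = list(iter_sequences(io.BytesIO(SAMPLE)))
print(sequences)
```

iterparse also accepts a filename in place of the file object. If the real file declares a default XML namespace, the tag will show up as "{namespace-uri}sequence" rather than plain "sequence", and the comparison in iter_sequences would need adjusting accordingly.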