ProvoWallis wrote: <snip>
From what I gather here is a quickie, probably better solutions on the way but this accomplishes the idea I think. Some helpful links: http://docs.python.org/lib/module-sgmllib.html http://docs.python.org/lib/module-HTMLParser.html http://docs.python.org/lib/module-htmllib.html --- from HTMLParser import HTMLParser data = """<main-section no="1"> <form id="graphic_1.tif"> <form id="graphic_2.tif"> <main-section no="2"> <form id="graphic_3.tif"> <main-section no="3"> <form id="graphic_4.tif"> <form id="graphic_5.tif"> <form id="graphic_6.tif"> """ class ParseForms(HTMLParser): def handle_starttag(self, tag, attrs): if tag == "form": # attrs argument is a list of tuples [(attribute, value)] # converted it to a dictionary to access attribute easier print "form id: %s" % dict(attrs).get('id') if __name__ == "__main__": parser = ParseForms() parser.feed(data) -- http://mail.python.org/mailman/listinfo/python-list