I'm trying to parse HTML in a very generic way. So far I've been using SGMLParser from the sgmllib module. The problem is that it has you handle specific tags through per-tag methods such as start_a() and start_p(), so you need to know in advance exactly which tags you want to handle. I want to handle the start tag of any and all elements, as one can with the Xerces C++ XML parser. In other words, I would like a single start() method that is called whenever any tag is encountered. How can I do this? Thank you.
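
To make it concrete, here is a rough sketch of the behaviour I'm after. It assumes Python 3, where sgmllib is no longer available, and uses html.parser.HTMLParser instead, whose handle_starttag() is called for every opening tag. The names AnyTagParser and start_tags are made up for illustration.

```python
from html.parser import HTMLParser


class AnyTagParser(HTMLParser):
    """Hypothetical parser that records every start tag, whatever its name."""

    def __init__(self):
        super().__init__()
        self.seen = []  # (tag, attributes) pairs in document order

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; attrs is a list of (name, value) pairs.
        self.seen.append((tag, dict(attrs)))


def start_tags(html):
    """Return every start tag found in the given HTML string."""
    parser = AnyTagParser()
    parser.feed(html)
    parser.close()
    return parser.seen


if __name__ == "__main__":
    for tag, attrs in start_tags('<p class="x">hi <a href="u">link</a></p>'):
        print(tag, attrs)
```

As far as I can tell, SGMLParser itself has a similar catch-all: unknown_starttag(tag, attrs) is called for any tag that has no start_xxx() or do_xxx() method defined, so overriding just that one method might give the same effect under sgmllib.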