> > Using regex to remove HTML is usually the wrong approach unless
>
> Thanks. This is one of those projects I've had in mind for a long
> time, decided it was a good way to learn some python.

It's a good way to write increasingly complex regex! Because HTML is
recursive in nature, it is almost impossible to parse HTML files
reliably with regex. (The latest regex syntax can cope with recursion,
but it's horribly complicated.) So unless you accept the limitations
of the method, you may well become more frustrated by the regex stuff
than experienced in Python.

Alan G.
"When all you have is a hammer everything looks like a nail"

_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor
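
To illustrate the point about regex limitations, here is a minimal sketch that compares a naive tag-stripping regex with the standard library's html.parser module. The names TextExtractor, strip_tags and strip_tags_regex are made up for this example, and the choice to skip script and style bodies is an assumption about what "removing HTML" should mean, not something from the original thread.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment using a real parser."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


def strip_tags_regex(html):
    """Naive approach: delete anything that looks like <...>."""
    return re.sub(r"<[^>]+>", "", html)


if __name__ == "__main__":
    sample = '<p title="a > b">Hello <b>world</b></p>'
    print(repr(strip_tags(sample)))        # 'Hello world'
    print(repr(strip_tags_regex(sample)))  # ' b">Hello world'
```

The regex stops at the first ">" it sees, even when that character sits inside a quoted attribute value, so part of the tag leaks into the output. The parser tracks where each tag really ends, which is why it is usually the safer starting point for real-world pages.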
