In <[EMAIL PROTECTED]>, Thomas Ploch wrote:

> Alright, my prof said '... to process documents written in structural
> markup languages using regular expressions is a no-no.' (Because of
> nested Elements? Can't remember) So I think he wants us to use regexes
> to learn them. He is pointing to HTMLParser though.
The problem is that much of the HTML in the wild is nominally written in a structured markup language but is in many cases broken. If you just want to search for some words or patterns that appear somewhere in the documents, then regular expressions are good enough. If you want to actually *parse* HTML "from the wild", better use the BeautifulSoup_ parser.

.. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/

> You are probably right. For me it boils down to these problems:
> - Implementing a stack for large queues of documents which is faster
> than list.pop(index) (Is there a lib for this?)

If you need a queue then use one: take a look at `collections.deque` or the `Queue` module (renamed `queue` in Python 3) in the standard library.

Ciao,
	Marc 'BlackJack' Rintsch
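
PS: A minimal sketch of the `collections.deque` suggestion, assuming the pending documents are tracked as plain file names (the names below are made up for illustration). Unlike `list.pop(0)`, which has to shift every remaining element, `popleft()` takes constant time:

```python
from collections import deque

# Hypothetical document names, used only for illustration.
pending = deque(["a.html", "b.html", "c.html"])

pending.append("d.html")    # enqueue at the right end
first = pending.popleft()   # dequeue from the left end in O(1)
last = pending.pop()        # stack-style pop from the right end, also O(1)

print(first, last, list(pending))
```

If several threads are supposed to consume the documents, the `Queue` module is probably the better fit, since it does the locking for you.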