Tina I:
> I have a small, probably trivial even, problem. I have the following HTML:
This is a small data-munging problem. If it's a one-shot job, you can load the page in a browser, copy and paste it as text, and then process the lines in a simple way: split each line on ":" and feed the stripped pairs to a dict.

If there are more HTML files, or you want to automate things further, you can use html2text:
http://www.aaronsw.com/2002/html2text/

A little script like this may help you:

    from html2text import html2text

    # the_html_data is the HTML source, as a string
    txt = html2text(the_html_data)
    lines = str(txt).replace("**", "").strip().splitlines()
    # Split on the first ":" only and skip lines that have none,
    # so that dict() always receives two-item pairs.
    fields = [[field.strip() for field in line.split(":", 1)]
              for line in lines if ":" in line]
    print(dict(fields))

Note that splitlines() is tricky: if you run into problems, you may want a smarter splitter.

Bye,
bearophile

-- 
http://mail.python.org/mailman/listinfo/python-list
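For the one-shot copy-and-paste route described above, a minimal sketch might look like the following. It assumes the pasted text has one "label: value" pair per line; the name parse_fields and the sample text are made up for illustration and are not taken from the original HTML.

```python
def parse_fields(text):
    """Build a dict from lines shaped like 'label: value'."""
    fields = {}
    for line in text.splitlines():
        # Skip blank lines and lines that contain no colon.
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical pasted text, not the HTML from the original question.
sample = """
Name: Example
Time: 12:30
"""
result = parse_fields(sample)
print(result)
```

Splitting on the first colon only keeps values such as times or URLs intact, which a plain split(":") would break apart.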