Hi. I am trying to collapse an HTML table into a single line. Basically, any time I see ">" and "<" with nothing but whitespace between them, I'd like to remove all the whitespace, including newlines. I've read the how-to and tried a bunch of things, but nothing seems to work for me:
--
table = open(r'D:\path\to\tabletest.txt', 'rb')
strTable = table.read()

#Below find the different sort of things I have tried, one at a time:

strTable = strTable.replace(">\s<", "><") #I got this from the module docs

strTable = strTable.replace(">.<", "><")

strTable = ">\s+<".join(strTable)

strTable = ">\s<".join(strTable)

print strTable
--

The table in question looks like this:

<table width="80%" border="0">
  <tr>
    <td> </td>
    <td colspan="2">Introduction</td>
    <td><div align="right">3</div></td>
  </tr>
  <tr>
    <td> </td>
  </tr>
  <tr>
    <td><i>ONE</i></td>
    <td colspan="2">Childraising for Parrots</td>
    <td><div align="right">11</div></td>
  </tr>
</table>

For extra kudos (and I confess I have been so stuck on the above problem I haven't put much thought into how to do this one) I'd like to be able to measure the number of characters between the <p> & </p> tags, and then insert a newline character at the end of the next word after an arbitrary number of characters..... I am reading in to a script a bunch of paragraphs formatted for a webpage, but they're all on one big long line and I would like to split them for readability.

TIA

Googleboy
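
For reference, here is a minimal sketch of one way to do both things described above. It is an illustration under stated assumptions, not a definitive solution. `str.replace` treats its arguments as literal text, so `">\s<"` is searched for as those exact characters and never acts as a pattern, and `">\s<".join(strTable)` inserts that string between every character of the table. Pattern matching needs `re.sub` instead. The function names `collapse_between_tags` and `wrap_paragraphs`, the `width` default, and the sample strings are hypothetical, chosen for the example. The wrapping step swaps in the standard `textwrap` module, which breaks a line before it would exceed `width` rather than after the word that crosses the count, and it assumes the `<p>` tags carry no attributes. Written for Python 3.

```python
import re
import textwrap


def collapse_between_tags(html):
    """Delete whitespace-only runs (spaces, tabs, newlines) between '>' and '<'."""
    # \s+ matches one or more whitespace characters, newlines included.
    return re.sub(r">\s+<", "><", html)


def wrap_paragraphs(html, width=70):
    """Re-wrap the text inside each <p>...</p> pair at word boundaries.

    Assumption: paragraphs open with a bare <p> tag (no attributes).
    """
    def rewrap(match):
        # Only the text between the tags is wrapped; the tags are put back as-is.
        body = textwrap.fill(
            match.group(2),
            width=width,
            break_long_words=False,   # never split a single long word
            break_on_hyphens=False,   # only break at whitespace
        )
        return match.group(1) + body + match.group(3)

    # re.DOTALL lets '.' span newlines already present inside a paragraph.
    return re.sub(r"(<p>)(.*?)(</p>)", rewrap, html, flags=re.DOTALL)


if __name__ == "__main__":
    # Hypothetical sample input, shortened from the table in the post.
    sample = '<tr>\n  <td> </td>\n  <td colspan="2">Introduction</td>\n</tr>'
    print(collapse_between_tags(sample))
    print(wrap_paragraphs("<p>" + "some words to wrap " * 10 + "</p>", width=40))
```

Note that a cell such as `<td> </td>` also loses its single space, since that space sits between ">" and "<" with nothing else around it.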