Stefan Behnel wrote:
> Sérgio Monteiro Basto wrote:
>> but is one single error that blocks this.
>> Finally I found it , it is :
>> <td colspan="2"align="center"
>> if I put :
>> <td colspan="2" align="center"
>>
>> p = re.compile('"align')
>> content = p.sub('" align', content)
>>
>> I can parse the html
>> I don't know if it a bug of HTMLParser
>
> Sure, and next time your key doesn't open your neighbours house, please
> report to the building company to have them fix the door.
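For reference, here is a rough sketch of the clean-up quoted above, followed by pulling the links out with the standard-library parser. It is written for a current Python 3 (under Python 2.x the import would be `from HTMLParser import HTMLParser`). The names `MISSING_SPACE`, `fix_missing_attr_space`, `LinkExtractor` and `extract_links` are made up for this example, and the regex is my own assumed generalisation of the `'"align'` substitution, not something HTMLParser provides.

```python
import re
from html.parser import HTMLParser  # the module is named "HTMLParser" in Python 2.x

# Hypothetical generalisation of the '"align' fix: when a quoted attribute
# value is followed directly by a letter, insert a space after the closing quote.
MISSING_SPACE = re.compile(r'(=\s*"[^"<>]*")(?=[A-Za-z])')


def fix_missing_attr_space(content):
    """Return content with a space added between glued-together attributes."""
    return MISSING_SPACE.sub(r'\1 ', content)


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def extract_links(content):
    """Clean up the markup first, then feed it to the parser."""
    parser = LinkExtractor()
    parser.feed(fix_missing_attr_space(content))
    parser.close()
    return parser.links


sample = ('<table><tr><td colspan="2"align="center">'
          '<a href="http://example.com/">x</a></td></tr></table>')
print(fix_missing_attr_space(sample))
print(extract_links(sample))
```

The substitution is applied to the whole page, so it could also touch text outside of tags that happens to look like an attribute; for link extraction that should be harmless, but it is a blunt tool and not a real HTML repair.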
The question here is whether <td colspan="2"align="center" is valid HTML or not. I think it is valid; if so, this is a bug in HTMLParser. If not, the error message is still very poor (EOF in middle of construct !?).

I have to use HTMLParser because I want to use only the Python 2.4 standard library: I have to install the scripts on many machines. I also have to parse many different sites, and I just want to extract the links, so a clean-up before parsing solves my problem very quickly.

Thanks,
--
Sérgio M. B.