Ezio Melotti added the comment:

Sorry, I misread your code; it looks like you want the href *without* 'cve'. In that case, change my code to use "'cve' not in attrs['href']". Also avoid s.find('cve') == -1 and use the more readable and idiomatic 'cve' not in s instead.
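A minimal sketch of that check, for illustration only. It is not the code from the earlier comment or the reporter's script. It assumes Python 3, where HTMLParser lives in the html.parser module (in Python 2 the module is named HTMLParser). The names HrefCollector, collect_hrefs and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag whose href does not contain 'cve'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        if tag != 'a':
            return
        # attrs is a list of (name, value) pairs; value is None for
        # attributes written without a value, hence the explicit check.
        href = dict(attrs).get('href')
        if href is not None and 'cve' not in href:
            self.hrefs.append(href)


def collect_hrefs(html):
    """Feed a complete HTML string to the parser and return matching hrefs."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical markup: one link with a "style" attribute, one 'cve' link.
sample = (
    '<table><tr><td>'
    '<a style="x" href="/advisory/1">A</a> '
    '<a href="/cve/2012-1">C</a>'
    '</td></tr></table>'
)
print(collect_hrefs(sample))
```

Appending each match to a list, rather than assigning to a single attribute, means a later <a> tag cannot overwrite an href that was already found.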
I think your original script doesn't work for two reasons:
1) you are looking for a table with class="tablesorter", but in the HTML the table doesn't have that class, so self.is_table is never set to True;
2) you are finding the href of the <a> with a "style" attribute and correctly setting it to self.href_name, but the value is then replaced by "" when the following <a> without "style" is found.

That said, I still suggest that you abandon sgmllib and use HTMLParser, or possibly an external module like BeautifulSoup or LXML.

----------
resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16513>
_______________________________________