from:"florent"

trying to parse non valid html documents with HTMLParser

2005-08-02 Thread florent

I'm trying to parse HTML documents from the web, using the HTMLParser class of the HTMLParser module (Python 2.3), but some web documents are not fully valid. When the parser finds an invalid tag, it raises an exception. Then it seems impossible to resume the parsing just after where the excep

Re: trying to parse non valid html documents with HTMLParser

2005-08-03 Thread florent

> AFAIK not with HTMLParser or htmllib. You might try (if you haven't done yet) htmllib and see, which parser is more forgiving.
Thanks, I'll try htmllib. Otherwise, I found a solution: feeding data to the HTMLParser in chunks extracted from the string using string.split("<") will allow me
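The chunk-by-chunk idea mentioned in this message can be sketched roughly as below. This is only an illustration under several assumptions: it is written for Python 3, where the class lives in html.parser and is already lenient (the HTMLParseError that Python 2 raised no longer exists), so the except branch is mostly illustrative. TagCollector and feed_in_chunks are hypothetical names, not part of the original post.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical handler that records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def feed_in_chunks(parser, html):
    """Feed html one '<'-delimited chunk at a time; return the chunks that failed."""
    skipped = []
    pieces = html.split("<")
    # The first piece is the text before the first '<';
    # every later piece lost its leading '<' in the split, so put it back.
    chunks = [pieces[0]] + ["<" + piece for piece in pieces[1:]]
    for chunk in chunks:
        if not chunk:
            continue
        try:
            parser.feed(chunk)
        except Exception:
            # Assumption: a bad chunk raises. Drop the parser's buffered
            # input and carry on with the next chunk.
            skipped.append(chunk)
            parser.reset()
    parser.close()
    return skipped


collector = TagCollector()
bad_chunks = feed_in_chunks(collector, "<p>Hello <b>world</b></p>")
print(collector.tags, bad_chunks)
```

Because each chunk starts at a "<", a failure costs at most one tag and the text that follows it, which is the point of the approach. Splitting on "<" will also cut through any literal "<" inside scripts or attribute values, so this is a workaround rather than a robust parser.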

Re: trying to parse non valid html documents with HTMLParser

2005-08-03 Thread florent

> From http://www.crummy.com/software/BeautifulSoup/:
>
> You didn't write that awful page. You're just trying to get some data out of it. Right now, you don't really care what HTML is supposed to look like.
>
> Neither does this parser.
True, I just want to extract some dat

Re: trying to parse non valid html documents with HTMLParser

2005-08-03 Thread florent

> AFAIK not with HTMLParser or htmllib. You might try (if you haven't done yet) htmllib and see, which parser is more forgiving.
You were right, the HTMLParser of htmllib is more permissive. It just ignores the bad tags!
-- http://mail.python.org/mailman/listinfo/python-list

Re: trying to parse non valid html documents with HTMLParser

2005-08-03 Thread florent

> Are you saying that Beautiful Soup can't parse the HTML? If so, I'm sure the author would like an example so he can "fix" it.
In the end I used the htmllib module, which is more permissive than the HTMLParser module when parsing bad HTML documents. Anyway, where can I find the author's contact in

Pylint bug in python35 b2

2015-07-02 Thread Florent Quesselaire

installer is amazing. thanks. Regards, Florent Quesselaire

Re: Embedding a binary file in a python script

2006-02-20 Thread Florent Manens

"" to your source code.
In [9]: data = """eJxzys/Lyi8tUgQADecDAQ==
   ...: """
In [10]: data.decode("base64").decode("zlib")
Out[10]: 'Bonjour!'
If you want to create a .exe with files included, take a look at PyInstaller. Hope this helps.
-- Florent Manens http://grossac.org
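The session above relies on the Python 2 string codecs "base64" and "zlib", which str.decode no longer offers in Python 3. A rough Python 3 equivalent, assuming the standard base64 and zlib modules, might look like this; pack and unpack are hypothetical helper names, and EMBEDDED is the string from the message.

```python
import base64
import zlib

# The payload from the message: zlib-compressed bytes, base64-encoded.
EMBEDDED = "eJxzys/Lyi8tUgQADecDAQ=="


def pack(raw):
    """Compress bytes and return ASCII text that can be pasted into a script."""
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def unpack(text):
    """Reverse pack(): base64-decode, then decompress, returning bytes."""
    return zlib.decompress(base64.b64decode(text))


print(unpack(EMBEDDED))
```

To embed a real file, one could run pack(open(path, "rb").read()) once, paste the resulting text into the script as a string literal, and call unpack on it at run time.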

7 matches
