Re: HTML Parsing and Indexing

2006-11-16 Thread Paul McGuire
On Nov 13, 1:12 pm, [EMAIL PROTECTED] wrote:
> I need help with an HTML parser.
>
> I saw a couple of Python parsers like pyparsing, yappy, yapps, etc., but
> they haven't given any example for HTML parsing.

Geez, how hard did you look? pyparsing's wiki menu includes an 'Examples' link, which take

Re: HTML Parsing and Indexing

2006-11-13 Thread Stefan Behnel
[EMAIL PROTECTED] wrote:
> I am involved in a project that aims to collect news
> information published on selected, known web sites in the format of
> HTML, RSS, etc., shortlist it, and create a bookmark on our website
> for the news content (we will use Django for web development). Curren

Re: HTML Parsing and Indexing

2006-11-13 Thread Andy Dingley
[EMAIL PROTECTED] wrote:
> I am involved in a project that aims to collect news
> information published on selected, known web sites in the format of
> HTML, RSS, etc.

I just can't imagine why anyone would still want to do this. With RSS, it's an easy (if not trivial) problem. With HTML
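
To illustrate why the RSS half of the job is the easy part, here is a minimal sketch that pulls headline and link pairs out of a feed with the standard library's `xml.etree.ElementTree`. It assumes a plain RSS 2.0 document with no namespaces; the sample feed, the `rss_items` helper, and the example.com URLs are made up for illustration and are not from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document; a real collector would download this
# from each news site's feed URL instead of hard-coding it.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>http://example.com/news/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/news/2</link>
    </item>
  </channel>
</rss>
"""


def rss_items(xml_text):
    """Return a list of (title, link) pairs, one per <item> element."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in rss_items(SAMPLE_RSS):
        print(title, link)
```

Real-world feeds vary (Atom, RDF-based RSS 1.0, namespaced extensions), so a production collector would likely need more than this.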

Re: HTML Parsing and Indexing

2006-11-13 Thread Bernard
A combination of urllib, urllib2 and BeautifulSoup should do it. Read BeautifulSoup's documentation to learn how to browse through the DOM.

[EMAIL PROTECTED] wrote:
> Hi All,
>
> I am involved in a project that aims to collect news
> information published on selected, known web sites int
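
As a rough sketch of the fetch-then-walk-the-markup approach suggested above, the following collects link text and targets from an HTML page. To stay dependency-free it swaps in the standard library's `html.parser` for BeautifulSoup, so it is not BeautifulSoup's API; the `LinkCollector` class, `extract_links` helper, and sample markup are hypothetical. The download step (urllib/urllib2 in the post) is left out, and the page is assumed to be already in a string.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real collector would download this first.
SAMPLE_HTML = """<html><body>
<h1>News</h1>
<ul>
  <li><a href="/story/1">Council approves budget</a></li>
  <li><a href="/story/2">Local team <b>wins</b> final</a></li>
</ul>
<a name="top">no href here</a>
</body></html>"""


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html_text):
    """Parse html_text and return the collected (text, href) pairs."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

BeautifulSoup is generally more forgiving of broken markup than this event-based parser, which is presumably why it was recommended here.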

Re: HTML Parsing and Indexing

2006-11-13 Thread Fredrik Lundh
[EMAIL PROTECTED] wrote:
> I need help with an HTML parser.

http://www.effbot.org/pyfaq/tutor-how-do-i-get-data-out-of-html.htm