Re: extract news article from web

2004-12-29 Thread Simon Brunning
On 22 Dec 2004 09:22:15 -0800, Zhang Le <[EMAIL PROTECTED]> wrote:
> Hello,
> I'm writing a little Tkinter application to retrieve news from
> various news websites such as http://news.bbc.co.uk/, and display them
> in a TK listbox. All I want are news title and url information.
Well, the BBC pub

RE: extract news article from web

2004-12-23 Thread Gabriel Cosentino de Barros
Excel in later versions of Office has the "web query" feature. (sorry about top posting) -----Original Message----- From: Steve Holden [mailto:[EMAIL PROTECTED]] Sent: Thursday, 23 December 2004 12:59 To: python-list@python.org Subject: R

Re: extract news article from web

2004-12-23 Thread Fuzzyman
If you have a reliably structured page, then you can write a custom parser. As Steve points out, BeautifulSoup would be a very good place to start. This is the problem that RSS was designed to solve. Many news sites will supply exactly the information you want as an RSS feed. You should then use U
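For readers following the RSS suggestion, here is a minimal sketch of pulling the title and URL of each item out of an RSS 2.0 document. It uses only the standard library's xml.etree.ElementTree rather than whichever parser the message goes on to recommend. The feed text, the example.com URLs, and the function name extract_headlines are all made up for illustration; a real feed may need namespace handling or a dedicated feed parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for illustration; not a real news feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/news/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/news/2</link>
    </item>
  </channel>
</rss>
"""


def extract_headlines(rss_text):
    """Return a list of (title, url) pairs, one per <item> element."""
    root = ET.fromstring(rss_text)
    headlines = []
    for item in root.iter("item"):
        # findtext returns None when the child is missing, hence the fallback.
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        headlines.append((title, link))
    return headlines


if __name__ == "__main__":
    for title, url in extract_headlines(SAMPLE_RSS):
        print(title, url)
```

The resulting (title, url) pairs are the two pieces of information the original question asks for, so they could be inserted into a listbox directly.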

Re: extract news article from web

2004-12-23 Thread Steve Holden
Zhang Le wrote: Thanks for the hint. The xml-rpc service is great, but I want some general techniques to parse news information in the usual html pages. Currently I'm looking at a script-based approach found at: http://www.namo.com/products/handstory/manual/hsceditor/ User can write some simple tem

Re: extract news article from web

2004-12-22 Thread Zhang Le
Thanks for the hint. The xml-rpc service is great, but I want some general techniques to parse news information in the usual html pages. Currently I'm looking at a script-based approach found at: http://www.namo.com/products/handstory/manual/hsceditor/ User can write some simple template to extrac
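As one possible general technique for ordinary HTML pages, the sketch below collects every link's text and href using the standard library's html.parser. This is not the template approach from the linked manual, and not BeautifulSoup either; it is a plain stand-in, and the class name LinkExtractor, the function extract_links, and the sample markup are hypothetical. On a real news page you would still need a site-specific filter to separate headline links from navigation links.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join fragments and collapse runs of whitespace.
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append((title, self._href))
            self._href = None


def extract_links(html_text):
    """Return a list of (title, url) pairs found in html_text."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page fragment, invented for illustration.
SAMPLE_HTML = """<html><body>
<h1>Example News</h1>
<a href="/news/1">First <b>headline</b></a>
<a name="top"></a>
<a href="/news/2">  Second headline </a>
</body></html>"""

if __name__ == "__main__":
    for title, url in extract_links(SAMPLE_HTML):
        print(title, url)
```

The hrefs come back exactly as written in the page, so relative links would still need resolving against the page URL, for example with urllib.parse.urljoin.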

Re: extract news article from web

2004-12-22 Thread Steve Holden
Steve Holden wrote: [...] However, the code to extract the news is pretty simple. Here's the whole program, modulo newsreader wrapping. It would be shorter if I weren't stashing the extracted links in a relational database: [...] I see that, as is so often the case, I only told half the story,

Re: extract news article from web

2004-12-22 Thread Steve Holden
Zhang Le wrote: Hello, I'm writing a little Tkinter application to retrieve news from various news websites such as http://news.bbc.co.uk/, and display them in a TK listbox. All I want are news title and url information. Since each news site has a different layout, I think I need some template-base