Re: Suitable Python code to scrape specific details from web pages.

Roy Smith Tue, 12 Aug 2014 14:32:23 -0700

In article <a8f10c4f-d4a0-48ed-ae92-2a43e9a09...@googlegroups.com>,
 Simon Evans <musicalhack...@yahoo.co.uk> wrote:


> Dear Programmers,
> I have been looking at the YouTube 'Web Scraping Tutorials' of Chris Reeves. 
> I have tried a few of his Python programs in the Python27 command prompt, but 
> altered them from accessing data using links say from the Dow Jones index, to 
> accessing the details I would be interested in accessing from the 'Racing 
> Post' on a daily basis. However, the code in the example I am going to give 
> does not return the information I am seeking: instead of returning 
> the given odds on a horse, it only returns a [], which isn't much use. 
> I would be glad if you could tell me where I am going wrong. 

Rather than comment on your specific code (but, thank you for posting 
it), I'll make a couple of more generic suggestions.

First, if you're doing anything with fetching web pages, install the 
wonderful requests module (http://docs.python-requests.org/en/latest/).  
It's so much easier to work with than urllib.
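
To give a rough idea, here's a minimal sketch of fetching a page with 
requests.  The function name fetch_html and the 10 second timeout are my 
own choices for illustration, not anything taken from your code:

```python
import requests


def fetch_html(url, timeout=10.0):
    """Download a page and return its body as text.

    The name and the default timeout are illustrative assumptions.
    """
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of silently returning an error page.
    response.raise_for_status()
    return response.text


# Typical use (left as a comment so nothing is fetched on import):
#     html = fetch_html("http://www.example.com/")
```

Checking the status code up front is worth the extra line: if the server 
hands back an error page, you find out right away instead of wondering 
why the parsing step finds nothing.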

Second, if you're going to be parsing web pages, trying to use regexes 
is a losing game.  You need something that knows how to parse HTML.  The 
canonical answer is lxml (http://lxml.de/), but Beautiful Soup 
(http://www.crummy.com/software/BeautifulSoup/) is less intimidating to 
use.
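
As a rough illustration of the Beautiful Soup approach, here's a sketch 
that pulls horse names and odds out of some markup.  I haven't looked at 
how the Racing Post pages are actually laid out, so the HTML, the tag 
names and the class names (runner, horse, odds) below are all made up; 
you'd substitute whatever the real page uses:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page structure is not shown in this thread,
# so these tag and class names are invented for the example.
SAMPLE_HTML = """
<table>
  <tr class="runner"><td class="horse">Example Horse</td><td class="odds">5/1</td></tr>
  <tr class="runner"><td class="horse">Another Horse</td><td class="odds">11/4</td></tr>
</table>
"""


def extract_odds(html):
    """Return a list of (horse, odds) tuples found in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("tr", class_="runner"):
        horse = row.find("td", class_="horse")
        odds = row.find("td", class_="odds")
        # Skip rows that lack either cell rather than crashing on None.
        if horse is None or odds is None:
            continue
        results.append((horse.get_text(strip=True), odds.get_text(strip=True)))
    return results


print(extract_odds(SAMPLE_HTML))
# prints [('Example Horse', '5/1'), ('Another Horse', '11/4')]
```

If a function like this returns [], the usual suspects are that the tag 
or class names don't match what the server actually sent, or that the 
data is filled in later by JavaScript and isn't in the HTML at all.  
Printing the fetched HTML and searching it by hand for the odds you 
expect will tell you which.
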
-- 
https://mail.python.org/mailman/listinfo/python-list
