In article <a8f10c4f-d4a0-48ed-ae92-2a43e9a09...@googlegroups.com>, Simon Evans <musicalhack...@yahoo.co.uk> wrote:
> Dear Programmers,
> I have been looking at the You tube 'Web Scraping Tutorials' of Chris Reeves.
> I have tried a few of his python programs in the Python27 command prompt, but
> altered them from accessing data using links say from the Dow Jones index, to
> accessing the details I would be interested in accessing from the 'Racing
> Post' on a daily basis. Anyhow, the code it returns is not in the example I
> am going to give, is not the information I am seeking, instead of returning
> the given odds on a horse, it only returns a [], which isn't much use.
> I would be glad if you could tell me where I am going wrong.

Rather than comment on your specific code (but thank you for posting it),
I'll make a couple of more generic suggestions.

First, if you're doing anything with fetching web pages, install the
wonderful requests module (http://docs.python-requests.org/en/latest/).
It's so much easier to work with than urllib.

Second, if you're going to be parsing web pages, trying to use regexes is
a losing game. You need something that knows how to parse HTML. The
canonical answer is lxml (http://lxml.de/), but Beautiful Soup
(http://www.crummy.com/software/BeautifulSoup/) is less intimidating to use.
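
To make those two suggestions concrete, here is a rough sketch of how requests and Beautiful Soup fit together. Everything about the page structure is an assumption on my part: I have not looked at the Racing Post markup, so SAMPLE_HTML, the "runner", "horse" and "odds" class names, and the fetch_html/parse_odds helpers are all made up for illustration. You would need to inspect the real page in a browser and adjust the selectors to match.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example. The real page will
# almost certainly use different tags and class names.
SAMPLE_HTML = """
<table id="runners">
  <tr class="runner"><td class="horse">Example Horse</td><td class="odds">5/1</td></tr>
  <tr class="runner"><td class="horse">Another Horse</td><td class="odds">11/4</td></tr>
</table>
"""


def fetch_html(url):
    """Download a page and return its text, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_odds(html):
    """Return a list of (horse, odds) pairs from the assumed markup."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.select("tr.runner"):
        horse = row.select_one("td.horse")
        odds = row.select_one("td.odds")
        # Skip rows that do not have both cells instead of crashing.
        if horse is not None and odds is not None:
            results.append((horse.get_text(strip=True),
                            odds.get_text(strip=True)))
    return results


if __name__ == "__main__":
    # Parse the inline sample so the sketch runs without network access.
    print(parse_odds(SAMPLE_HTML))
```

For a live page you would call parse_odds(fetch_html(url)) with the URL you care about. Note that if the selectors match nothing, parse_odds returns an empty list, much like the [] you are seeing from your regex, so an empty result usually means the pattern does not match what the server actually sent.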