Re: difference between urllib2.urlopen and firefox view 'page source'?

zacherates Mon, 19 Mar 2007 19:07:40 -0800

On Mar 19, 10:30 pm, "cjl" <[EMAIL PROTECTED]> wrote:
> Hi.
>
> I am trying to screen scrape some stock data from yahoo, so I am
> trying to use urllib2 to retrieve the html and beautiful soup for the
> parsing.
>
> Maybe (most likely) I am doing something wrong, but when I use
> urllib2.urlopen to fetch a page, and when I view 'page source' of the
> exact same URL in firefox, I am seeing slight differences in the raw
> html.
>
> Do I need to set a browser agent so yahoo thinks urllib2 is firefox?
> Is yahoo detecting that urllib2 doesn't process javascript, and
> passing different data?
>
> -cjl


http://developer.yahoo.com/yui/articles/gbs/index.html seems to
indicate that Yahoo! serves different markup depending on which
grade your browser falls into.  I'm not sure I'd spoof your
User-Agent; after all, your client is unlikely to support the
features that they're looking for in Firefox (JavaScript, CSS, SVG).
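
If you want to check whether the User-Agent header is really what
changes the markup, something like the rough sketch below might help.
It fetches the same URL with and without an explicit User-Agent and
diffs the two results.  The helper names (build_request, fetch,
html_diff) are made up for illustration, the UTF-8 decoding is an
assumption about the page's encoding, and the import falls back to
urllib.request in case you are on Python 3 rather than urllib2.

```python
import difflib

try:
    from urllib2 import Request, urlopen  # Python 2, as used in this thread
except ImportError:
    from urllib.request import Request, urlopen  # Python 3 equivalent


def build_request(url, user_agent=None):
    """Build a Request, optionally carrying an explicit User-Agent header."""
    headers = {}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return Request(url, headers=headers)


def fetch(url, user_agent=None):
    """Fetch url and return the body as text (assumes UTF-8; bad bytes replaced)."""
    response = urlopen(build_request(url, user_agent))
    try:
        return response.read().decode("utf-8", "replace")
    finally:
        response.close()


def html_diff(default_html, spoofed_html):
    """Return unified-diff lines between the two fetched documents."""
    return list(difflib.unified_diff(
        default_html.splitlines(),
        spoofed_html.splitlines(),
        "default", "spoofed",
        lineterm=""))
```

To use it, call fetch(url) once as-is and once with a browser-like
string of your choosing as user_agent, then print the lines returned
by html_diff for the two results.  An empty diff would suggest the
differences you are seeing come from something other than the
User-Agent.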

-- 
http://mail.python.org/mailman/listinfo/python-list
