Re: Downloading the feed using feedparser

Irmen de Jong Wed, 04 Sep 2013 11:32:37 -0700

On 4-9-2013 13:12, mukesh tiwari wrote:
> Hello all, I am trying to download the feed of 
> http://blogs.forrester.com/feed but I
> am stuck with a problem.
> 
>>>> import feedparser
>>>> d = feedparser.parse('http://blogs.forrester.com/feed')
>>>> d.etag
> u'"1378291653-1"'
>>>> d.modified
> 'Wed, 04 Sep 2013 10:47:33 +0000'
> 
>>>> feedparser.parse('http://blogs.forrester.com/feed', etag=d.etag,
>>>> modified=d.modified).status
> 200
> 
> When I run this, shouldn't the status be 304? (Either the content can't change
> so fast, or this server is not configured properly.) If I rely on this, then
> whenever I run the code I will download the content regardless of whether it
> has changed. Could someone please suggest how to avoid the duplicate download?


No, it's correct, because repeatedly downloading that URL gives me a different
etag and last-modified header in the server's response each time. Their server
is very likely generating the data on the fly every time you retrieve that
location. Why do you assume this can't change so fast? It is very likely not a
static file that is being retrieved, but rather a piece of content that is
generated for every request by their server application.
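
If the server really does issue fresh validators on every request, the download
itself probably can't be avoided from the client side. What can be avoided is
processing the same entries twice. Below is a minimal sketch of that idea,
assuming Python 3 and that each entry exposes an 'id' or 'link' key (feedparser
entries usually do, but that is an assumption about this feed). The helper name
new_entries and the seen_ids set are hypothetical, not part of feedparser.

```python
def new_entries(entries, seen_ids):
    """Return the entries not seen before and record their keys in seen_ids.

    `entries` is any iterable of dict-like objects (for example the
    `entries` list of a parsed feed). `seen_ids` is a set the caller
    keeps between runs. Both names are illustrative assumptions.
    """
    fresh = []
    for entry in entries:
        # Prefer the entry's id, fall back to its link.
        key = entry.get("id") or entry.get("link")
        if key is None or key in seen_ids:
            # No usable key, or already handled on an earlier run.
            continue
        seen_ids.add(key)
        fresh.append(entry)
    return fresh
```

You would call it as new_entries(d.entries, seen) after each
feedparser.parse(...) whose status is not 304, and store the set (in a file or
database) between runs.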


> 
> The one below works fine: if I try to download it again, I get a 304
> response, since no data has changed on the server.
> 
>>>> d = feedparser.parse("feed://feeds.huffingtonpost.com/HP/MostPopular") 
>>>> d.etag

http, I presume............^^^^

But yeah, that URL gives the same etag and last-modified header in the
response when repeatedly downloading it. This is probably a static file that
is updated once in a while.
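
The check described above (fetch the same URL twice and compare the headers)
can be sketched with the standard library alone. This assumes Python 3. The
function names fetch_validators and validators_stable are made up for
illustration, and a real check might want a timeout and error handling.

```python
import urllib.request


def fetch_validators(url):
    """Return the (ETag, Last-Modified) response headers from one GET of url."""
    with urllib.request.urlopen(url) as response:
        return (response.headers.get("ETag"),
                response.headers.get("Last-Modified"))


def validators_stable(first, second):
    """True if two (etag, last_modified) pairs are identical and not both empty.

    If the pairs differ between two back-to-back requests, a conditional
    request built from the first response is unlikely to yield a 304.
    """
    if first == (None, None):
        # The server sent no validators at all, so there is nothing to reuse.
        return False
    return first == second
```

Calling validators_stable(fetch_validators(url), fetch_validators(url)) for a
given feed URL gives a quick hint as to whether passing etag= and modified= to
feedparser.parse() can ever produce a 304 for that server.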

Irmen
-- 
https://mail.python.org/mailman/listinfo/python-list
