On 4-9-2013 13:12, mukesh tiwari wrote:
> Hello all,
> I am trying to download the feed of http://blogs.forrester.com/feed but I
> am stuck with a problem.
>
> >>> import feedparser
> >>> d = feedparser.parse('http://blogs.forrester.com/feed')
> >>> d.etag
> u'"1378291653-1"'
> >>> d.modified
> 'Wed, 04 Sep 2013 10:47:33 +0000'
>
> >>> feedparser.parse('http://blogs.forrester.com/feed', etag=d.etag,
> ...                  modified=d.modified).status
> 200
>
> When I run this, shouldn't the status be 304? (The content can't have
> changed that quickly; or is this server not configured properly?) If I rely
> on this, then whenever I run the code I will download the content regardless
> of whether it has changed. Could someone please suggest how to avoid the
> duplicate download?

No, it's correct: repeatedly downloading that URL gives me a different etag
and last-modified header in the server's response each time. Their server is
very likely generating the data on the fly every time you retrieve that
location.

Why do you assume this can't change so fast? It is very likely not a static
file that is being retrieved, but rather a piece of content that is generated
for every request by their server application.

> The below one is working fine, so if I try to download again then I will get
> a 304 response since no data has changed on the server.
>
> >>> d = feedparser.parse("feed://feeds.huffingtonpost.com/HP/MostPopular")
> >>> d.etag

I presume the scheme there should be http:// rather than feed://.

But yeah, that URL gives the same etag and last-modified header in the
response when repeatedly downloading it. This is probably a static file that
is updated once in a while.

Irmen
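
For reference, here is a rough sketch of what such a conditional request looks like at the HTTP level, using only the standard library (Python 3) instead of feedparser. The function name conditional_get and its opener parameter are my own inventions for illustration; they are not part of feedparser or urllib. The client sends back the validators it got last time, and the server answers 304 only if it still considers them current. A server that issues a fresh etag on every request will therefore always answer 200.

```python
import urllib.error
import urllib.request


def conditional_get(url, etag=None, modified=None,
                    opener=urllib.request.urlopen):
    """Fetch url, sending back validators from an earlier response.

    Hypothetical helper, not a feedparser API.
    Returns (status, body, etag, modified); body is None on a 304.
    """
    request = urllib.request.Request(url)
    if etag:
        # Ask the server to skip the body if the entity tag still matches.
        request.add_header("If-None-Match", etag)
    if modified:
        # Same idea, based on the Last-Modified date we saw before.
        request.add_header("If-Modified-Since", modified)
    try:
        with opener(request) as response:
            return (response.status,
                    response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return 304, None, etag, modified
        raise
```

Calling this twice in a row on the same URL, passing the etag and modified values from the first result into the second call, shows which kind of server you are dealing with: a second status of 304 means the validators are stable, while a second 200 with a different etag means the content is regenerated per request, as appears to be the case for the forrester feed.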