> I want to save a web page. I use urllib to parse the web page. But I
> find the saved file, where some content is missing. The missing part
> is block from the original web page, such as this part <div
> style="display: block;" id="GeneInts">...</div>. I don't know how to
> parse a whole page without something block in it. Could you help me
> figure it out? Thank you!
>
> This is my program
>
> url = 'http://receptome.stanford.edu/hpmr/SearchDB/getGenePage.asp?Param=4502931&ProtId=1&ProtType=Receptor'
> f = urllib.urlretrieve(url,'test.html')

A web server may send different output depending on the client that
requests it. When you view the source in your browser and then compare
it with the file saved by urllib, you are accessing the server with two
different clients. I'm not saying this is your problem, but it could be.
You might want to make urllib look like a browser by sending the
appropriate headers, such as User-Agent.

HTH,
Daniel
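
For illustration, here is a minimal sketch of what sending browser-like
headers could look like. It is written for Python 3's urllib.request; on
the Python 2 used in the question the equivalent calls live in urllib2
(urllib2.Request and urllib2.urlopen). The helper names build_request and
save_page and the User-Agent string are assumptions made up for this
example, and there is no guarantee that this particular server varies its
output by client.

```python
import shutil
import urllib.request

# Hypothetical browser-like User-Agent string; any realistic value will do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request object that carries browser-like headers."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept": "text/html,application/xhtml+xml",
        },
    )


def save_page(url, filename, user_agent=BROWSER_UA):
    """Fetch url with browser-like headers and write the raw body to filename."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req) as resp, open(filename, "wb") as out:
        shutil.copyfileobj(resp, out)


# Usage (performs a real network request):
# url = ("http://receptome.stanford.edu/hpmr/SearchDB/getGenePage.asp?"
#        "Param=4502931&ProtId=1&ProtType=Receptor")
# save_page(url, "test.html")
```

If the saved file still differs from what the browser shows, the headers
were probably not the cause, as cautioned above.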