On 5/2/2016 11:27 PM, jf...@ms4.hinet.net wrote:
DFS at 2016/5/3 9:12:24AM wrote:
try
from urllib.request import urlretrieve
http://stackoverflow.com/questions/21171718/urllib-urlretrieve-file-python-3-3
I'm running python 2.7.11 (32-bit)
Alright, it works... in a way.
I tried to get a zip file. It works; the file can be unzipped correctly.
from urllib.request import urlretrieve
urlretrieve("http://www.caprilion.com.tw/fed.zip", "d:\\temp\\temp.zip")
('d:\\temp\\temp.zip', <http.client.HTTPMessage object at 0x03102C50>)
But when I try to get this forum page, it does fetch an HTML file, but the
file can't be viewed normally.
urlretrieve("https://groups.google.com/forum/#!topic/comp.lang.python/jFl3GJbmR7A", "d:\\temp\\temp.html")
('d:\\temp\\temp.html', <http.client.HTTPMessage object at 0x03102A90>)
I suppose HTML is a much more complex situation, where more processing needs
to be done before it can be opened by a web browser :-)
Who knows what Google has done... it won't open in Opera. The tab title
shows up, but after 20-30 seconds the screen just stays blank and the
cursor quits loading.
It's a mess - try running it thru BeautifulSoup.prettify() and it looks
better.
------------------------------------------------------------
# Python 2 code (print statement, BeautifulSoup 3 module name)
import urllib
import BeautifulSoup

webfile = "D:\\afile.html"
# Download the page to a local file
urllib.urlretrieve("https://groups.google.com/forum/#!topic/comp.lang.python/jFl3GJbmR7A", webfile)
# Parse the saved HTML and print it re-indented
f = open(webfile)
soup = BeautifulSoup.BeautifulSoup(f)
f.close()
print soup.prettify()
------------------------------------------------------------
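A possible reason the saved page stays blank: everything after the # in that URL is a fragment, and an HTTP client does not send the fragment to the server. So urlretrieve most likely fetched only the part before the #, presumably a page that relies on JavaScript to load the actual topic (that last part is an assumption, not something verified here). A minimal Python 3 sketch using urllib.parse.urldefrag from the standard library shows the split; the variable names are only for illustration:

```python
from urllib.parse import urldefrag

# URL copied from the thread above.
page_url = "https://groups.google.com/forum/#!topic/comp.lang.python/jFl3GJbmR7A"

# urldefrag() splits a URL into the part that is requested from the
# server and the fragment (everything after '#'), which stays client-side.
requested, fragment = urldefrag(page_url)

print("requested from server:", requested)
print("kept by the client:   ", fragment)
```

If that is what happens here, the downloaded file would not contain the topic text at all, which would fit the blank screen in the browser.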
--
https://mail.python.org/mailman/listinfo/python-list