On Jul 6, 5:39 pm, TimB <timbova...@gmail.com> wrote:
> Hi everyone, new to python. I'm attempting to download a large amount
> of webpages (about 600) to disk and for some reason a few of them
> fail.
>
> I'm using this in a loop where pagename and urlStr change each time:
> import urllib
> try:
>     urllib.urlretrieve(urlStr, 'webpages/'+pagename+'.htm')
> except IOError:
>     print 'Cannot open URL %s for reading' % urlStr
>     str1 = 'error!'
>
> Out of all the webpages, it does not work for these three:
> http://exoplanet.eu/planet.php?p1=WASP-11/HAT-P-10&p2=b
> http://exoplanet.eu/planet.php?p1=HAT-P-27/WASP-40&p2=b
> http://exoplanet.eu/planet.php?p1=HAT-P-30/WASP-51&p2=b
> giving "Cannot open URL http://exoplanet.eu/planet.php?p1=WASP-11/HAT-P-10&p2=b
> for reading" etc.
>
> however copying and pasting the URL from the error message
> successfully opens in firefox
>
> it successfully downloads the 500 or so other pages such as:
> http://exoplanet.eu/planet.php?p1=HD+88133&p2=b
>
> I guess it has something to do with the forward slash in the names
> (e.g. HAT-P-30/WASP-51 compared to HD+88133 in the examples above)
>
> Is there a way I can fix this? Thanks.

Sorry, I was attempting to save the page to disk with the forward slash in the file name, so the IOError came from the local path and not from the URL. Disregard.
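
For anyone who hits the same thing, here is a minimal sketch of one way around it: strip path separators out of the name before building the local path. The helper names safe_filename and save_page are hypothetical, not anything from the original loop, and the sketch is written for Python 3, where urlretrieve lives in urllib.request (the quoted code uses the Python 2 urllib.urlretrieve).

```python
import os
import urllib.request


def safe_filename(pagename, replacement="_"):
    """Replace path separators so the name cannot point into a subdirectory.

    Hypothetical helper; the replacement character is an arbitrary choice.
    """
    for sep in ("/", "\\"):
        pagename = pagename.replace(sep, replacement)
    return pagename


def save_page(url, pagename, directory="webpages"):
    """Download url into directory under a sanitized file name."""
    # Create the target directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, safe_filename(pagename) + ".htm")
    # Raises an OSError subclass if the URL cannot be fetched or the file cannot be written.
    urllib.request.urlretrieve(url, target)
    return target


print(safe_filename("HAT-P-30/WASP-51"))  # prints HAT-P-30_WASP-51
```

With this, a pagename such as WASP-11/HAT-P-10 would be saved as webpages/WASP-11_HAT-P-10.htm, while the URL passed to urlretrieve is left untouched.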