asit wrote:
> import httplib
>
> class Server:
>     #server class
>     def __init__(self, host):
>         self.host = host
>     def fetch(self, path):
>         http = httplib.HTTPConnection(self.host)
>         http.request("GET", path)
>         r = http.getresponse()
>         print str(r.status) + " : " + r.reason
>
> server = Server("www.python.org")
> fp=open("phpvuln.txt")
> x=fp.readlines();
> for y in x:
>     server.fetch("/" + y);
>
> The above code fetches only the html source of the webpage. How to get
> the image, flash animation and other stuffs ????

Parse the returned HTML, extract the img tags and any other tags that reference external resources, and fetch each of those URLs explicitly. Good search keywords for this are "python" together with "crawling" or "crawler". Several such tools already exist, e.g. here:

http://weblab.infosci.cornell.edu/documentation/webbackscript

Diez
--
http://mail.python.org/mailman/listinfo/python-list
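
To make the "parse, extract, fetch explicitly" idea concrete, here is a minimal sketch. It is not taken from the crawler linked above. It assumes Python 3 standard-library modules (html.parser, urllib) rather than the httplib used in the quoted code, and the names REFERENCE_ATTRS, ReferenceCollector, extract_references and fetch_all are made up for illustration. The list of tags is also an assumption and will likely need extending for real pages.

```python
# Sketch only: collect the URLs a page references so they can be fetched separately.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: these tag/attribute pairs cover the resources of interest.
REFERENCE_ATTRS = {
    "img": "src",      # images
    "script": "src",   # external scripts
    "embed": "src",    # e.g. flash animations
    "link": "href",    # stylesheets, icons
}


class ReferenceCollector(HTMLParser):
    """Records the absolute URL of every referenced resource it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        wanted = REFERENCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page's own URL.
                self.references.append(urljoin(self.base_url, value))


def extract_references(html, base_url):
    """Return the absolute URLs referenced by the given HTML text."""
    collector = ReferenceCollector(base_url)
    collector.feed(html)
    return collector.references


def fetch_all(urls):
    """Fetch each URL explicitly; returns {url: bytes}. Needs network access."""
    results = {}
    for url in urls:
        with urlopen(url) as response:
            results[url] = response.read()
    return results
```

Usage would be something like extract_references(page_source, "http://www.python.org/") followed by fetch_all() on the result. Plain links (a href) are deliberately left out here; following those as well is what turns this into a crawler.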