On Fri, Jun 8, 2012 at 4:09 PM, kracethekingmaker <kracethekingma...@gmail.com> wrote:
>
>> Hello,
>>
>> I am newbie to Python coding. And, I had a question. I want to write a
>> script which will check content changes in websites & send e-mail to a
>> admin whenever there are changes.
>
> How many times in a day or how often will this check be performed ?
>
> You must look into how to use md5, diff utilities, for web scraping scrapy
> library is advised.
>
>> Ideally this script/program should be scalable for say about 1000 websites
>> at a time..
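
On the md5 and diff suggestion quoted above, here is a rough sketch of the
idea, not a finished tool. It assumes the page body has already been
fetched as bytes, and it keeps the previous copy in a plain dict; the names
fingerprint, check_page and seen are made up for illustration. A real setup
would fetch the pages, keep the state in a database, and mail the diff to
the admin.

```python
import difflib
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the md5 hex digest of a page body."""
    return hashlib.md5(content).hexdigest()


def check_page(url, content, seen):
    """Compare a freshly fetched page body with the stored copy for url.

    seen maps url -> (digest, text) from the previous check. Returns None
    the first time a url is seen or when the digest is unchanged, otherwise
    a unified diff of the old and new text.
    """
    digest = fingerprint(content)
    text = content.decode("utf-8", errors="replace")
    previous = seen.get(url)
    seen[url] = (digest, text)  # remember this copy for the next check
    if previous is None or previous[0] == digest:
        return None
    diff = difflib.unified_diff(
        previous[1].splitlines(),
        text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(diff)


# Canned page bodies stand in for real HTTP responses here.
seen = {}
first = check_page("http://example.com/", b"<p>hello</p>", seen)
same = check_page("http://example.com/", b"<p>hello</p>", seen)
changed = check_page("http://example.com/", b"<p>goodbye</p>", seen)
print(changed)
```

Comparing digests first keeps the common "nothing changed" case cheap; the
diff is only built when the digest differs, and that diff text is what
would go into the e-mail body.
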
1000 sites at a time? Wow, that's huge. Scraping that many sites is
resource intensive, so you would need a big, stable server that can handle
the huge data dumps. FWIW, Scrapy will only dump the scraped data into
files (JSON, for example), so read up a little on the database you want to
use, the frontend to serve it, a queueing system to scale to 1000 sites,
etc.

Also, some sites ban scrapers almost instantly. Watch out for that, and
good luck :)

--
Regards,
Vid ॥ http://svaksha.com ॥
_______________________________________________
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers