If you need to check for the absence of certain content, then write tests
using Sahi or Selenium, and run those at periodic intervals.
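
For plain change detection, the md5 idea suggested further down the thread
can be sketched with the standard library alone. This is only a minimal
sketch under assumptions: the helper names (fetch, fingerprint, has_changed,
missing_phrases) and the in-memory "seen" dictionary are hypothetical and do
not come from anyone's actual script, and the e-mail step is left out.

```python
import hashlib
import urllib.request


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw page body (network call, not used in the demo below)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fingerprint(body: bytes) -> str:
    """Return the md5 hex digest of a page body."""
    return hashlib.md5(body).hexdigest()


def has_changed(url: str, body: bytes, seen: dict) -> bool:
    """Compare the body against the digest stored for url, then store the new one.

    The first time a URL is seen there is nothing to compare with,
    so it is not reported as changed.
    """
    digest = fingerprint(body)
    previous = seen.get(url)
    seen[url] = digest
    return previous is not None and previous != digest


def missing_phrases(body: bytes, required: list) -> list:
    """Return the phrases from required that do not occur in the page text."""
    text = body.decode("utf-8", errors="replace")
    return [phrase for phrase in required if phrase not in text]


if __name__ == "__main__":
    # Hypothetical page bodies stand in for fetch() results, so no network is needed.
    seen = {}
    has_changed("http://example.com/", b"<p>hello</p>", seen)  # first visit: False
    if has_changed("http://example.com/", b"<p>hello again</p>", seen):
        print("content changed, notify the admin here")
    print(missing_phrases(b"<p>Contact us</p>", ["Contact", "Pricing"]))
```

Hashing the raw bytes flags any difference at all, including timestamps or
rotating ads, so pages with dynamic parts would probably need to be reduced
to the interesting portion before hashing. For 1000 sites the "seen" mapping
would also have to live in a file or database rather than in memory.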

Ram
On Jun 8, 2012 11:21 PM, "Bhavya" <bhavya.ma...@gmail.com> wrote:

> Thanks, everyone :) Much appreciated.
> I will work on it and let the group know how it goes.
>
> Thanks,
> Bhavya
>
> On Fri, Jun 8, 2012 at 1:06 PM, vid <v...@svaksha.com> wrote:
>
> > On Fri, Jun 8, 2012 at 4:09 PM, kracethekingmaker
> > <kracethekingma...@gmail.com> wrote:
> > >
> > >> Hello,
> > >>
> > >> I am a newbie to Python coding, and I have a question. I want to write a
> > >> script which will check for content changes in websites and send e-mail
> > >> to an admin whenever there are changes.
> > >
> > > How many times a day, or how often, will this check be performed?
> > >
> > > You should look into md5 and diff utilities for detecting changes; for
> > > web scraping, the Scrapy library is advised.
> > >
> > >> Ideally this script/program should be scalable to, say, about 1000
> > >> websites at a time.
> >
> > 1000 sites at a time? Wow, that's huge. Scraping that many sites is
> > resource intensive; you would need a nice big stable server that can
> > handle the huge data dumps. FWIW, Scrapy's feed exports only dump the
> > data to files (JSON, for example), so check out a little about the
> > database you want to use, the frontend to serve it, a queueing system
> > to scale to 1000 sites, etc. Also, some sites instantly ban scrapers.
> > Watch out for that, and good luck :)
> >
> > --
> > Regards,
> > Vid
> > ॥ http://svaksha.com ॥
> > _______________________________________________
> > BangPypers mailing list
> > BangPypers@python.org
> > http://mail.python.org/mailman/listinfo/bangpypers
> >