Below is a program I found at
http://starship.python.net/crew/neale/ (though it does
not seem to be there anymore).  It uses a separate
file for the URLs.

--- Adisegna <[EMAIL PROTECTED]> wrote:

> My question is how to use a loop to go through a
> tuple of URLs. Please feel
> free to suggest an easier way to do the same thing.
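
On the loop itself: a plain for statement walks a tuple one item at a time. Here is a minimal sketch; the urls tuple and the visit_all helper are names I made up for illustration, and the handler only measures each URL so the example runs without a network connection.

```python
# Hypothetical tuple of URLs; replace with the pages you care about.
urls = (
    "http://www.example.com/",
    "http://www.example.com/path/to/some/page.html",
)

def visit_all(urls, handler):
    """Call handler(url) for each URL in order and collect the results."""
    results = []
    for url in urls:
        results.append(handler(url))
    return results

# The handler here is just len(); a real one would fetch the page.
lengths = visit_all(urls, len)
```

Swapping len for a function that downloads the page gives you one result per URL, in the same order as the tuple.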


#!/usr/bin/env python

"""watch.py -- Web site change notification tool
Author: Neale Pickett <[EMAIL PROTECTED]>
Time-stamp: <2003-01-24 13:52:13 neale>

This is something you can run from a cron job to notify you of changes
to a web site.  You just set up a ~/.watchrc file, and run watch.py
from cron.  It mails you when a page has changed.

I use this to check for new software releases on sites that just change
web pages; my wife uses it to check pages for classes she's in.

You'll want a ~/.watchrc that looks something like this:

    to: [EMAIL PROTECTED]
    http://www.example.com/path/to/some/page.html

The 'to:' line tells watch.py where to send change notification email.
You can also specify 'from:' for an address the message should come from
(defaults to whatever to: is), and 'host:' for what SMTP server to send
the message through (defaults to localhost).

When watch.py checks a URL for the first time, it will send you a
message (so you know it's working) and write some funny characters after
the URL in the .watchrc file.  This is normal--watch.py uses these
characters to remember what the page looked like the last time it
checked.

"""

import os.path
import urllib2 as urllib
import sha
import smtplib

rc = '~/.watchrc'
host = 'localhost'
fromaddr = None
toaddr = None

def hash(data):
    return sha.new(data).hexdigest()

def notify(url):
    msg = """From: URL Watcher <%(from)s>
To: %(to)s
Subject: %(url)s changed

%(url)s has changed!
""" % {'from': fromaddr,
       'to':   toaddr,
       'url':  url}
    s = smtplib.SMTP(host)
    s.sendmail(fromaddr, toaddr, msg)
    s.quit()

fn = os.path.expanduser(rc)

f = open(fn)
outlines = []
for line in f:
    if line[0] == '#':
        continue

    line = line.strip()
    if not line:
        continue
    
    splits = line.split(' ', 1)
    url = splits[0]
    if url == 'from:':
        fromaddr = splits[1]
    elif url == 'to:':
        toaddr = splits[1]
        if not fromaddr:
            fromaddr = toaddr
    elif url == 'host:':
        host = splits[1]
    else:
        if (not fromaddr) or (not toaddr):
            raise ValueError("must set to: before any urls")
        page = urllib.urlopen(url).read()
        ph = hash(page)

        try:
            h = splits[1]
        except IndexError:
            h = None
        if h != ph:
            notify(url)
        line = '%s %s' % (url, ph)
    outlines.append(line)

f.close()

f = open(fn, 'w')
f.write('\n'.join(outlines) + '\n')
f.close()
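
Note that the script above is Python 2: urllib2 and the sha module no longer exist in Python 3. A rough Python 3 sketch of just the check-for-changes loop might look like the following. The names page_hash, fetch, check and the seen dictionary are my own, not part of watch.py, and this version keeps the digests in a dictionary instead of rewriting ~/.watchrc.

```python
import hashlib
import urllib.request

def page_hash(data):
    """Return the hex SHA-1 digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()

def fetch(url):
    """Download a page body as bytes (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()

def check(urls, seen, fetch=fetch):
    """Return (changed, new_seen) for a tuple of URLs.

    seen maps url -> digest from the previous run.  A URL missing from
    seen counts as changed, like the first-run message in watch.py.
    """
    changed = []
    new_seen = {}
    for url in urls:
        digest = page_hash(fetch(url))
        if seen.get(url) != digest:
            changed.append(url)
        new_seen[url] = digest
    return changed, new_seen
```

You would call check(urls, {}) on the first run, save the returned dictionary somewhere, and pass it back in next time. The fetch parameter is there so the loop can be tried with a stand-in function before pointing it at real sites.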



                