problem with my regex?

Brian Mon, 22 May 2006 14:21:49 -0700

I have a simple script below that is causing me some problems and I am
having a hard time tracking them down.  Here is the code:


import urllib
import re
import sys

def getPicLinks():
    found = []
    try:
        page = urllib.urlopen("http://continuouswave.com/whaler/cetacea/")
    except:
        print "ERROR READING PAGE."
        sys.exit()
    page1 = page.read()
    cetLinks = re.compile("cetaceaPage..\.html", page1)
    for line in page1:
        found.append(cetLinks.findall(line))
    print found

This is the error message:
"/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/sre_parse.py", line 396, in _parse
    if state.flags & SRE_FLAG_VERBOSE:
TypeError: unsupported operand type(s) for &: 'str' and 'int'

I am trying to extract the links on a web page that have a similar
pattern.  Here is an example of the html source:

<HR>
<P><SMALL><A HREF="photoLog.html">PHOTO-LOG</A><br>
<A HREF="guide.html">How-To-Submit</A><BR><A
HREF="cetaceaPage01.html">01</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage02.html">02</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage03.html">03</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage04.html">04</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage05.html">05</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage06.html">06</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage07.html">07</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage08.html">08</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage09.html">09</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage10.html">10</A>
<BR><A>
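
For reference, here is a minimal sketch of the extraction I am after, run
against a shortened copy of the sample above instead of the live page.  The
names SAMPLE_HTML, CET_LINK and find_cet_links are made up for illustration,
and the pattern assumes the page number is always two digits, as it is in the
sample.  It is only a sketch under those assumptions, not a tested solution
for the real site:

```python
import re

# Shortened copy of the sample markup quoted above (not the full page).
SAMPLE_HTML = '''<A HREF="guide.html">How-To-Submit</A><BR><A
HREF="cetaceaPage01.html">01</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage02.html">02</A>&nbsp;|&nbsp;<A
HREF="cetaceaPage10.html">10</A>'''

# Assumption: page numbers are always two digits, as in the sample.
# Only the pattern is passed to re.compile().
CET_LINK = re.compile(r"cetaceaPage\d\d\.html")


def find_cet_links(html):
    """Return every cetaceaPageNN.html name found in the given text."""
    # findall() scans the whole string at once, so the text does not
    # have to be split into lines (or characters) first.
    return CET_LINK.findall(html)


print(find_cet_links(SAMPLE_HTML))
```

Searching the whole string in one call also avoids looping over the result
of page.read(), which is a single string and would be walked one character
at a time.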

My problem is that I can't figure out what is going wrong here, mostly
because I am confused by the error message: it points to a file
(presumably part of re) that I am unfamiliar with, and I am fairly new
to Python.
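
One thing that may be relevant to the message: the second parameter of
re.compile() is documented as a flags value (for example re.IGNORECASE), and
the traceback fails on "state.flags & ...", a bitwise test on those flags.
The sketch below is written for a current Python interpreter rather than 2.4,
uses a made-up stand-in string in place of page.read(), and simply checks
whether passing text in the flags position raises a TypeError.  The names
PATTERN, page_text and compile_with_text_as_flags are hypothetical:

```python
import re

PATTERN = r"cetaceaPage..\.html"

# Stand-in for the text returned by page.read(); not the real page.
page_text = '<A HREF="cetaceaPage01.html">01</A>'


def compile_with_text_as_flags():
    """Return the name of the exception raised when text is used as flags."""
    try:
        # Same call shape as in the script: page text in the flags slot.
        re.compile(PATTERN, page_text)
    except TypeError as exc:
        return type(exc).__name__
    return None


print(compile_with_text_as_flags())

# Compiling with the pattern alone and handing the text to findall().
print(re.compile(PATTERN).findall(page_text))
```

If that reading is right, the text to search belongs in the call to
findall() (or search()), not in the call to re.compile().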

Any help is greatly appreciated, as is your patience.

Brian

-- 
http://mail.python.org/mailman/listinfo/python-list