On May 21, 6:58 pm, [EMAIL PROTECTED] wrote:
> It's not a variable I set, it's one of HTMLParser's inbuilt variables. I
> am using it with urlopen to get the source of a website and feed it to
> HTMLParser.
>
> def parse(self, page):
>     try:
>         self.feed(urlopen('http://' + page).read())
>     except HTTPError:
>         print 'Error getting page source'
>
> This is the code I am using. I have tested the other modules and they
> work fine, but I haven't got a clue how to fix this one.

You're not providing enough information. Try to post a minimal code
fragment that demonstrates your error; it gives us all a common basis
for discussion.

Is your Spider class a subclass of HTMLParser? Is it overriding
__init__? If so, is it calling the base class initializer, with
something like:

    super(Spider, self).__init__()

If it isn't, and this is your issue, then looking at the HTMLParser
code you could get away with just doing the following in __init__:

    self.reset()

This appears to be the method that adds the .rawdata attribute.

Ideally, you should use the super() form shown first; you're less
reliant on the implementation of HTMLParser that way.

- alex23
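
For illustration, here is a minimal sketch of the failure being guessed at. It assumes Python 3, where the class lives in html.parser and super() takes no arguments (the code in this thread is Python 2). BrokenSpider, FixedSpider, feed_error, PAGE and the links attribute are made-up names for the example, not anything from the original poster's code.

```python
from html.parser import HTMLParser


class BrokenSpider(HTMLParser):
    def __init__(self):
        # The base initializer is never called, so the parser's internal
        # state (including .rawdata) is never created.
        self.links = []


class FixedSpider(HTMLParser):
    def __init__(self):
        # Let HTMLParser set up its own state before adding ours.
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Collect the href of every <a> tag seen so far.
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


def feed_error(parser, html):
    """Feed html to parser; return the AttributeError message, or None."""
    try:
        parser.feed(html)
    except AttributeError as exc:
        return str(exc)
    return None


PAGE = '<a href="http://example.com/">example</a>'

broken_error = feed_error(BrokenSpider(), PAGE)  # message mentions 'rawdata'
fixed = FixedSpider()
fixed_error = feed_error(fixed, PAGE)            # None: parsing succeeded
```

If the real Spider class looks like BrokenSpider, adding the call to the base class initializer should be enough to make feed() work.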