On May 22, 8:18 am, [EMAIL PROTECTED] wrote:
> Sorry, im new to both python and newsgroups, this is all pretty
> confusing. So I need a line in my __init__ function of my class? The
> spider class I made inherits from HTMLParser. Its just using the
> feed() function that produces errors though, the rest seems to work
> fine.

Let me repeat: it would make this a lot easier if you would paste
actual code.

As you say, your Spider class inherits from HTMLParser, so you need to
make sure you set it up so that the HTMLParser functionality you've
inherited works correctly (or works the way you want it to). If you've
added your own __init__ to Spider, then the __init__ on HTMLParser is
no longer called unless you *explicitly* call it yourself.

Unfortunately, my earlier advice wasn't totally correct: HTMLParser is
an old-style class, and super() only works with new-style classes. (If
you don't know about old- vs new-style classes, see
http://docs.python.org/ref/node33.html.) So there are a couple of
approaches that should work for you:

    class SpiderBroken(HTMLParser):
        def __init__(self):
            pass  # don't do any ancestral setup

    class SpiderOldStyle(HTMLParser):
        def __init__(self):
            HTMLParser.__init__(self)

    class SpiderNewStyle(HTMLParser, object):
        def __init__(self):
            super(SpiderNewStyle, self).__init__()

Here is what happens with each of them:

    Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> html = open('temp.html','r').read()
    >>> from spider import *
    >>> sb = SpiderBroken()
    >>> sb.feed(html)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python25\lib\HTMLParser.py", line 107, in feed
        self.rawdata = self.rawdata + data
    AttributeError: SpiderBroken instance has no attribute 'rawdata'
    >>> so = SpiderOldStyle()
    >>> so.feed(html)
    >>> sn = SpiderNewStyle()
    >>> sn.feed(html)
    >>>

SpiderBroken fails because its __init__ never runs HTMLParser's setup,
so the rawdata attribute that feed() relies on is never created. The
old-style version is probably easiest, so putting this line in your
__init__ should fix your issue:

    HTMLParser.__init__(self)

If this still isn't clear, please let me know.

- alex23
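
P.S. In case a fuller example helps, here is a minimal sketch of a
spider that calls the parent __init__ explicitly before doing its own
setup. I haven't seen your code, so the links attribute, the href
handling in handle_starttag and the example URL are all made up for
illustration. The try/except around the import is only there so the
same file also runs on Python 3, where the class lives in html.parser.

```python
try:
    from HTMLParser import HTMLParser    # Python 2, as used in this thread
except ImportError:
    from html.parser import HTMLParser   # Python 3 location of the same class


class Spider(HTMLParser):
    def __init__(self):
        # Run the parent's setup first; this is what creates self.rawdata,
        # the attribute feed() complained about in the traceback.
        HTMLParser.__init__(self)
        # Hypothetical attribute, for illustration only.
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)


spider = Spider()
spider.feed('<html><body><a href="http://example.com/">x</a></body></html>')
print(spider.links)
```

With that one anchor tag as input, this should print
['http://example.com/']. If you comment out the HTMLParser.__init__(self)
line, feed() should fail with the same kind of AttributeError as
SpiderBroken does.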