On 21 Aug, 18:36, [EMAIL PROTECTED] (John J. Lee) wrote:
Gabriel Genellina <[EMAIL PROTECTED]> writes:
[...]
> Don't even try to understand it - it's a mess. Use the HTMLParser
> module instead.
[...]
Module sgmllib (and therefore module htmllib also) is more tolerant of
bad HTML than module HTMLParser.
John
[EMAIL PROTECTED] wrote:
> I personally think the application itself "feels" more complicated
> than it needs to be, but it's possible that is just my inexperience. I'm
> going to do some reading about the HTMLParser module. I'm sure I
> could make this spider a bit more functional in the process.
Those responses were both very helpful. John's additional type
checking is straightforward and easy to implement. I will also
rewrite the application a second time using the class Gabriel
offered. Both of these suggestions will help me gain some insight into
how Python works.
On 20 Aug, 15:44, "[EMAIL PROTECTED]"
<[EMAIL PROTECTED]> wrote:
[...]
> --
> f = formatter.AbstractFormatter(formatter.DumbWriter(StringIO()))
> parser = htmllib.HTMLParser(f)
> parser.feed(html)
> parser.close()
> return parser.anchorlist
> -
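For comparison, here is a rough sketch of the same link extraction done with the HTMLParser module that was recommended in this thread. It assumes Python 3, where that module lives at html.parser and where htmllib and sgmllib no longer exist. The names AnchorCollector and get_links are made up for illustration and are not taken from anyone's posted code.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        # Hypothetical attribute, named after htmllib's anchorlist
        # so the calling code looks familiar.
        self.anchorlist = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased;
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.anchorlist.append(value)


def get_links(html):
    """Return the list of href values found in an HTML string."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.anchorlist
```

Calling get_links(html) on a page's text should then give back a plain list of the href strings, with no formatter or writer objects needed.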
I am reading "Python for Dummies" and found the following example of a
web crawler that I thought was interesting. The first time I keyed
in the program and ran it, I didn't understand it well enough to
debug it, so I just skipped it. A few days later I realized that it
failed after a few second