John Nagle wrote:
> Note what happens when a bad declaration is found.
> SGMLParser.parse_declaration raises SGMLParseError, and the exception
> handler just sucks up the rest of the input (note that "rawdata[i:]"),
> treats it as unparsed data, and advances the position to the end of input
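
To make the behaviour described above concrete, here is a minimal, self-contained sketch. It does not use the real sgmllib or BeautifulSoup code: DeclarationError, strict_parse_declaration, swallow_on_error and skip_on_error are all hypothetical names invented for illustration, and the "strict" rule is a toy stand-in for whatever the real parser rejects. It only shows why handing rawdata[i:] to the data handler loses the rest of the page, and what a more forgiving recovery might look like.

```python
class DeclarationError(Exception):
    """Stand-in for SGMLParseError (hypothetical, for this sketch only)."""


def strict_parse_declaration(rawdata, i):
    """Toy strict rule: '<!' ... '>' with no stray '<' inside.

    Returns the index just past the closing '>'.
    """
    end = rawdata.find(">", i)
    if end == -1 or "<" in rawdata[i + 2:end]:
        raise DeclarationError("bad declaration at offset %d" % i)
    return end + 1


def swallow_on_error(rawdata, i, handle_data):
    """Recovery as described in the thread: on failure, treat everything
    from i onward as unparsed text and jump to the end of input."""
    try:
        return strict_parse_declaration(rawdata, i)
    except DeclarationError:
        rest = rawdata[i:]
        handle_data(rest)
        return i + len(rest)


def skip_on_error(rawdata, i, handle_data):
    """One possible alternative (an assumption, not the library's fix):
    emit only the '<!' as text and resume parsing right after it."""
    try:
        return strict_parse_declaration(rawdata, i)
    except DeclarationError:
        handle_data(rawdata[i:i + 2])
        return i + 2


page = "<! bad < decl ><p>hello</p>"

swallowed = []
end_pos = swallow_on_error(page, 0, swallowed.append)
# end_pos is len(page): nothing after the bad declaration gets parsed.

skipped = []
resume_pos = skip_on_error(page, 0, skipped.append)
# resume_pos is 2: the parser can still reach the <p> element later.
```

With the first strategy a single malformed declaration near the top of a page turns the whole document into one text chunk, which matches the symptom reported in this thread. The second strategy is only one of several ways a parser could resynchronise.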
Robert Kern wrote:
> Carl Banks wrote:
>
>>On Apr 4, 4:55 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
>>
>>>Carl Banks wrote:
>>>
On Apr 4, 2:43 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
>Carl Banks wrote:
>
>>On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
>>
>>>
Carl Banks wrote:
> On Apr 4, 4:55 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
>> Carl Banks wrote:
>>> On Apr 4, 2:43 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
Carl Banks wrote:
> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
>> BeautifulSoup can't parse this page usefully
On Apr 4, 4:55 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
> Carl Banks wrote:
> > On Apr 4, 2:43 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
> >> Carl Banks wrote:
> >>> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
> BeautifulSoup can't parse this page usefully at all.
> It tre
Carl Banks wrote:
> On Apr 4, 2:43 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
>> Carl Banks wrote:
>>> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
BeautifulSoup can't parse this page usefully at all.
It treats the entire page as a text chunk. It's actually
HTMLParser th
John Nagle wrote:
> The syntax that browsers understand as HTML comments is much less
> restrictive than what BeautifulSoup understands. I keep running into
> sites with formally incorrect HTML comments which are parsed happily
> by browsers. Here's yet another example, this one from
> "http://ww
On Apr 4, 2:43 pm, Robert Kern <[EMAIL PROTECTED]> wrote:
> Carl Banks wrote:
> > On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
> >> BeautifulSoup can't parse this page usefully at all.
> >> It treats the entire page as a text chunk. It's actually
> >> HTMLParser that parses comments, s
Carl Banks wrote:
> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
>> The syntax that browsers understand as HTML comments is much less
>> restrictive than what BeautifulSoup understands. I keep running into
>> sites with formally incorrect HTML comments which are parsed happily
>> b
Carl Banks wrote:
> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
> > The syntax that browsers understand as HTML comments is much less
> > restrictive than what BeautifulSoup understands. I keep running into
> > sites with formally incorrect HTML comments which are parsed happily
Carl Banks wrote:
> On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
>> BeautifulSoup can't parse this page usefully at all.
>> It treats the entire page as a text chunk. It's actually
>> HTMLParser that parses comments, so this is really an HTMLParser
>> level problem.
>
> Google for a
On Apr 4, 2:08 pm, John Nagle <[EMAIL PROTECTED]> wrote:
> The syntax that browsers understand as HTML comments is much less
> restrictive than what BeautifulSoup understands. I keep running into
> sites with formally incorrect HTML comments which are parsed happily
> by browsers. Here's yet
The syntax that browsers understand as HTML comments is much less
restrictive than what BeautifulSoup understands. I keep running into
sites with formally incorrect HTML comments which are parsed happily
by browsers. Here's yet another example, this one from
"http://www.webdirectory.com". T