Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread John Nagle
Duncan Booth wrote:
> John Nagle <[EMAIL PROTECTED]> wrote:
>
>>Strictly speaking, it's Microsoft's fault.
>>
>> title="". So all that following stuff is from what
>>follows the next "-->" which terminates a comment.
>
> It is an attribute value, and unescaped angle brackets are valid

Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread Paul McGuire
On Mar 29, 1:50 am, John Nagle <[EMAIL PROTECTED]> wrote:
> Here's a construct with which BeautifulSoup has problems. It's
> from "http://support.microsoft.com/contactussupport/?ws=support".
>
> This is the original:
>
> http://www.microsoft.com/usability/enroll.mspx"
> id="L_75998"
>

Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread Duncan Booth
"Justin Ezequiel" <[EMAIL PROTECTED]> wrote:
> On Mar 29, 4:08 pm, Duncan Booth <[EMAIL PROTECTED]> wrote:
>> John Nagle <[EMAIL PROTECTED]> wrote:
>> > title="

Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread Justin Ezequiel
On Mar 29, 6:11 pm, "Justin Ezequiel" <[EMAIL PROTECTED]> wrote:
>
> FWIW, see http://tinyurl.com/yjtzjz
>
hmm. not quite right. http://tinyurl.com/ynv4ct or http://www.crummy.com/software/BeautifulSoup/documentation.html#Customizing%20the%20Parser

Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread Justin Ezequiel
On Mar 29, 4:08 pm, Duncan Booth <[EMAIL PROTECTED]> wrote:
> John Nagle <[EMAIL PROTECTED]> wrote:
> > title="

Re: BeautifulSoup vs. Microsoft

2007-03-29 Thread Duncan Booth
John Nagle <[EMAIL PROTECTED]> wrote:
> Strictly speaking, it's Microsoft's fault.
>
> title="". So all that following stuff is from what
> follows the next "-->" which terminates a comment.

It is an attribute value, and unescaped angle brackets are valid in attributes. It looks to me lik
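
The point about attribute values can be checked with a small script. This is only a sketch: it uses the standard library's html.parser rather than BeautifulSoup, and the sample markup, the AttrCollector class, and the collect_attrs helper are all made up for illustration (the Microsoft markup from the original post is not reproduced here). It shows one parser treating comment-like text inside a quoted attribute as plain attribute data; it says nothing about what any particular BeautifulSoup release does.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records each start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.seen.append((tag, dict(attrs)))


def collect_attrs(markup):
    """Parse markup and return a list of (tag, attribute dict) tuples."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Assumed sample: a quoted title attribute whose value looks like a comment.
sample = '<a href="http://example.invalid/" id="x1" title="<!-- note -->">link</a>'

for tag, attrs in collect_attrs(sample):
    print(tag, attrs)
```

If the title value comes back intact, the parser has read the angle brackets as part of the attribute and not as the start of a comment.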

BeautifulSoup vs. Microsoft

2007-03-28 Thread John Nagle
Here's a construct with which BeautifulSoup has problems. It's from "http://support.microsoft.com/contactussupport/?ws=support". This is the original: http://www.microsoft.com/usability/enroll.mspx" id="L_75998" title="". So all that following stuff is from what follows the next "-