Re: Output of HTML parsing

2007-06-19 Thread Stefan Behnel
Jackie wrote:
> On June 15, 2:01, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> Jackie wrote:
>> import lxml.etree as et
>> url = "http://www.economics.utoronto.ca/index.php/index/person/faculty/"
>> tree = et.parse(url)
>>
>> Stefan
>
> Thank you. But when I t

Re: Output of HTML parsing

2007-06-19 Thread Jackie
On June 15, 2:01, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> Jackie wrote:
> import lxml.etree as et
> url = "http://www.economics.utoronto.ca/index.php/index/person/faculty/"
> tree = et.parse(url)
>
> Stefan
Thank you. But when I tried to run the above part, the fo

Re: Output of HTML parsing

2007-06-15 Thread Stefan Behnel
Jackie wrote:
> I want to get the information of the professors (name, title) from the
> following link:
>
> "http://www.economics.utoronto.ca/index.php/index/person/faculty/"
That's even XHTML, no need to go through BeautifulSoup. Use lxml instead.
http://codespeak.net/lxml
> Ideally, I'd lik
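
Since the page is XHTML, an XML parser can read it directly. The following is only a minimal sketch: the markup is an invented fragment, not the actual faculty page, and it uses the standard library's `xml.etree.ElementTree` so that it runs without extra packages (`lxml.etree` provides the same `fromstring` and `iter` calls). One detail worth knowing is that XHTML elements live in the `http://www.w3.org/1999/xhtml` namespace, so tag names have to be namespace-qualified when searching.

```python
import xml.etree.ElementTree as et

# Namespace prefix in ElementTree's "{uri}tag" notation.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical sample markup, NOT the real faculty page.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <ul>
      <li><a href="/a">Alice Example</a></li>
      <li><a href="/b">Bob Example</a></li>
    </ul>
  </body>
</html>"""

root = et.fromstring(SAMPLE)

# Collect the text of every <a> element, wherever it appears in the tree.
names = [a.text for a in root.iter(XHTML_NS + "a")]
print(names)
```

Searching for a plain `"a"` here would find nothing, because the parser stores each tag together with its namespace.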

Re: Output of HTML parsing

2007-06-15 Thread Sebastian Wiesner
[ Jackie <[EMAIL PROTECTED]> ]
> 1. The code above assumes that each Prof has a title. If any one of them
> does not, the name and title will be mismatched. How to program to
> allow that title can be empty?
>
> 2. Is there any easier way to get the data I want other than using
> list?
Use BeautifulS
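
One way to keep names and titles from drifting apart is to walk the page one professor at a time and look up the title inside that professor's own element, rather than building two separate lists and pairing them by position. The sketch below rests on assumptions: the markup is invented for illustration (the real page structure is not shown in this thread), the class names `person`, `name`, and `title` are hypothetical, and the XHTML namespace is left out for brevity. It uses `xml.etree.ElementTree` from the standard library; `lxml.etree` supports the same `findall` and `findtext` calls.

```python
import xml.etree.ElementTree as et

# Hypothetical markup: the second entry deliberately has no title element.
SAMPLE = """<div>
  <div class="person"><span class="name">Alice Example</span><span class="title">Professor</span></div>
  <div class="person"><span class="name">Bob Example</span></div>
</div>"""


def extract_people(xml_text):
    """Return a list of (name, title) tuples; title is "" when missing."""
    root = et.fromstring(xml_text)
    people = []
    # Visit each professor's container element in document order.
    for person in root.findall(".//div[@class='person']"):
        # findtext returns the default when no matching child exists,
        # so a missing title cannot shift later entries out of step.
        name = person.findtext("span[@class='name']", default="")
        title = person.findtext("span[@class='title']", default="")
        people.append((name.strip(), title.strip()))
    return people


people = extract_people(SAMPLE)
for name, title in people:
    print(name, "|", title or "(no title)")
```

Because each title is looked up relative to its own container, an entry without a title simply yields an empty string and the remaining pairs stay aligned.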