Hi anonymous, your code is working correctly. The only element on the page that matches //div[@class="col f-cb"] is this one:

<div class="col f-cb">
  <div class="name s-fc3 f-fl">名称</div>
  <div class="down s-fc3 f-fl">视频下载</div>
  <div class="desc s-fc3 f-fl">课程简介</div>
</div>

(The three labels read "Name", "Video download" and "Course description".) There is no <a> anywhere inside it, so the longer XPath has nothing to match.

Cheers,
Philipp

On 02/22/2013 02:24 AM, python wrote:
> I am having issues with the urllib and lxml.html modules.
>
> Here is my original code:
>
> import urllib
> import lxml.html
> down='http://v.163.com/special/visualizingdata/'
> file=urllib.urlopen(down).read()
> root=lxml.html.document_fromstring(file)
> xpath_str="//div[@class='down s-fc3 f-fl']/a"
> urllist=root.xpath(xpath_str)
> for url in urllist:
>     print url.get("href")
>
> When run, it returns this output:
>
> http://mov.bn.netease.com/movieMP4/2012/12/A/7/S8H1TH9A7.mp4
> http://mov.bn.netease.com/movieMP4/2012/12/D/9/S8H1ULCD9.mp4
> http://mov.bn.netease.com/movieMP4/2012/12/4/P/S8H1UUH4P.mp4
> http://mov.bn.netease.com/movieMP4/2012/12/B/V/S8H1V8RBV.mp4
> http://mov.bn.netease.com/movieMP4/2012/12/6/E/S8H1VIF6E.mp4
> http://mov.bn.netease.com/movieMP4/2012/12/B/G/S8H1VQ2BG.mp4
>
> But when I change the line
>
> xpath_str='//div[@class="down s-fc3 f-fl"]//a'
>
> into
>
> xpath_str='//div[@class="col f-cb"]//div[@class="down s-fc3 f-fl"]//a'
>
> that is to say,
>
> urllist=root.xpath('//div[@class="col f-cb"]//div[@class="down s-fc3 f-fl"]//a')
>
> I do not receive any output. What is the flaw in this code?
> It is so strange that the shorter one works and the longer one does not;
> they have the same XPath structure!
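
For anyone who wants to reproduce the effect without fetching the page, here is a minimal sketch. It rests on several assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml (its XPath subset is enough for these expressions, but paths must start with "."), the markup is a hand-written sample in which only the first <div> comes from this thread, and the "item f-cb" class and the example.invalid URL are made up for illustration. The real page may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the page. Only the first <div class="col f-cb">
# block is taken from this thread; the "item f-cb" class and the
# example.invalid URL are hypothetical.
SAMPLE = """
<body>
  <div class="col f-cb">
    <div class="name s-fc3 f-fl">名称</div>
    <div class="down s-fc3 f-fl">视频下载</div>
    <div class="desc s-fc3 f-fl">课程简介</div>
  </div>
  <div class="item f-cb">
    <div class="down s-fc3 f-fl"><a href="http://example.invalid/1.mp4">mp4</a></div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)

# The shorter expression looks for the "down" cell anywhere in the tree.
short_xpath = ".//div[@class='down s-fc3 f-fl']//a"
# The longer one additionally requires an ancestor whose class attribute
# is exactly the string "col f-cb".
long_xpath = ".//div[@class='col f-cb']//div[@class='down s-fc3 f-fl']//a"

short_hits = [a.get("href") for a in root.findall(short_xpath)]
long_hits = [a.get("href") for a in root.findall(long_xpath)]
outer_divs = root.findall(".//div[@class='col f-cb']")

print(short_hits)       # links found by the shorter expression
print(long_hits)        # links found by the longer expression
print(len(outer_divs))  # how many elements the outer predicate matches
```

In this sample the outer predicate matches only the header block, which contains no link, so the longer expression comes back empty while the shorter one still finds the link. A quick way to check the same thing against the real page is to run the outer part of the expression on its own and look at what it returns before appending the rest.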
-- http://mail.python.org/mailman/listinfo/python-list