On Oct 14, 3:19 am, Roy Smith <r...@panix.com> wrote:
> I've got to write some tests in python which simulate getting a page of
> HTML from an http server, finding a link, clicking on it, and then
> examining the HTML on the next page to make sure it has certain features.
>
> I can use urllib to do the basic fetching, and lxml gives me the tools
> to find the link I want and extract its href attribute. What's missing
> is dealing with turning the href into an absolute URL that I can give to
> urlopen(). Browsers implement all sorts of stateful logic such as "if
> the URL has no hostname, use the same hostname as the current page".
> I'm talking about something where I can execute this sequence of calls:
>
> urlopen("http://foo.com:9999/bar")
> urlopen("/baz")
>
> and have the second one know that it needs to get
> "http://foo.com:9999/baz". Does anything like that exist?
>
> I'm really trying to stay away from Selenium and go strictly with
> something I can run under unittest.
lxml.html.make_links_absolute() ?
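
If the goal is only the "remember the current page" behaviour from the question, the standard library's urljoin() may already be enough. Below is a minimal sketch, assuming Python 3 (where urljoin lives in urllib.parse; in Python 2 it is urlparse.urljoin). The UrlTracker class and its resolve() method are hypothetical names made up for illustration. The sketch does no fetching; it only tracks the last URL and resolves each new href against it.

```python
from urllib.parse import urljoin


class UrlTracker:
    """Hypothetical helper: remembers the current page URL, like a browser."""

    def __init__(self):
        self.current = None  # URL of the most recently resolved page

    def resolve(self, href):
        # On the first call there is no base yet, so href is assumed
        # to be absolute already; later calls resolve against the
        # previously resolved URL.
        if self.current is not None:
            href = urljoin(self.current, href)
        self.current = href
        return href


tracker = UrlTracker()
first = tracker.resolve("http://foo.com:9999/bar")
second = tracker.resolve("/baz")
print(first)
print(second)
```

The string returned by resolve() is what you would hand to urlopen() in a test, so the two calls from the question become urlopen(tracker.resolve(...)).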