"Robert Brewer" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] Andreas Volz wrote: > I try to extract a http target from a URL that is given as parameter. > urlparse couldn't really help me. I tried it like this > > url="http://www.example.com/example.html?url=http://www.exampl > e.org/exa > mple.html" > > p = re.compile( '.*url=') > url = p.sub( '', url) > print url > > http://www.example.org/example.html > > This works, but if there're more parameters it doesn't work: > > url2="http://www.example.com/example.html?url=http://www.examp > le.org/exa > mple.html¶m=1" > > p = re.compile( '.*url=') > url2 = p.sub( '', url2) > print url2 > > http://www.example.org/example.html¶m=1 > > I played with regex to find one that matches also second case with > multible parameters. I think it's easy, but I don't know how > to do. Can you help me?
I'd go back to urlparse if I were you.

>>> import urlparse
>>> url="http://www.example.com/example.html?url=http://www.example.org/example.html"
>>> urlparse.urlparse(url)
('http', 'www.example.com', '/example.html', '', 'url=http://www.example.org/example.html', '')
>>> query = urlparse.urlparse(url)[4]
>>> params = [p.split("=", 1) for p in query.split("&")]
>>> params
[['url', 'http://www.example.org/example.html']]
>>> urlparse.urlparse(params[0][1])
('http', 'www.example.org', '/example.html', '', '', '')

<< Added by Paul>>

Robert Brewer's params list comprehension may be a bit much to swallow
all at once for someone new to Python, but it is a very slick example,
and it works for multiple parameters.

    [p.split("=", 1) for p in query.split("&")]

First of all, the variable query comes from urlparse and contains
everything in the original url after the '?' mark.

Now the list comprehension contains 'query.split("&")'. This returns a
list of strings, one for each of the individual parameter assignments.
'for p in query.split("&")' iterates over this list, with the temporary
variable 'p' representing each individual parameter in turn. For
example,

    [p for p in query.split("&")]

is sort of a nonsense list comprehension: it just builds a list from
the list returned from query.split("&"). But instead, Robert splits
each 'p' at its first equals sign (the 1 in p.split("=", 1) limits it
to one split, so a value that itself contains an '=' stays in one
piece). For each parameter we get a 2-element list: the parameter
name and its assigned value.

Using a list comprehension does all of this iteration and list building
in one single, compact statement. A long spelled-out version would look
like:

    allparams = query.split("&")
    params = []
    for p in allparams:
        params.append( p.split("=",1) )

Now if we make a slight change to Robert Brewer's "params = [p.split..."
line, and construct a dictionary using dict():

    params = dict( [p.split("=", 1) for p in query.split("&")] )

this will create a dictionary for you (the dict() constructor will
accept a list of pairs, and interpret them as key-value entries into
the dictionary). Then you can reference the params by name. Here's
the example, with more than one param in the url.

>>> url="http://www.example.com/example.html?url=http://www.example.org/example.html&url2=http://www.xyzzy.net/zork.html"
>>> print urlparse.urlparse(url)
('http', 'www.example.com', '/example.html', '', 'url=http://www.example.org/example.html&url2=http://www.xyzzy.net/zork.html', '')
>>> query = urlparse.urlparse(url)[4]
>>> params = dict([p.split("=", 1) for p in query.split("&")])
>>> print params
{'url': 'http://www.example.org/example.html', 'url2': 'http://www.xyzzy.net/zork.html'}
>>> print params.keys()
['url', 'url2']
>>> print params['url']
http://www.example.org/example.html
>>> print params['url2']
http://www.xyzzy.net/zork.html

List comprehensions are another powerful tool to put in your Python
toolbox. Keep pluggin' away, Andreas!

-- Paul

--
http://mail.python.org/mailman/listinfo/python-list
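A follow-up sketch for readers on Python 3, where the urlparse module
used above lives at urllib.parse. It is not from Robert's or Paul's
posts: the helper name query_params and the guard that skips pieces
without an '=' are assumptions added for illustration. It repeats the
dict() approach from the thread and then shows the standard library's
parse_qs, which does similar splitting but returns a list of values for
each name.

```python
from urllib.parse import urlsplit, parse_qs


def query_params(url):
    """Return the query parameters of url as a dict of name -> value.

    Hypothetical helper mirroring the dict() version from the thread:
    split the query on "&", then split each piece once on "=".
    """
    query = urlsplit(url).query
    # Assumption: pieces without an "=" (including an empty query) are
    # skipped, so dict() always receives 2-element pairs.
    return dict(p.split("=", 1) for p in query.split("&") if "=" in p)


url = ("http://www.example.com/example.html"
       "?url=http://www.example.org/example.html"
       "&url2=http://www.xyzzy.net/zork.html")

params = query_params(url)
print(params["url"])   # http://www.example.org/example.html
print(params["url2"])  # http://www.xyzzy.net/zork.html

# parse_qs does the splitting itself and maps each name to a list of
# values, because a name may appear more than once in a query string.
print(parse_qs(urlsplit(url).query))
```

parse_qs also decodes percent-escapes in names and values, so it is
probably the safer choice when the embedded URL may arrive URL-encoded.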