"Ben Wilson" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> I am working on a script that splits a URL into a page and a url. The
> examples below are the conditions I expect a user to pass to the
> script. In all cases, "http://www.example.org/test/" is the URL, and
> the page comprises parts that have upper case letters (note, 5 & 6 are
> the same as earlier examples, sans the 'test').
>
> 1. http://www.example.org/test/Main/AnotherPage (page = Main/AnotherPage)
> 2. http://www.example.org/test/Main (page = Main + '/' + default_page)
> 3. http://www.example.org/test (page = default_group + '/' + default_page)
> 4. http://www.example.org/test/ (page = default_group + '/' + default_page)
> 5. http://www.example.org/ (page = default_group + '/' + default_page)
> 6. http://www.example.org/Main/AnotherPage (page = Main/AnotherPage)
>
> Right now, I'm doing a simple split off condition 1:
>
> page = '.'.join(in.split('/')[-2:])
> url = '/'.join(in.split('/')[:-2]) + '/'
>
> Before I start winding my way down a complex path, I wanted to see if
> anybody had an elegant approach to this problem.
>
> Thanks in advance.
> Ben
>

Standard Python includes urlparse. Possible help?

-- Paul

import urlparse

urls = [
    "http://www.example.org/test/Main/AnotherPage",  # (page = Main/AnotherPage)
    "http://www.example.org/test/Main",  # (page = Main + '/' + default_page)
    "http://www.example.org/test",  # (page = default_group + '/' + default_page)
    "http://www.example.org/test/",  # (page = default_group + '/' + default_page)
    "http://www.example.org/",  # (page = default_group + '/' + default_page)
    "http://www.example.org",  # no trailing slash; included in the output below
    "http://www.example.org/Main/AnotherPage",
    ]

for u in urls:
    print u
    parts = urlparse.urlparse(u)
    print parts
    scheme,netloc,path,params,query,frag = parts
    # drop the empty string that precedes the leading "/"
    print path.split("/")[1:]
    print

prints:

http://www.example.org/test/Main/AnotherPage
('http', 'www.example.org', '/test/Main/AnotherPage', '', '', '')
['test', 'Main', 'AnotherPage']

http://www.example.org/test/Main
('http', 'www.example.org', '/test/Main', '', '', '')
['test', 'Main']

http://www.example.org/test
('http', 'www.example.org', '/test', '', '', '')
['test']

http://www.example.org/test/
('http', 'www.example.org', '/test/', '', '', '')
['test', '']

http://www.example.org/
('http', 'www.example.org', '/', '', '', '')
['']

http://www.example.org
('http', 'www.example.org', '', '', '', '')
[]

http://www.example.org/Main/AnotherPage
('http', 'www.example.org', '/Main/AnotherPage', '', '', '')
['Main', 'AnotherPage']
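
Going one step further than the path split above, here is a rough sketch of how the six cases in the question might be handled. It is only one possible reading of the rule "the page comprises parts that have upper case letters". The function name split_url and the default values "Main" and "HomePage" are made up for illustration and do not come from the original post. It is written for Python 3, where the urlparse module lives in urllib.parse.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_GROUP = "Main"     # hypothetical default group, not from the post
DEFAULT_PAGE = "HomePage"  # hypothetical default page, not from the post


def split_url(address, default_group=DEFAULT_GROUP, default_page=DEFAULT_PAGE):
    """Return (base_url, page) for an address like those in the question."""
    parts = urlsplit(address)
    # Drop empty segments left by leading, trailing, or doubled slashes.
    segments = [s for s in parts.path.split("/") if s]

    # Assumption: the page is the trailing run (at most two segments)
    # of path parts that contain an upper-case letter.
    page_parts = []
    while (segments and len(page_parts) < 2
           and any(c.isupper() for c in segments[-1])):
        page_parts.insert(0, segments.pop())

    if not page_parts:
        page_parts = [default_group, default_page]
    elif len(page_parts) == 1:
        page_parts.append(default_page)

    # Whatever is left is the base path; always end it with a slash.
    base_path = "/" + "".join(s + "/" for s in segments)
    base_url = urlunsplit((parts.scheme, parts.netloc, base_path, "", ""))
    return base_url, "/".join(page_parts)
```

Calling split_url("http://www.example.org/test/Main") would give the base URL with a trailing slash plus the group joined to the default page. Note that this heuristic would misfire if a directory in the base path itself contained capitals, so a real script may prefer to be told the base URL explicitly.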