On Apr 14, 12:02 am, Michael Bentley <[EMAIL PROTECTED]> wrote:
> On Apr 13, 2007, at 11:49 PM, [EMAIL PROTECTED] wrote:
>
> > Hi,
> >
> > I have a list of url names like this, and I am trying to strip out the
> > domain name using the following code:
> >
> > http://www.cnn.com
> > www.yahoo.com
> > http://www.ebay.co.uk
> >
> > pattern = re.compile("http:\\\\(.*)\.(.*)", re.S)
> > match = re.findall(pattern, line)
> >
> > if (match):
> >     s1, s2 = match[0]
> >
> >     print s2
> >
> > but none of the site matched, can you please tell me what am i
> > missing?
>
> change re.compile("http:\\\\(.*)\.(.*)", re.S) to
> re.compile("http:\/\/(.*)\.(.*)", re.S)

Thanks. I tried this, but when 'line' is http://www.cnn.com, I get 'com'
for s2. I want 'cnn.com' (everything after the first '.'). How can I
do that?

pattern = re.compile("http:\/\/(.*)\.(.*)", re.S)
match = re.findall(pattern, line)

if (match):
    s1, s2 = match[0]

    print s2
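
One likely cause: the first (.*) is greedy, so it consumes everything up to the last '.', which leaves only 'com' for the second group. A minimal sketch of one way around this, assuming a non-greedy first group is acceptable, is shown below. The helper name strip_first_label is hypothetical, and making the "http://" prefix optional is an assumption added so that lines like www.yahoo.com also match.

```python
import re

# (.*?) is non-greedy, so it stops at the first '.', and the second
# group then captures everything after that dot.
# The scheme is optional (an assumption) so 'www.yahoo.com' matches too.
pattern = re.compile(r"(?:http://)?(.*?)\.(.*)")

def strip_first_label(line):
    """Return everything after the first '.', or None if there is no match."""
    match = pattern.match(line.strip())
    if match:
        return match.group(2)
    return None

for line in ["http://www.cnn.com", "www.yahoo.com", "http://www.ebay.co.uk"]:
    print(strip_first_label(line))
```

For plain hostnames like these, line.split('.', 1)[1] may be simpler than a regular expression, provided the line is known to contain a dot.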