<[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I have a list of url names like this, and I am trying to strip out the
> domain name using the following code:
> 
> http://www.cnn.com
> www.yahoo.com
> http://www.ebay.co.uk
> 
> pattern = re.compile("http:\\\\(.*)\.(.*)", re.S)
> match = re.findall(pattern, line)
> 
> if (match):
>         s1, s2 = match[0]
> 
>         print s2
> 
> but none of the site matched, can you please tell me what am i
> missing?

You're using backslashes in your RE pattern, to start with, while
the URLs contain forward slashes (or no slashes at all, in the case
of the second one).
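
To make the mismatch concrete, here is a small sketch. It assumes the pattern from the question rewritten as a raw string (the same regex, just easier to read), plus a second variant with forward slashes that is only an illustration, not a recommended fix. The names `backslashes` and `slashes` are made up for this example.

```python
import re

line = "http://www.cnn.com"

# Same regex as in the question, written as a raw string: the regex \\
# matches one literal backslash, which the URL does not contain.
backslashes = re.compile(r"http:\\(.*)\.(.*)", re.S)
print(re.findall(backslashes, line))   # []

# With forward slashes the pattern does match, but the greedy first
# group swallows everything up to the last dot.
slashes = re.compile(r"http://(.*)\.(.*)", re.S)
print(re.findall(slashes, line))       # [('www.cnn', 'com')]
```

So even with the slashes corrected, s2 would come out as "com" for the first URL, and a line without "http://" such as www.yahoo.com would still not match at all.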

Anyway, forget REs, and use the standard library module urlparse,
specifically its urlparse.urlsplit function.
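
A minimal sketch of that suggestion follows. The helper name `host_of` is hypothetical, and prepending "//" to lines that have no scheme is an assumption about how you want bare names like www.yahoo.com handled. In Python 3 the same function lives in urllib.parse, so the import tries that first.

```python
try:
    from urllib.parse import urlsplit  # Python 3 location
except ImportError:
    from urlparse import urlsplit      # Python 2 module named above


def host_of(url):
    """Return the host part of url, or '' if none can be found."""
    url = url.strip()
    # urlsplit only recognises a host after "//", so a bare
    # "www.yahoo.com" would otherwise end up in the path component.
    if "//" not in url:
        url = "//" + url
    return urlsplit(url).netloc


for line in ["http://www.cnn.com", "www.yahoo.com", "http://www.ebay.co.uk"]:
    print(host_of(line))
```

This prints the full host name for each of the three lines in the question. Trimming a leading "www." or picking out the registered domain is a separate step that urlsplit does not do for you.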


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list
