Hi,
 
I have a problem with regular expressions (a dot problem). I checked the Python Library Reference, but I couldn't find any useful information there, nor on Google.
Every example I found does it the same way:
 
          re.compile("www").match(string)
 
That always works fine. But my pattern string is a variable that has to be passed as the argument of re.compile(). In this case, I cannot use
 
Here is my code. The line marked in bold (the tmpsd.replace call) is where the problem is. Does anyone know the right way to do this?
----------------------------------------------------------------------------------------------------------
def getLinkType(url, sitedomain):
    # get the domain which 'url' belongs to
    urldomain = urlparse4esa(url)[1]
   
    tmpsd = ''
    if re.compile('^www').match(sitedomain) is not None:
        tmpsd = sitedomain[4:]
   
    tmpsd.replace('.', '\.')    # this is the line that doesn't seem to work
    pattern = tmpsd + '$'       # this is my pattern string.
                                # Example: for 'www.ibm.com' the pattern is 'ibm.com$'

    if re.compile(pattern).match(urldomain) is not None:
        return INTERNAL_LINK    # match. url is internal link
    else:
        return EXTERNAL_LINK    # doesn't match. url is external link
----------------------------------------------------------------------------------------------------------
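
For comparison, here is a minimal sketch of how the function might be corrected. It rests on several assumptions: it is written for Python 3; urllib.parse.urlparse stands in for the urlparse4esa helper, which is not shown; the values of INTERNAL_LINK and EXTERNAL_LINK are made up; and the intent is taken to be "the URL's host is the site domain or one of its subdomains". Three things change: str.replace() returns a new string instead of modifying tmpsd, so its result has to be used (re.escape() does the escaping here); the site domain is kept when it has no 'www.' prefix; and search() is used because match() only tries the pattern at the start of the string.

```python
import re
from urllib.parse import urlparse

# Hypothetical values; the original constants are not shown in the post.
INTERNAL_LINK = 0
EXTERNAL_LINK = 1


def get_link_type(url, sitedomain):
    # urlparse() stands in for the original urlparse4esa() helper (assumption).
    urldomain = urlparse(url).netloc

    # Strip a leading 'www.' if present; otherwise keep the domain unchanged.
    if sitedomain.startswith('www.'):
        tmpsd = sitedomain[4:]
    else:
        tmpsd = sitedomain

    # Strings are immutable, so a bare tmpsd.replace(...) has no effect.
    # re.escape() returns an escaped copy, which is used directly here.
    # '(^|\.)' requires the domain to start the host or follow a dot
    # (an assumption about the intended matching rule).
    pattern = r'(^|\.)' + re.escape(tmpsd) + '$'

    # search() scans the whole string; match() would only try position 0.
    if re.compile(pattern).search(urldomain) is not None:
        return INTERNAL_LINK
    return EXTERNAL_LINK
```

With this sketch, get_link_type('http://www.ibm.com/index.html', 'www.ibm.com') returns INTERNAL_LINK, while a host such as 'notibm.com' or 'ibmxcom' is reported as EXTERNAL_LINK because the dot is now matched literally.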
 
Alex, China


