rh wrote:

> I am curious to know if others would have done this differently. And if so
> how so?
>
> This converts a url to a more easily managed filename, stripping the
> http protocol off.
>
> This:
>
> http://alongnameofasite1234567.com/q?sports=run&a=1&b=1
>
> becomes this:
>
> alongnameofasite1234567_com_q_sports_run_a_1_b_1
>
>
> def u2f(u):
>     nx = re.compile(r'https?://(.+)$')
>     u = nx.search(u).group(1)
>     ux = re.compile(r'([-:./?&=]+)')
>     return ux.sub('_', u)
>
> One alternate is to not do the compile step. There must also be a way to
> do it all at once. i.e. remove the protocol and replace the chars.
Completely without regular expressions:

    import string

    ILLEGAL = "-:./?&="

    # Build a table that maps every illegal character to "_".
    try:
        TRANS = string.maketrans(ILLEGAL, "_" * len(ILLEGAL))
    except AttributeError:
        # python 3
        TRANS = dict.fromkeys(map(ord, ILLEGAL), "_")

    PROTOCOLS = {"http", "https"}

    def url_to_file(url):
        # Split off the protocol; reject anything that is not http(s).
        protocol, sep, rest = url.partition("://")
        if protocol not in PROTOCOLS:
            raise ValueError
        return rest.translate(TRANS)

    if __name__ == "__main__":
        url = "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"
        print(url)
        print(url_to_file(url))
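
On the "do it all at once" question: a single substitution can cover both jobs if the pattern has two alternatives and the replacement is a function. What follows is only a sketch, not code from the original post. The name u2f_single_pass is made up for illustration, and its behaviour differs from u2f in two assumed ways: the protocol is matched only at the start of the string, and a URL without an http(s) prefix is passed through rather than raising.

```python
import re

# Alternative 1 (captured as group 1): a leading http:// or https://.
# Alternative 2: a run of one or more characters to be replaced.
_URL_RX = re.compile(r'^(https?://)|[-:./?&=]+')


def u2f_single_pass(url):
    """Strip a leading http(s):// and turn runs of -:./?&= into "_"."""
    def repl(match):
        # Group 1 is set only when the protocol alternative matched;
        # that match is dropped, every other match becomes "_".
        return '' if match.group(1) else '_'
    return _URL_RX.sub(repl, url)


if __name__ == "__main__":
    print(u2f_single_pass(
        "http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
```

Like the regex in u2f, this collapses a run of several such characters into one underscore, whereas a character-by-character translation table produces one underscore per character.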