Hi RH, the str.translate method might be faster (and a little easier to read) for your use case. Just precompute and reuse the translation table punct_flatten.
Note that the translate method has changed somewhat for Python 3 due to the separation of text from bytes. This is a Python 3 version.

from urllib.parse import urlparse

# Build the translation table once and reuse it for every URL.
flattened_chars = "./&=?"
punct_flatten = str.maketrans(flattened_chars, '_' * len(flattened_chars))

parts = urlparse('http://alongnameofasite1234567.com/q?sports=run&a=1&b=1')
# urlparse drops the '?' separator, so put it back before flattening.
unflattened = parts.netloc + parts.path + '?' + parts.query
flattened = unflattened.translate(punct_flatten)
print(flattened)

Cheers,
Nick

On Thursday, 7 February 2013 08:41:05 UTC+11, rh wrote:
> I am curious to know if others would have done this differently. And if so
> how so?
>
> This converts a url to a more easily managed filename, stripping the
> http protocol off.
>
> This:
>
> http://alongnameofasite1234567.com/q?sports=run&a=1&b=1
>
> becomes this:
>
> alongnameofasite1234567_com_q_sports_run_a_1_b_1
>
>
> def u2f(u):
>     nx = re.compile(r'https?://(.+)$')
>     u = nx.search(u).group(1)
>     ux = re.compile(r'([-:./?&=]+)')
>     return ux.sub('_', u)
>
> One alternate is to not do the compile step. There must also be a way to
> do it all at once. i.e. remove the protocol and replace the chars.
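On the "do it all at once" question in the quoted message, one possible approach is to wrap the translate idea in a single function, with the table built once at module level. This is only a rough sketch: the names url_to_filename, FLATTENED_CHARS and PUNCT_FLATTEN are made up for illustration, and the character set is borrowed from the regex in the quoted u2f.

```python
from urllib.parse import urlparse

# Built once at import time and reused by every call.
# Character set taken from the quoted regex; adjust as needed (assumption).
FLATTENED_CHARS = "-:./?&="
PUNCT_FLATTEN = str.maketrans(FLATTENED_CHARS, "_" * len(FLATTENED_CHARS))


def url_to_filename(url):
    """Drop the scheme and flatten punctuation to underscores (hypothetical helper)."""
    parts = urlparse(url)
    unflattened = parts.netloc + parts.path
    if parts.query:
        # urlparse strips the '?', so restore it to keep path and query apart.
        unflattened += "?" + parts.query
    return unflattened.translate(PUNCT_FLATTEN)


print(url_to_filename("http://alongnameofasite1234567.com/q?sports=run&a=1&b=1"))
# prints: alongnameofasite1234567_com_q_sports_run_a_1_b_1
```

One difference from the regex version: translate maps each character separately, so a run such as "//" becomes "__" rather than a single "_". Any URL fragment is also dropped here, since only netloc, path and query are used.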