Jean-Paul Calderone wrote:
> punycode is used by dns.  A commonly used email codec is
> quoted-printable.  Here's an example of each:
>
> >>> u'Helló world'.encode('utf-8').encode('quopri')
> 'Hell=C3=B3=20world'
> >>> u'Helló world'.encode('punycode')
> 'Hell world-jbb'
> >>>
>
> Note the extra trip through utf-8 for quoted-printable, as it is not
> implemented in Python as a character encoding, but a byte encoding, so
> you cannot (safely) apply it to a unicode string.
>
> Jean-Paul

>>> u'Helló world\\/\x00'.encode('punycode')
'Hell world\\/\x00-elb'
>>> u'Helló world\\/\x00'.encode('utf-8').encode('quopri')
'Hell=C3=B3=20world\\/=00'
>>>

Neither of those escapes \ or / (punycode even leaves \x00 alone), and
the other base.. things are similar. So I finally found myself
reggae'ing :-( , but this does minimal optical damage to common strings:

import re

def encode_as_filename(s):
    # replace each character that is unsafe in a filename (and the
    # escape character '+' itself) with '+' and two uppercase hex digits
    def _(m): return "+%02X" % ord(m.group(0))
    return re.sub('[\x00"\\\\/*?:<>|+\n]', _, s)

def decode_from_filename(s):
    # turn each '+XX' escape back into the character it stands for
    def _(m): return chr(int(m.group(0)[1:], 16))
    return re.sub(r"\+[\dA-F]{2}", _, s)

>>> newsletter.encode_as_filename('[EMAIL PROTECTED]/\\+\n\x00:+test')
'[EMAIL PROTECTED]'
>>> newsletter.decode_from_filename(_)
'[EMAIL PROTECTED]/\\+\n\x00:+test'
>>>

Robert
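
For anyone reading this on Python 3, here is a minimal sketch of the same escaping scheme, assuming plain `str` input rather than the Python 2 byte strings above. The character set and the `+XX` escape format are taken from the functions in the post; the names `_UNSAFE`, `_ESCAPED` and `sample` are only illustrative. Because `+` is itself escaped as `+2B`, every `+` in an encoded name starts a real escape, so decoding should round-trip.

```python
import re

# Characters escaped by the functions in the post: NUL, double quote,
# backslash, slash, * ? : < > | + and newline.
_UNSAFE = re.compile(r'[\x00"\\/*?:<>|+\n]')
# An escape is '+' followed by exactly two uppercase hex digits.
_ESCAPED = re.compile(r'\+([0-9A-F]{2})')

def encode_as_filename(s):
    """Replace each unsafe character with '+' and its two-digit hex code."""
    return _UNSAFE.sub(lambda m: "+%02X" % ord(m.group(0)), s)

def decode_from_filename(s):
    """Reverse encode_as_filename by expanding every '+XX' escape."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), s)

sample = 'Helló world\\/\x00:+test\n'
encoded = encode_as_filename(sample)
print(encoded)  # Helló world+5C+2F+00+3A+2Btest+0A
assert decode_from_filename(encoded) == sample
```

Non-ASCII characters such as `ó` pass through untouched, so this only helps on filesystems that accept them; combine it with another encoding if that is a concern.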