I'm looking for the "best" way to strip a large set of chars from a filename string (my definition of best usually means succinct and readable). I only want to allow alphanumeric chars, dashes, and periods. This is what I would write in Perl (bless me father, for I have sinned...):
    $filename =~ tr/a-zA-Z0-9_.-//cd

or equivalently

    $filename =~ s/[^\w.-]//g

I could just use re.sub like the second example, but that's a bit overkill. I'm trying to figure out if there's a good way to do the same thing with string methods. string.translate seems to do what I want; the problem is specifying the set of chars to remove. Obviously hardcoding them all is a non-starter.

Working with chars seems to be a bit of a pain. There's no equivalent of the range function, so one has to do something like this:

    >>> [chr(x) for x in range(ord('a'), ord('z')+1)]
    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

Do that twice for letters, once for numbers, add in a few others, and I get the chars I want to keep. Then I'd invert the set and call translate. It's a mess and not worth the trouble. Unless there's some way to expand a compact representation of a char list and obtain its complement, it looks like I'll have to use a regex.

Ideally, there would be a mythical charset module that works like this:

    >>> keep = charset.expand(r'\w.-')  # or r'a-zA-Z0-9_.-'
    >>> toss = charset.invert(keep)

Sadly I can find no such beast. Anyone have any insight? As of now, regexes look like the best solution.
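
For comparison, here is a rough sketch of the options discussed above, assuming Python 3 and ASCII-only filenames. The function names (clean_regex, clean_filter, clean_bytes) and the KEEP constant are made up for illustration. The underscore is kept only to mirror \w; drop it from KEEP and the pattern if strictly alphanumerics, dashes, and periods are wanted.

```python
import re
import string

# Characters to keep: ASCII letters, digits, underscore (to mirror \w),
# period, and dash. string.ascii_letters and string.digits stand in for
# the hand-built chr()/ord() ranges.
KEEP = frozenset(string.ascii_letters + string.digits + "_.-")


def clean_regex(filename):
    """Regex version; re.ASCII restricts \\w to [a-zA-Z0-9_]."""
    return re.sub(r"[^\w.-]", "", filename, flags=re.ASCII)


def clean_filter(filename):
    """String-method version: keep only characters found in KEEP."""
    return "".join(ch for ch in filename if ch in KEEP)


# For byte strings, the complement of KEEP can be computed once and
# passed to bytes.translate as the set of bytes to delete.
DELETE_BYTES = bytes(b for b in range(256) if chr(b) not in KEEP)


def clean_bytes(filename):
    """bytes.translate version: a table of None means delete-only."""
    return filename.translate(None, DELETE_BYTES)
```

All three should agree on ASCII input, e.g. clean_filter("my file (v2)!.tar-gz") gives "myfilev2.tar-gz". The regex is still the shortest; the translate version mainly pays off when the same delete table is reused many times.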