On Aug 13, 5:33 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> kettle wrote:
> > I was wondering how I ought to be handling character range
> > translations in python.
> >
> > What I want to do is translate fullwidth numbers and roman alphabet
> > characters into their halfwidth ascii equivalents.
> > In perl I can do this pretty easily with tr:
> >
> > tr/\x{ff00}-\x{ff5e}/\x{0020}-\x{007e}/;
> >
> > and I think the string.translate method is what I need to use to
> > achieve the equivalent in python. Unfortunately the maketrans method
> > doesn't seem to accept character ranges and I'm also having trouble
> > with its interpretation of length. What I came up with was to first
> > fudge the ranges:
> >
> > my_test_string = u"ABCDEFG"
> > f_range = "".join([unichr(x) for x in
> >     range(ord(u"\uff00"),ord(u"\uff5e"))])
> > t_range = "".join([unichr(x) for x in
> >     range(ord(u"\u0020"),ord(u"\u007e"))])
> >
> > then use these as input to maketrans:
> > my_trans_string = my_test_string.translate(string.maketrans(f_range,t_range))
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > UnicodeEncodeError: 'ascii' codec can't encode characters in position
> > 0-93: ordinal not in range(128)
>
> maketrans only works for byte strings.
>
> as for translate itself, it has different signatures for byte strings
> and unicode strings; in the former case, it takes a lookup table
> represented as a 256-byte string (e.g. created by maketrans), in the
> latter case, it takes a dictionary mapping from ordinals to ordinals or
> unicode strings.
>
> something like
>
> lut = dict((0xff00 + ch, 0x0020 + ch) for ch in range(0x80))
>
> new_string = old_string.translate(lut)
>
> could work (untested).
>
> </F>

excellent. i didn't realize from the docs that i could do that. thanks
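
for the archives, here is a minimal sketch of the dictionary approach, tightened up a little. it is written for Python 3 (an assumption; the thread itself is Python 2, where you would use u"" literals and the same unicode.translate call). two things differ from the quoted one-liner: range(0x80) would also remap U+FF5F..U+FF7F, which includes halfwidth katakana, into control characters, so the sketch stops at U+FF5E; and U+FF00 is not an assigned character, the fullwidth-style space is U+3000, so that is mapped separately. the names FULLWIDTH_TO_ASCII and to_halfwidth are made up for illustration.

```python
# Sketch (Python 3): fold fullwidth ASCII variants to plain ASCII.
# U+FF01..U+FF5E line up one-to-one with U+0021..U+007E.
FULLWIDTH_TO_ASCII = {0xFF01 + i: 0x21 + i for i in range(0x5E)}

# Assumption: also fold the ideographic space U+3000 to an ordinary space.
FULLWIDTH_TO_ASCII[0x3000] = 0x20


def to_halfwidth(text):
    """Return text with fullwidth letters, digits and punctuation
    replaced by their ASCII equivalents; other characters are untouched."""
    # str.translate accepts a dict mapping ordinals to ordinals or strings.
    return text.translate(FULLWIDTH_TO_ASCII)


sample = "\uff21\uff22\uff23\uff11\uff12\uff13"  # fullwidth "ABC123"
print(to_halfwidth(sample))  # prints ABC123
```

if you do not need control over exactly which characters are folded, unicodedata.normalize("NFKC", text) also converts the fullwidth forms, but it rewrites many other compatibility characters as well, so the explicit table is the closer match to the perl tr.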