japanese encoding iso-2022-jp in python vs. perl

2007-10-23 Thread kettle
Hi,
  I am rather new to python, and am currently struggling with some
encoding issues.  I have some utf-8-encoded text which I need to
encode as iso-2022-jp before sending it out to the world. I am using
python's encode functions:
--
 var = var.encode("iso-2022-jp", "replace")
 print var
--

 I am using the 'replace' argument because there seem to be a couple
of utf-8 japanese characters which python can't correctly convert to
iso-2022-jp.  The output looks like this:
↓東京???日比谷線?北千住行

 However, if I use perl's Encode module to re-encode the exact same bit
of text:
--
 $var = encode("iso-2022-jp", decode("utf8", $var))
 print $var
--

 I get proper output (no unsightly question-marks):
↓東京メトロ日比谷線・北千住行

So, what's the deal?  Why can't python properly encode some of these
characters?  I know there are a host of different iso-2022-jp
variants, could it be using a different one than I think (the
default)?  I'm quite liking python at the moment for a variety of
different reasons (I suspect perl will forever win when it comes to
regular expressions but everything else is pretty darn nice), but this
is a bit worrying.

-Joe

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: japanese encoding iso-2022-jp in python vs. perl

2007-10-24 Thread kettle
Thanks Leo, and everyone else, these were very helpful replies.  The
issue was exactly as Leo described, and I apologize for not being
aware of it, and thus not quite reporting it correctly.

At the moment I don't care about round-tripping between half-width and
full-width kana; I only need to be able to rely on any particular
kana character being translated correctly to its half-width or full-width
equivalent, and I need the Japanese I send out to be readable.

I appreciate the 'implicit versus explicit' point, and have read about
it in a few different python mailing lists.  In this instance it seems
that perl perhaps ought to flash a warning notification regarding what
it is doing, but as this conversion between half-width and full-width
characters is by far the most logical one available, it also seems
reasonable that python might perhaps include such capabilities by
default, just as it currently includes the 'replace' option for
mapping missed characters generically to '?'.

I still haven't worked out the entire mapping routine, but Leo's hint
is probably sufficient to get it working with a bit more effort.

Again, thanks for the help.

-Joe

> Thanks that I have my crystal ball working. I can see clearly that the
> fourth character of the input is 'HALFWIDTH KATAKANA LETTER ME' (U+FF92),
> which is not present in ISO-2022-JP as defined by RFC 1468, so python
> converts it into a question mark as you requested. Meanwhile perl as usual
> is trying to guess what you want and silently converts that character into
> 'KATAKANA LETTER ME' (U+30E1), which is present in ISO-2022-JP.
>
> > Why can't python properly encode some of these
> > characters?
>
> Because "Explicit is better than implicit". Do you care about
> roundtripping? Do you care about width of characters? What about
> full-width " (U+FF02)? Python doesn't know answers to these questions so
> it doesn't do anything with your input. You have to do it yourself.
> Assuming you don't care about roundtripping and width, here is an example
> demonstrating how to deal with narrow characters:
>
> from unicodedata import normalize
> iso2022_squeezing = dict((i, normalize('NFKC', unichr(i)))
>                          for i in range(0xFF61, 0xFFE0))
> print repr(u'\uFF92'.translate(iso2022_squeezing))
>
> It prints u'\u30e1'. Feel free to ask questions if something is not
> clear.
>
> Note, this is just an example, I *don't* claim it does what you want
> for any character in FF61-FFDF range. You may want to carefully review
> the whole unicode block: http://www.unicode.org/charts/PDF/UFF00.pdf
>
>   -- Leo.
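
A minimal sketch of the approach Leo hints at, written for Python 3 (an
assumption; the thread itself uses Python 2). The helper names
squeeze_halfwidth_kana and to_iso2022jp are made up for illustration. The
sketch widens only runs of halfwidth katakana (U+FF61-U+FF9F) with NFKC,
normalizing each run as a whole so that a kana followed by a halfwidth voiced
sound mark can compose into one character, and leaves the rest of the text
untouched. As Leo warns, the block should be reviewed before relying on this
for every character.

```python
import re
import unicodedata

# Runs of halfwidth katakana and related punctuation (U+FF61-U+FF9F).
_HALFWIDTH_KANA = re.compile("[\uff61-\uff9f]+")


def squeeze_halfwidth_kana(text):
    """Replace halfwidth katakana runs with their NFKC equivalents."""
    return _HALFWIDTH_KANA.sub(
        lambda match: unicodedata.normalize("NFKC", match.group()), text
    )


def to_iso2022jp(text):
    """Widen halfwidth kana, then encode; unmappable characters still raise."""
    return squeeze_halfwidth_kana(text).encode("iso-2022-jp")


if __name__ == "__main__":
    sample = "\uff92"  # HALFWIDTH KATAKANA LETTER ME
    print(repr(squeeze_halfwidth_kana(sample)))
    print(to_iso2022jp(sample))
```

Encoding with the default strict error handling, rather than 'replace', means
any character that still has no ISO-2022-JP mapping is reported instead of
silently becoming '?'.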




dictionary of dictionaries

2007-12-09 Thread kettle
Hi,
 I'm wondering what the best practice is for creating an extensible
dictionary-of-dictionaries in python?

 In perl I would just do something like:

my %hash_of_hashes;
for(my $i=0;$i<10;$i++){
  for(my $j=0;$j<10;$j++){
    ${$hash_of_hashes{$i}}{$j} = int(rand(10));
  }
}

but it seems to be more hassle to replicate this in python.  I've
found a couple of references around the web but they seem cumbersome.
I'd like something compact.
-joe


Re: dictionary of dictionaries

2007-12-09 Thread kettle
On Dec 9, 5:49 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Sun, 09 Dec 2007 00:35:18 -0800, kettle wrote:
> > Hi,
> >  I'm wondering what the best practice is for creating an extensible
> > dictionary-of-dictionaries in python?
>
> >  In perl I would just do something like:
>
> > my %hash_of_hashes;
> > for(my $i=0;$i<10;$i++){
> >   for(my $j=0;$j<10;$j++){
> >     ${$hash_of_hashes{$i}}{$j} = int(rand(10));
> >   }
> > }
>
> > but it seems to be more hassle to replicate this in python.  I've
> > found a couple of references around the web but they seem cumbersome.
> > I'd like something compact.
>
> Use `collections.defaultdict`:
>
> from collections import defaultdict
> from random import randint
>
> data = defaultdict(dict)
> for i in xrange(11):
>     for j in xrange(11):
>         data[i][j] = randint(0, 10)
>
> If the keys `i` and `j` are not "independent" you might use a "flat"
> dictionary with a tuple of both as keys:
>
> data = dict(((i, j), randint(0, 10)) for i in xrange(11) for j in xrange(11))
>
> And just for completeness: The given data in the example can be stored in a
> list of lists of course:
>
> data = [[randint(0, 10) for dummy in xrange(11)] for dummy in xrange(11)]
>
> Ciao,
> Marc 'BlackJack' Rintsch

Thanks for the heads up.  Indeed it's just as nice as perl.  One more
question though: this defaultdict seems to only work with python2.5+.
In the case of python < 2.5 it seems I have to do something like:
#!/usr/bin/python
from random import randint

dict_dict = {}
for x in xrange(10):
    for y in xrange(10):
        r = randint(0,10)
        try:
            dict_dict[x][y] = r
        except:
            if x in dict_dict:
                dict_dict[x][y] = r
            else:
                dict_dict[x] = {}
                dict_dict[x][y] = r

what I really want to / need to be able to do is autoincrement the
values when I hit another word.  Again in perl I'd just do something
like:

my %my_hash;
while(<>){
  chomp;
  @_ = split(/\s+/);
  grep{$my_hash{$_}++} @_;
}

and this generalizes transparently to a hash of hashes or hash of a
hash of hashes etc.  In python < 2.5 this seems to require something
like:

for line in file:
  words = line.split()
  for word in words:
    my_dict[word] = 1 + my_dict.get(word, 0)

which I guess I can generalize to a dict of dicts but it seems it will
require more if/else statements to check whether or not the higher-
level keys exist.  I guess the real answer is that I should just
migrate to python2.5...!

-joe
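
For what it's worth, on python 2.5+ the word-count idiom above extends to two
levels by nesting defaultdicts. A minimal sketch, written for Python 3 (an
assumption; the thread uses Python 2), with a made-up function name and a
made-up choice of keys (line number, then word):

```python
from collections import defaultdict


def count_words_per_line(lines):
    """Return counts[line_number][word] for an iterable of text lines."""
    # The outer default creates an inner counter; the inner default is 0.
    counts = defaultdict(lambda: defaultdict(int))
    for line_number, line in enumerate(lines):
        for word in line.split():
            counts[line_number][word] += 1  # missing levels appear on access
    return counts


if __name__ == "__main__":
    counts = count_words_per_line(["a b a", "b c"])
    print({key: dict(value) for key, value in counts.items()})
```

One thing to keep in mind with this design: merely reading a missing key also
creates it, so membership tests are safer than indexing when only checking.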


Re: dictionary of dictionaries

2007-12-10 Thread kettle
On Dec 10, 6:58 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> kettle wrote:
> > Thanks for the heads up.  Indeed it's just as nice as perl.  One more
> > question though, this defaultdict seems to only work with python2.5+
> > in the case of python < 2.5 it seems I have to do something like:
> > #!/usr/bin/python
> > from random import randint
>
> > dict_dict = {}
> > for x in xrange(10):
> >     for y in xrange(10):
> >         r = randint(0,10)
> >         try:
> >             dict_dict[x][y] = r
> >         except:
> >             if x in dict_dict:
> >                 dict_dict[x][y] = r
> >             else:
> >                 dict_dict[x] = {}
> >                 dict_dict[x][y] = r
>
> You can clean that up a bit:
>
> from random import randrange
>
> dict_dict = {}
> for x in xrange(10):
>     dict_dict[x] = dict((y, randrange(11)) for y in xrange(10))
>
>
>
> > what I really want to / need to be able to do is autoincrement the
> > values when I hit another word.  Again in perl I'd just do something
> > like:
>
> > my %my_hash;
> > while(<>){
> >   chomp;
> >   @_ = split(/\s+/);
> >   grep{$my_hash{$_}++} @_;
> > }
>
> > and this generalizes transparently to a hash of hashes or hash of a
> > hash of hashes etc.  In python < 2.5 this seems to require something
> > like:
>
> > for line in file:
> >   words = line.split()
> >   for word in words:
> >     my_dict[word] = 1 + my_dict.get(word, 0)
>
> > which I guess I can generalize to a dict of dicts but it seems it will
> > require more if/else statements to check whether or not the higher-
> > level keys exist.  I guess the real answer is that I should just
> > migrate to python2.5...!
>
> Well, there's also dict.setdefault()
>
> >>> pairs = ["ab", "ab", "ac", "bc"]
> >>> outer = {}
> >>> for a, b in pairs:
> ...     inner = outer.setdefault(a, {})
> ...     inner[b] = inner.get(b, 0) + 1
> ...
> >>> outer
> {'a': {'c': 1, 'b': 2}, 'b': {'c': 1}}
>
> and it's not hard to write your own defaultdict
>
> >>> class Dict(dict):
> ...     def __getitem__(self, key):
> ...         return self.get(key, 0)
> ...
> >>> d = Dict()
> >>> for c in "abbbcdeafgh": d[c] += 1
> ...
> >>> d
> {'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
>
> Peter

Nice, thanks for all the tips!  I knew there had to be some handier
python ways to do these things.  My initial attempts were just what
occurred to me first given my still limited knowledge of the language
and its idioms.  Thanks again! -joe
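
Peter's setdefault() tip also seems to stretch to any nesting depth without
defaultdict. A minimal sketch, assuming Python 3 syntax (the thread uses
Python 2); the helper name increment is made up for illustration:

```python
def increment(counts, keys, amount=1):
    """Add `amount` to the counter found by following `keys` into nested dicts."""
    *path, last = keys
    node = counts
    for key in path:
        # setdefault returns the existing inner dict, or stores a new one.
        node = node.setdefault(key, {})
    node[last] = node.get(last, 0) + amount


if __name__ == "__main__":
    counts = {}
    for pair in ["ab", "ab", "ac", "bc"]:
        increment(counts, pair)
    print(counts)
```

Because the result is built from plain dicts, looking up a missing key still
raises KeyError, which keeps typos in keys visible.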


Re: dictionary of dictionaries

2007-12-10 Thread kettle
On Dec 10, 6:58 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> and it's not hard to write your own defaultdict
>
> >>> class Dict(dict):
> ...     def __getitem__(self, key):
> ...         return self.get(key, 0)
> ...
> >>> d = Dict()
> >>> for c in "abbbcdeafgh": d[c] += 1
> ...
> >>> d
> {'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
>
> Peter

One last question.  I've heard the 'Explicit vs. Implicit' argument
but this seems to boil down to a question of general usage case
scenarios and what most people 'expect' for default behavior.  The
above defaultdict implementation defining the __getitem__ method seems
like it is more generally useful than the real default.  What is the
reasoning behind NOT using this as the default implementation for a
dict in python?


socket script from perl -> python

2008-02-07 Thread kettle
Hi I have a socket script, written in perl, which I use to send audio
data from one server to another.  I would like to rewrite this in
python so as to replicate exactly the functionality of the perl
script, so as to incorporate this into a larger python program.
Unfortunately I still don't really have the hang of socket programming
in python.  The relevant parts of the perl script are below:
$host = '127.0.0.1';
my $port = 3482;

my $proto = getprotobyname('tcp');
my $iaddr = inet_aton($host);
my $paddr = sockaddr_in($port, $iaddr);

# create the socket, connect to the port
socket(SOCKET, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
connect(SOCKET, $paddr) or die "connect: $!";

my $length = length($converted_audio);

# pack $length as a 32-bit network-independent long
my $len = pack('N', $length);

#print STDERR "LENGTH: $length\n";

SOCKET->autoflush();

print SOCKET "r";
print SOCKET $len;
print SOCKET "$converted_audio\n";

while(defined($line = <SOCKET>)) {
  # do something here...
}


I've used python's socket library to connect to the server, and
verified that the first piece of data, 'r', is read correctly; the
sticking point seems to be the $len variable.  I've tried using
socket.htonl() and the other less likely variants, but nothing seems to
produce the desired result, which would be to have the server-side
message print the same 'length' as the length printed by the client.

The python I've tried looked like this:
from socket import *
host = '127.0.0.1'
port = 3482
addr = (host, port)
s = socket(AF_INET, SOCK_STREAM)
s.connect(addr)

f = open('/home/myuname/socket.wav','rb')
audio = ""
for line in f:
    audio += line

leng = htonl(len(audio))
print leng
s.send('r')
s.send(leng)
s.send(audio)
s.send("\n")
s.flush()

--
of course I'd also like to s.recv() the results from the server, but
first I need to properly calculate the length and send it as a network
independent long.  Any tips on how to do this would be greatly
appreciated!



Re: socket script from perl -> python

2008-02-07 Thread kettle
On Feb 8, 12:08 am, Bjoern Schliessmann  wrote:
> kettle wrote:
> > Hi I have a socket script, written in perl, which I use to send
> > audio data from one server to another.  I would like to rewrite
> > this in python so as to replicate exactly the functionality of the
> > perl script, so as to incorporate this into a larger python
> > program. Unfortunately I still don't really have the hang of
> > socket programming in python.
>
> Socket programming in Python is just like socket programming in C. I
> suppose with Perl it's the same.
True, but I'm not talking about the concepts, I'm talking about the
idioms, which in python I don't know.


>
> > # pack $length as a 32-bit network-independent long
> > my $len = pack('N', $length);
> > [...]
> > I've used python's socket library to connect to the server, and
> > verified that the first piece of data'r' is read correctly, the
> > sticking point seems to be the $len variable.  I've tried using
> > socket.htonl() and the other less likely variants, but nothing
> > seem to produce the desired result, which would be to have the
> > server-side message print the same 'length' as the length printed
> > by the client.
>
> Try struct.calcsize.
Thanks for the suggestion, I hadn't tried that one.

-joe


>
> Regards,
>
> Björn



Re: socket script from perl -> python

2008-02-07 Thread kettle
On Feb 8, 4:01 am, Hrvoje Niksic <[EMAIL PROTECTED]> wrote:
> kettle <[EMAIL PROTECTED]> writes:
> > # pack $length as a 32-bit network-independent long
> > my $len = pack('N', $length);
> [...]
> > the sticking point seems to be the $len variable.
>
> Use len = struct.pack('!L', length) in Python.
> See http://docs.python.org/lib/module-struct.html for details.

Thanks, that was exactly what I was missing.  And thanks for the link
as well! -joe
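
For the archive, a minimal sketch of how the pieces fit together, assuming
Python 3 and bytes payloads (the thread uses Python 2). The function names
frame_audio and send_audio are made up for illustration; the message layout
('r', 4-byte length, payload, newline) follows the perl script above.

```python
import struct


def frame_audio(audio):
    """Build the message the perl script sends: 'r', length, payload, newline."""
    # '!L' is a 32-bit unsigned integer in network (big-endian) byte order,
    # the counterpart of perl's pack('N', ...).
    header = struct.pack("!L", len(audio))
    return b"r" + header + audio + b"\n"


def send_audio(sock, audio):
    """Send the framed message over an already-connected socket."""
    # sendall keeps writing until the whole message has been handed off,
    # so no separate flush call is needed.
    sock.sendall(frame_audio(audio))


if __name__ == "__main__":
    print(frame_audio(b"abc"))
```

socket.htonl() only swaps the byte order of an integer and still returns an
integer, which is why it could not be sent directly; struct.pack() produces
the four raw bytes the server expects.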


python tr equivalent (non-ascii)

2008-08-13 Thread kettle
Hi,
 I was wondering how I ought to be handling character range
translations in python.

 What I want to do is translate fullwidth numbers and roman alphabet
characters into their halfwidth ascii equivalents.
 In perl I can do this pretty easily with tr:

tr/\x{ff00}-\x{ff5e}/\x{0020}-\x{007e}/;

 and I think the string.translate method is what I need to use to
achieve the equivalent in python.  Unfortunately the maketrans function
doesn't seem to accept character ranges and I'm also having trouble
with its interpretation of length.  What I came up with was to first
fudge the ranges:

my_test_string = u"ABCDEFG"
f_range = "".join([unichr(x) for x in range(ord(u"\uff00"),ord(u"\uff5e"))])
t_range = "".join([unichr(x) for x in range(ord(u"\u0020"),ord(u"\u007e"))])

 then use these as input to maketrans:
my_trans_string = my_test_string.translate(string.maketrans(f_range,t_range))
Traceback (most recent call last):
  File "", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position
0-93: ordinal not in range(128)

 but it generates an encoding error... and if I encode the ranges in
utf8 before passing them on I get a length error because maketrans is
counting bytes not characters and utf8 is variable width...
my_trans_string = my_test_string.translate(string.maketrans(f_range.encode("utf8"),t_range.encode("utf8")))
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: maketrans arguments must have same length


Re: python tr equivalent (non-ascii)

2008-08-13 Thread kettle
Ok so I guess I was barking up the wrong tree.  Searching for
"python 全角 半角" quickly brought up a solution:
>>> import unicodedata
>>> my_test_string=u"[EMAIL PROTECTED]"
>>> print unicodedata.normalize('NFKC', my_test_string.decode("utf8"))
[EMAIL PROTECTED]@123
>>>

still, it would be nice if there was a more general solution, or if
maketrans actually looked at chars instead of bytes methinks.




Re: python tr equivalent (non-ascii)

2008-08-13 Thread kettle
On Aug 13, 5:33 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> maketrans only works for byte strings.
>
> as for translate itself, it has different signatures for byte strings
> and unicode strings; in the former case, it takes lookup table
> represented as a 256-byte string (e.g. created by maketrans), in the
> latter case, it takes a dictionary mapping from ordinals to ordinals or
> unicode strings.
>
> something like
>
> lut = dict((0xff00 + ch, 0x0020 + ch) for ch in range(0x80))
>
> new_string = old_string.translate(lut)
>
> could work (untested).
>
> 

Excellent.  I didn't realize from the docs that I could do that. Thanks!
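
A minimal sketch of the dictionary-based translate, assuming Python 3 (the
thread uses Python 2, where the method lives on unicode strings). The names
FULLWIDTH_TO_ASCII and to_halfwidth are made up for illustration. The table
covers the fullwidth ASCII variants U+FF01-U+FF5E, which sit at a fixed
offset of 0xFEE0 from U+0021-U+007E, plus the ideographic space U+3000.

```python
# Map each fullwidth ASCII variant to its plain ASCII code point.
FULLWIDTH_TO_ASCII = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}
FULLWIDTH_TO_ASCII[0x3000] = 0x20  # ideographic space -> ASCII space


def to_halfwidth(text):
    """Translate fullwidth letters, digits and punctuation to ASCII."""
    return text.translate(FULLWIDTH_TO_ASCII)


if __name__ == "__main__":
    print(to_halfwidth("\uff21\uff22\uff23\uff11\uff12\uff13"))
```

Unlike NFKC normalization, this touches only the listed code points, so
halfwidth katakana and other compatibility characters pass through unchanged.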