Oscar Del Ben wrote:
<snip>
You'll notice that one of the strings is a unicode one, and another one
has the character 0x82 in it.  Once join() discovers Unicode, it needs
to produce a Unicode string, and by default, it uses the ASCII codec to
get it.
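
To make that implicit step concrete, here is a minimal sketch. It is written for Python 3, where mixing the two types in join() is a TypeError instead, so the ASCII decode that Python 2 attempted behind the scenes is spelled out explicitly. The sample bytes are made up for illustration.

```python
# Hypothetical chunk of upload data containing one non-ASCII byte (0xb7).
chunk = b"abc\xb7def"

error = None
try:
    # Python 2's join() did the equivalent of this once it met a Unicode string.
    chunk.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The exception message names the offending byte and its position, which is the same information the original traceback reports.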

If you print your 'l' list (bad name, by the way, looks too much like a
'1'), you can see which element is Unicode, and which one has the \xb7
in position 42.  You'll have to decide which is the problem, and solve
it accordingly.  Was the fact that one of the strings is unicode an
oversight? Or did you think that all characters would be 0x7f or less?
Or do you want to handle all possible characters, and if so, with what
encoding?
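
One possible way to do that inspection, sketched for Python 3 (in Python 2 the text type would be unicode rather than str). The helper name describe and the sample list are hypothetical, not taken from the original code.

```python
def describe(items):
    """Return (index, type name, offset of first non-ASCII byte or None)."""
    report = []
    for i, item in enumerate(items):
        bad = None
        if isinstance(item, bytes):
            # Any byte above 0x7f cannot be decoded by the ASCII codec.
            bad = next((pos for pos, b in enumerate(item) if b > 0x7F), None)
        report.append((i, type(item).__name__, bad))
    return report

# Made-up sample: two byte strings and one text string.
parts = [b"--boundary\r\n", "filename.txt", b"data\xb7more"]
for row in describe(parts):
    print(row)
```

Each row shows which element is text and which byte string carries a byte the ASCII codec would reject.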

DaveA

Thanks for your reply DaveA.

Since I'm dealing with file uploads, I guess I should only care about
those. I understand the fact that I'm trying to concatenate a unicode
string with a binary, but I don't know how to deal with this. Perhaps
the uploaded file should be encoded in some way? I don't think this is
the case though.

You have to decide what the format of the file is to be. If you have some data in bytes and some in Unicode, you have to be explicit about how you merge them. How you do that depends on who's going to use the file, and for what purpose.

Before you do the join(), you have to convert the Unicode string(s) to bytes. Try calling encode() on each Unicode string, where you get to specify which encoding to use.

In general, you want to use the same encoding for all the bytes in a given file. But as I just said, that's entirely up to you.
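
A minimal sketch of that approach, assuming UTF-8 is the encoding you settle on (that choice is an assumption here, not something the thread decides). It is written for Python 3; in Python 2 you would test for unicode instead of str. The helper name join_as_bytes and the sample data are hypothetical.

```python
def join_as_bytes(parts, encoding="utf-8"):
    """Encode any text items with one encoding, then join everything as bytes."""
    encoded = [p.encode(encoding) if isinstance(p, str) else p for p in parts]
    return b"".join(encoded)

# Made-up sample: a text filename between two raw byte strings.
body = join_as_bytes([b"--boundary\r\n", "r\u00e9sum\u00e9.txt", b"\r\n\xb7raw"])
print(body)
```

The raw upload bytes pass through untouched; only the text pieces are encoded, so the result is a single byte string.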

DaveA

--
http://mail.python.org/mailman/listinfo/python-list
