On Fri, Sep 5, 2014 at 12:09 PM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Thu, Sep 4, 2014 at 6:12 PM, Chris Angelico <ros...@gmail.com> wrote:
>> If it's a Unicode string (which is the default in Python 3), all
>> Unicode characters will work correctly.
>
> Assuming the library that needs this is expecting codepoints and will
> accept integers greater than 255.

They're still valid integers. It's just that someone might not know
how to work with them. Everyone has limits - I don't think repr()
would like to be fed Graham's Number, for instance, but we still say
that it accepts integers :)
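
To illustrate, a minimal Python 3 sketch (the sample string is just my
own example, not anything from the library in question):

```python
# Python 3: a str is a sequence of Unicode code points.
# The sample text is written with escapes: e-acute, euro sign, an emoji.
text = "caf\u00e9 \u20ac \U0001F600"

# ord() gives the code point of each character as a plain int.
codepoints = [ord(ch) for ch in text]
print(codepoints)

# Every one of them is an ordinary integer; some are simply above 255.
print(all(isinstance(cp, int) for cp in codepoints))
print(max(codepoints) > 255)

# chr() reverses ord(), so the round trip restores the original string.
print("".join(chr(cp) for cp in codepoints) == text)
```

Whether a given library copes with the larger values is a separate
question from whether they are valid integers.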

>> If it's a byte string (the
>> default in Python 2), then you can't actually have any Unicode
>> characters in it at all, you have bytes; Py2 lets you be a bit sloppy
>> with the ASCII range, but technically, you still have bytes, not
>> characters.
>
> In that case the library will almost certainly accept it, but could be
> expecting a different encoding.

Yeah. Either way, the problem isn't "be careful about Unicode
characters". One option has Unicode characters, the other doesn't, and
you need to know which one it is.
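
A rough sketch of the difference, in Python 3 (the sample word and the
choice of UTF-8 vs Latin-1 are assumptions for illustration only):

```python
# A str holds characters (code points); the i-diaeresis is written as an escape.
text = "na\u00efve"

# Encoding turns characters into bytes, and the result depends on the encoding.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

# Same five characters, different byte sequences.
print(len(text), len(as_utf8), len(as_latin1))
print(as_utf8 == as_latin1)

# Decoding with the wrong encoding succeeds here but yields different text.
wrong = as_utf8.decode("latin-1")
print(wrong == text)

# Decoding with the encoding that produced the bytes restores the text.
print(as_utf8.decode("utf-8") == text)
```

With bytes, the thing to get right is the encoding on both sides; with
str, there is no encoding to get wrong.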

I just don't like people talking about "Unicode characters" being
somehow different from "normal text" or something, and being something
that you need to be careful of. It's not that there are some
characters that behave nicely, and then other ones ("Unicode" ones)
that don't.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
