2011/6/22 Saul Spatz :
> Thanks. I agree with you about the generator. Using your first suggestion,
> code points above U+FFFF get separated into the two "surrogate pair"
> characters from UTF-16. So instead of U+10FFFF I get U+DBFF and U+DFFF.
> --
http://mail.python.org/mailman/listinfo/python-list
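
The surrogate arithmetic behind the quoted observation can be sketched as follows. This is only an illustration, not code from the thread: the helper name combine_surrogates is hypothetical, and it assumes the input is a sequence of integer UTF-16 code units such as those a narrow Python 2 build exposes through ord().

```python
def combine_surrogates(units):
    """Yield code points from an iterable of UTF-16 code units (ints).

    Hypothetical helper: a high surrogate (U+D800..U+DBFF) followed by a
    low surrogate (U+DC00..U+DFFF) is merged into one code point above
    U+FFFF. Unpaired surrogates are passed through unchanged.
    """
    pending = None  # high surrogate still waiting for its partner
    for unit in units:
        if pending is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # Standard UTF-16 decoding formula.
                yield 0x10000 + ((pending - 0xD800) << 10) + (unit - 0xDC00)
                pending = None
                continue
            yield pending  # unpaired high surrogate
            pending = None
        if 0xD800 <= unit <= 0xDBFF:
            pending = unit
        else:
            yield unit
    if pending is not None:
        yield pending  # trailing unpaired high surrogate


# The pair from the message above maps back to the original code point.
print(["U+{:04X}".format(cp) for cp in combine_surrogates([0xDBFF, 0xDFFF])])
```

On Python 3.3 and later, strings are indexed by code point, so this step should not be needed there.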
On 22 June, 16:07, Saul Spatz wrote:
Thanks very much. This is the elegant kind of solution I was looking for. I
had hoped there was a way to do it without even addressing the matter of
surrogates, but apparently not. The reason I don't like this is that it
depends on knowing that python internally stores strings in UTF-16. I e
Thanks. I agree with you about the generator. Using your first suggestion,
code points above U+FFFF get separated into the two "surrogate pair"
characters from UTF-16. So instead of U+10FFFF I get U+DBFF and U+DFFF.
--
http://mail.python.org/mailman/listinfo/python-list
That seems correct to me.
>>> '\\u{:04x}'.format(ord(u'é'))
\u00e9
>>> '\\U{:08x}'.format(ord(u'é'))
\U000000e9
>>>
The \U escape needs all eight hex digits, because
>>> u'\U00e9'
File "", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 0-5: end of string in escape sequence
>>> u'\U000000e9'
é
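
The two format strings above could be folded into one helper that picks the escape form by the size of the code point. A minimal sketch; the name format_escape is made up for illustration and does not come from the thread.

```python
def format_escape(codepoint):
    """Return the Python escape sequence for an integer code point.

    Hypothetical helper: \\uXXXX (four hex digits) is enough up to U+FFFF;
    anything larger needs \\UXXXXXXXX (exactly eight hex digits).
    """
    if codepoint <= 0xFFFF:
        return '\\u{:04x}'.format(codepoint)
    return '\\U{:08x}'.format(codepoint)


print(format_escape(ord(u'é')))   # BMP character, short form
print(format_escape(0x10FFFF))    # beyond the BMP, long form
```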
Saul Spatz wrote:
> Hi,
>
> I'm just starting to learn a bit about Unicode. I want to be able to read
> a utf-8 encoded file, and print out the codepoints it encodes. After many
> false starts, here's a script that seems to work, but it strikes me as
> awfully awkward and unpythonic. Have you a
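
For reference, one way the task described in the question might be approached is sketched below. It is not the script from the original post, which is cut off here. The function name iter_codepoints is hypothetical, and the sketch assumes Python 3.3 or later, where iterating over a string yields whole code points rather than surrogate halves.

```python
import io


def iter_codepoints(path, encoding="utf-8"):
    """Yield the integer code point of every character in a text file.

    Hypothetical helper. Assumes Python 3.3+; on a narrow Python 2 build
    characters above U+FFFF would come out as two surrogates instead.
    """
    with io.open(path, encoding=encoding) as handle:
        for line in handle:
            for ch in line:
                yield ord(ch)


def print_codepoints(path):
    """Print one U+XXXX label per character in the file."""
    for cp in iter_codepoints(path):
        print("U+{:04X}".format(cp))
```

Calling print_codepoints("some_file.txt") (the file name is a placeholder) would print one label per character, line terminators included.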
On Wed, Jun 22, 2011 at 1:37 PM, Saul Spatz wrote:
> Hi,
>
> I'm just starting to learn a bit about Unicode. I want to be able to read a
> utf-8 encoded file, and print out the codepoints it encodes. After many
> false starts, here's a script that seems to work, but it strikes me as
> awfully