On Sun, Mar 20, 2016 at 3:12 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Steven D'Aprano <st...@pearwood.info>:
>
>> On Sun, 20 Mar 2016 02:02 am, Marko Rauhamaa wrote:
>>> Yes, but UTF-16 produces 16-bit values that are outside Unicode.
>>
>> Show me.
>>
>> Before you answer, if your answer is "surrogate pairs", that is
>> incorrect. Surrogate pairs is how UTF-16 encodes astral characters.
>
> UTF-16 inputs a Unicode stream and produces a stream of 16-bit numbers.
> Thus, the output of UTF-16 is not Unicode.

Then UTF-16 produces 16-bit values that have nothing whatsoever to do
with Unicode. Is that what you're saying? If so, you're correct;
UTF-16LE produces two bytes to represent every BMP character and four
bytes to represent every non-BMP character, and those bytes are not
themselves Unicode.
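A quick sketch in Python makes the byte counts concrete. The helper
name utf16le_units is made up here purely for illustration; it only
uses str.encode with the standard "utf-16-le" codec and splits the
result into 16-bit code units.

```python
def utf16le_units(text):
    """Return the 16-bit code units UTF-16LE produces for text.

    Hypothetical helper for illustration only.
    """
    data = text.encode("utf-16-le")
    # Each code unit is two bytes, least significant byte first.
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


bmp_char = "A"              # U+0041, inside the BMP
astral_char = "\U0001F600"  # U+1F600, outside the BMP

bmp_bytes = bmp_char.encode("utf-16-le")        # 2 bytes
astral_bytes = astral_char.encode("utf-16-le")  # 4 bytes

bmp_units = utf16le_units(bmp_char)        # one code unit
astral_units = utf16le_units(astral_char)  # a surrogate pair

print(len(bmp_bytes), [hex(u) for u in bmp_units])
print(len(astral_bytes), [hex(u) for u in astral_units])
```

The two code units for the astral character fall in the surrogate
range (0xD800 to 0xDFFF), which is how UTF-16 encodes code points above
U+FFFF.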

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
