Serhiy Storchaka added the comment:

After investigating the problem more deeply, I see that a new parameter is not
needed. RFC 4627 does not make exceptions for the range 0xD800-0xDFFF, so the
decoder must accept lone surrogates, both escaped and unescaped. Non-BMP
characters may be represented as an escaped surrogate pair, so an escaped
surrogate pair may be decoded as a non-BMP character, while an unescaped
surrogate pair shouldn't be.
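
The three cases above can be illustrated roughly as follows. This is only a
sketch, assuming a Python 3 interpreter whose json module already has this
behavior; the variable names are made up for the example.

```python
import json

# Escaped lone surrogate: accepted and kept as a single code point.
lone = json.loads('"\\ud834"')

# Escaped surrogate pair: combined into one non-BMP character (U+1D120).
pair = json.loads('"\\ud834\\udd20"')

# Unescaped surrogate pair: passed through as two separate code points,
# not combined into a non-BMP character.
raw = json.loads('"\ud834\udd20"')

print(ascii(lone), ascii(pair), ascii(raw), len(raw))
```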

Here is a patch with which the JSON decoder accepts encoded lone surrogates. It
also fixes a bug where the Python implementation decodes "\\ud834\\u0079x" as
"\U0001d179".
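
A possible regression check for that bug, written as a sketch rather than as
the test from the patch. It assumes Python 3 with the fix applied, and it
relies on py_scanstring, an internal (undocumented) name in CPython's
json.decoder module for the pure-Python string scanner.

```python
import json
from json.decoder import py_scanstring  # pure-Python scanner (internal name)

doc = '"\\ud834\\u0079x"'

# \ud834 is a high surrogate, but \u0079 ('y') is not a low surrogate,
# so the two escapes must be decoded independently of each other.
expected = '\ud834' + 'yx'

# The second argument is the index just past the opening quote.
text, _end = py_scanstring(doc, 1)

assert text == expected
# json.loads uses the C scanner when it is available; it should agree.
assert json.loads(doc) == expected
```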

----------
keywords: +patch
stage: needs patch -> patch review
title: Add a string error handler to JSON encoder/decoder -> JSON should accept 
lone surrogates
type: enhancement -> behavior
versions: +Python 2.7, Python 3.3
Added file: http://bugs.python.org/file30130/json_decode_lone_surrogates.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17906>
_______________________________________