On Aug 12, 2019, at 15:18, Christopher Barker <[email protected]> wrote:
> 
> If that is the goal, then strings would need a hook, too, as Unicode allows 
> different normalized forms for some "characters" (see previous discussion for 
> this; I, frankly, don't quite "get" it).

Although normalization can be a problem, there’s a much simpler—and more 
common—issue: what to escape, and how to escape it.

For example, many pre-ES5 JS implementations escaped forward slashes. This is 
legal, but unnecessary, and Python’s json module doesn’t do it. And you can’t 
make it do so. So, you load a JSON document with the string “abc\/def”, you 
get the Python string “abc/def”, and when you dump it you get a JSON document 
with “abc/def”. JSON says the two documents are equivalent, but they obviously 
aren’t the same bytes.
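
A minimal sketch of that round trip, using only the stdlib json module (the 
variable names are just for illustration):

```python
import json

# JSON text as an older JS encoder might emit it: "abc\/def"
original = '"abc\\/def"'

value = json.loads(original)        # the escape is gone after parsing
round_tripped = json.dumps(value)   # json.dumps never re-escapes the slash

same_value = json.loads(round_tripped) == value   # equal as JSON values
same_bytes = round_tripped == original            # not equal as text

print(value, round_tripped, same_value, same_bytes)
```

Both documents parse to the same string, but the serialized text differs, and 
as far as I know there is no json.dumps argument that brings the backslash 
back.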

Similar issues include the case of hex letters in \u escapes, whether to use 
\u0008 instead of \b (and likewise for the other short escape sequences, 
though I think \b is the one that differs most often), whether to escape all 
non-ASCII (and what that means: Python doesn’t count \x7f as ASCII when 
deciding what to escape), whether to escape all non-BMP, whether to treat 
Unicode separators (or just the two that JS source doesn’t allow, U+2028 and 
U+2029) as control characters, etc. So, even if you only care about preserving 
the output of one specific library, and you know exactly what rule it uses for 
escaping, you still can’t reproduce it with Python’s json module.
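
To illustrate a few of these with the stdlib json module (a rough sketch; the 
sample strings are made up for the example):

```python
import json

# Three spellings of the same one-character string (U+00E9).
variants = [
    '"\\u00e9"',   # \u escape, lowercase hex
    '"\\u00E9"',   # \u escape, uppercase hex
    '"\u00e9"',    # the raw character, unescaped
]
values = [json.loads(v) for v in variants]

# The only escaping knob on dumps is ensure_ascii: all or nothing.
dumped_ascii = json.dumps(values[0])                     # lowercase \u escape
dumped_raw = json.dumps(values[0], ensure_ascii=False)   # raw character

# A document that spelled backspace as \u0008 comes back out as \b.
backspace = json.dumps(json.loads('"\\u0008"'))

# DEL (\x7f) is escaped under the default ensure_ascii=True.
delete = json.dumps("\x7f")

print(values, dumped_ascii, dumped_raw, backspace, delete)
```

All three variants load to the same Python string, so whichever spelling the 
source document used is lost by the time you dump.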

And really, the same is true for array and object whitespace. At least most 
libraries are consistent in how they use whitespace, and most rules can be 
covered by the simple separators hook, so if you know which library generated 
the document, you can usually write code to reproduce the whitespace. But if 
you have to work with the output of two different libraries? Or a library you 
don’t know? Or JSON edited by hand or by sed scripts that might not even be 
consistent?
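
A small sketch of the whitespace case, again with only the stdlib (the 
hand-edited document here is a made-up example):

```python
import json

data = {"a": [1, 2], "b": 3}

# Consistent styles are reproducible through the separators argument.
default_style = json.dumps(data)                          # ", " and ": "
compact_style = json.dumps(data, separators=(",", ":"))   # no whitespace

# An inconsistently spaced, hand-edited document with the same value.
hand_edited = '{"a":[1, 2],  "b" : 3}'
same_value = json.loads(hand_edited) == data

# Neither style above matches the hand-edited text.
reproduced = hand_edited in (default_style, compact_style)

print(default_style, compact_style, same_value, reproduced)
```

Since separators takes a single (item, key) pair for the whole document, no 
choice of it can reproduce text that mixes "," with ",  " and ":" with " : ".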
