After some thinking, it seems that the float is the only case where this "loss
of precision" happens.
With the current implementation of `JSONEncoder.default` in the standard module,
I believe I can serialize pretty much everything else.
The "composite" types (i.e. the custom types in the sense of the JSON spec) can
be serialized either as a special string or as a map, with the logic implemented
on both ends at the application level.
The only possible suspects to suffer from the current `JSONEncoder.default` are
the "scalar" types in JSON, i.e. integer, float and boolean (null is in this
sense "equivalent" to boolean). The boolean should not need any custom encoder,
as it can only be represented by one of a well-defined set of representations.
The integer might suffer a fate similar to the float's if Python had no native
support for big ints. Since it does, I can do this and it works as expected:
```
json:orig = {"val": 10000000000000000000000000000000000000000000000000000000000001}
json:pyth = {'val': 10000000000000000000000000000000000000000000000000000000000001}
json:seri = {"val": 10000000000000000000000000000000000000000000000000000000000001}
```
This is an integer value far exceeding what fits into a native integer on a
64-bit CPU architecture.
```
In [12]: hex(10000000000000000000000000000000000000000000000000000000000001)
Out[12]: '0x63917877cec0556b21269d695bdcbf7a87aa000000000000001'
```
So with an integer, Python, thanks to its internal handling, parses the int
correctly and "silently" upgrades it to big int, so it does not lose the
precision (or bit-to-bit/byte-to-byte accuracy).
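The integer round trip described above can be reproduced with a minimal sketch like the following. The variable names are mine, chosen only for illustration; the calls are plain `json.loads` and `json.dumps` with default arguments.

```python
import json

# An integer literal far beyond the 64-bit range, as in the example above.
orig = '{"val": 10000000000000000000000000000000000000000000000000000000000001}'

# The decoder hands the digits to int(), which is arbitrary precision in Python.
parsed = json.loads(orig)

# The encoder writes the int back out digit for digit.
serialized = json.dumps(parsed)

print(type(parsed["val"]))
print(serialized == orig)
```

No custom encoder or decoder hook is involved here, which is the point: the big int case needs no help from the user.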
The only type "left in the dark" is the float. It does have an equivalent of
the integer's big int (`decimal.Decimal`), but it is not applied automatically,
which is perfectly reasonable, because it would involve a custom type, and
probably not many would want/need that.
On the other hand, being aware of the problem, the decoder offers the
`parse_float` keyword argument, which can conveniently be set to
`decimal.Decimal` if the user needs an equivalent of the big int for the float.
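To make the decoder side concrete, here is a small sketch comparing the default float parsing with `parse_float=decimal.Decimal`. The input value and variable names are made up for illustration.

```python
import decimal
import json

# A number with more significant digits than a binary double can hold.
text = '{"val": 1.00000000000000000000000000001}'

# Default behaviour: the digits are parsed with float(), so the tail is lost.
as_float = json.loads(text)

# With parse_float, the decoder passes the raw digit string to Decimal instead.
as_decimal = json.loads(text, parse_float=decimal.Decimal)

print(repr(as_float["val"]))
print(repr(as_decimal["val"]))
```

Any callable that accepts the digit string can be passed as `parse_float`, so the user is not tied to `Decimal`.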
So far this also seems well thought out, because it shows that the decoder (or
rather its implementer) was well aware of the float properties in JSON input
and wanted to give users a way to handle them in their own way. It looks better
than simplejson's `use_decimal`, because that one implies that only one
particular type can be used, while the standard module leaves the choice to the
user. On the other hand, in order to implement it efficiently, both the
standard module and simplejson made this option an explicit argument which only
concerns the float type, so it does not need any "generic raw" decoder
infrastructure to support it.
So far, the way the standard module handles that makes perfect sense.
Now for the encoder part. simplejson got away with `use_decimal` again, because
it allowed `Decimal` as the only option. The standard module would need a way
to identify the custom codec for the float in order to serialize it "properly".
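To illustrate why the encoder side is the hard part, here is a sketch of the two things a custom `default` can return today for a `Decimal`. The class names `StrEncoder` and `FloatEncoder` are hypothetical and exist only for this example.

```python
import decimal
import json

value = decimal.Decimal("1.00000000000000000000000000001")

class StrEncoder(json.JSONEncoder):
    """Hypothetical encoder: returns the digits as a str."""
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return str(o)  # all digits survive, but as a quoted JSON string
        return super().default(o)

class FloatEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts to a binary float."""
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return float(o)  # a JSON number, but the extra digits are lost
        return super().default(o)

as_string = json.dumps({"val": value}, cls=StrEncoder)
as_float = json.dumps({"val": value}, cls=FloatEncoder)

print(as_string)
print(as_float)
```

Neither result is the unquoted, full-precision number that was in the input, which is what a "raw" output path would have to provide.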
I can see two ways out of it:
1) The standard module could implement something like a `dump_float` keyword
argument in its `dump`, which would allow the user to specify which custom type
they used for the float in the load. The standard encoder would then take note
of that type and honor the string representation of such an object as the _raw_
output, either when it converts the object internally (possibly by doing
something like `str(o)`), or when the custom implementation of
`JSONEncoder.default` returns the string.
2) It could implement some specific semantics in the handling of the
`JSONEncoder.default` output, which would allow the user to signal to the
underlying layer that the custom encoder needs to write "raw" data to the output
stream, without the need for a keyword argument. Using a `bytes` object could be
that trigger:
```
import decimal
import json

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            # Proposed semantics: bytes would be written to the output as-is.
            return str(o).encode()
        return super().default(o)
```
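For reference, this is only a proposal: as far as I can tell, the current `json` module gives `bytes` no special treatment, so the returned value goes back through `default` and is rejected. A small sketch to check that (the `outcome` variable is just for the demonstration):

```python
import decimal
import json

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return str(o).encode()  # bytes, the proposed "raw" marker
        return super().default(o)

try:
    json.dumps({"val": decimal.Decimal("1.5")}, cls=DecimalEncoder)
    outcome = "serialized"
except TypeError as exc:
    # bytes is not a serializable type for the current encoder.
    outcome = f"TypeError: {exc}"

print(outcome)
```

So `bytes` is free to be given a new meaning in this position without changing any output that works today.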
Any thoughts on this?