[C++] Decimal arithmetic edge cases

David Li Thu, 30 Sep 2021 06:42:21 -0700

Hello all,

While looking at decimal arithmetic kernels in ARROW-13130, the question of 
what to do about overflow came up.


Currently, our rules are based on Redshift [1], except we raise an error if we 
exceed the maximum precision (Redshift's docs implies it saturates instead). 
Hence, we can always add/subtract/etc. without checking for overflow, but we 
can't do things like add two decimal256(76, 0) since there's no more precision 
available.

If we were to support this last case, what would people expect the unchecked 
arithmetic kernels to do on overflow? For integers, we wrap around, but this 
doesn't really make sense for decimals; we could also return nulls, or just 
always raise an error (this seems the most reasonable to me). Any thoughts?

For reference, for an unchecked add, currently we have:
"1" (decimal256(75, 0)) + "1" (decimal256(75, 0)) = "2" (decimal256(76, 0))
"1" (decimal128(38, 0)) + "1" (decimal128(38, 0)) = error (not enough precision)
"1" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (not enough precision)
"99...9 (76 digits)" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (not 
enough precision)

Arguably these last three cases should be:
"1" (decimal128(38, 0)) + "1" (decimal128(38, 0)) = "2" (decimal256(39, 0)) 
(promote to decimal256)
"1" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = "2" (decimal256(76, 0)) 
(saturate at max precision)
"99...9 (76 digits)" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error 
(overflow)

On a related note, you could also argue that we shouldn't increase the 
precision like this, though DBs other than Redshift also do this. Playing with 
DuckDB a bit, though, it doesn't match Redshift: addition/subtraction increase 
precision by 1 like Redshift does, but division results in a float, and 
multiplication only adds the input precisions together, while Redshift adds 1 
to the sum of the precisions. (That is, decimal128(3, 0) * decimal128(3, 0) is 
decimal128(7, 0) in Redshift/Arrow but decimal128(6, 0) in DuckDB.)

[1]: 
https://docs.aws.amazon.com/redshift/latest/dg/r_numeric_computations201.html

-David

[C++] Decimal arithmetic edge cases

Reply via email to