[ 
https://issues.apache.org/jira/browse/CALCITE-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042662#comment-18042662
 ] 

Mihai Budiu commented on CALCITE-7270:
--------------------------------------

It's complicated. Here's what I learned:
 * there are three kinds of division to consider: integer, numeric, and 
floating point
 ** some databases have different operators for integer division: e.g. DIV in 
MySQL or // in duck DB
 * for integer division
 ** generally databases use checked division and complain on overflow (INT_MIN 
/ -1)
 *** however, some databases like sqlite will convert dynamically the result on 
overflow to a wider data type 
 *** as a conclusion, checked and unchecked division are the same operation, 
and there is no unchecked division
 ** division by 0 can produce:
 *** NULL
 *** infinity (duckdb)
 *** exception
 * for numeric (DECIMAL) division
 ** databases will complain on overflow produced by division, so this division 
is always checked (there is no unchecked division)
 ** division by 0 is handled can produce
 *** NULL
 *** infinity (postgres DECIMAL has infinity values)
 *** exception
 * floating point division
 ** many databases will convert Infinity or NaN to NULL, no matter how they are 
produced
 ** some databases (e.g. DuckDB) will actually produce compliant IEEE 754 
results

Some of the existing behaviors cannot even be emulated at all using Calcite, 
because Calcite uses static typing while some databases use dynamic typing 
(sqlite), changing the result type depending on the actual values of the 
operands.

Some databases (postgres) use types (DECIMAL) that cannot be precisely modeled 
in Calcite.

So the attempt to implement accurately even the division operator in Calcite is 
non-trivial. We don't even have conformance properties precise enough to 
distinguish all these cases. 

Should there be different operators to represent division for different types? 
There is already an integer division (DIVIDE_INTEGER) operator in Calcite, but 
it is only used for some temporary expressions generated internally. If 
operators could be parameterized we could have different kinds operators 
(INTEGER, NUMERIC, FLOATING_POINT) with flags to adjust the behavior (throw, 
NULL, infinity), but I am not sure that is allowed. The sqlToRelConverter would 
have to switch the generic division operator from the parser to the correct one 
based on types and conformance.

Any suggestions are appreciated.

 

> Add support for a DIVIDE_0_NULL operation
> -----------------------------------------
>
>                 Key: CALCITE-7270
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7270
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Mihai Budiu
>            Priority: Minor
>              Labels: pull-request-available
>
> SQL dialects treat DIVISION in different ways: in some dialects division by 0 
> produces an exception, while in other dialects (e.q. sqlite) it produces NULL.
> These are really two different operators with the same name. I think they 
> should be represented by different operations in the IR: e.g., DIVIDE and 
> SAFE_DIVIDE.
> This is reminiscent to the CHECKED arithmetic operations introduced in 
> [CALCITE-6685], and the solution could be similar: in that issue whether an 
> ADD operation is checked or unchecked is a property of SqlConformance. 
> Similarly, we could add a boolean flag to SqlConformance indicating whether 
> division is safe or not and a visitor that will rewrite DIVIDE to SAFE_DIVIDE 
> when necessary.
> Spawn from [CALCITE-7145]
> I am happy to assign this to myself if people agree.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to