Hello,

We had a related discussion some months ago about how to handle division by zero. I created a ticket, which includes some links to discussions on the mailing list: https://issues.apache.org/jira/browse/CALCITE-7264

I would appreciate it if Calcite could be configured so that division by zero is a valid expression evaluating to NULL. It would be nice to make the design general enough to allow defining the behavior of errors/exceptions either globally or at the statement level, depending on the expression that causes the error.

Kind regards,
Thomas

On 4/24/26 7:15 PM, Mihai Budiu <[email protected]> wrote:
To a large degree this is a problem of the runtime. Calcite comes with a runtime, but you are not obliged to use it, you can implement your own runtime which behaves differently.

But I think your question is about language features supporting error handling - and these do pertain to the compiler. Semantics is very subtle, and designing such features of a programming language is a delicate act. That's why in general I prefer to avoid designing new features in favor of reusing existing time-tested designs.

What is missing from this description is a short survey of how other mainstream databases and streaming systems handle errors. Ideally, we can adopt a good existing design, or perhaps a mix of features from existing designs.

Mihai

________________________________
From: FeatZhang <[email protected]>
Sent: Friday, April 24, 2026 4:55 AM
To: [email protected] <[email protected]>
Subject: [DISCUSS] SQL-level error handling semantics in Calcite

Hi Calcite devs,

I would like to start a discussion around *error handling semantics at SQL statement level* in Calcite.

------------------------------
1. Background

In modern data processing systems (both batch and streaming), handling malformed or partially invalid data is a common requirement. Typical issues include:

- malformed JSON / structured payloads
- type mismatches during casting
- schema evolution inconsistencies
- runtime exceptions in user-defined functions

Currently, Calcite provides limited support for error handling via:

- expression-level constructs (e.g., TRY_CAST)
- NULL propagation semantics

However, these approaches are limited to *expression-level error tolerance*.

------------------------------
2. Problem Statement

There is currently no way to express *statement-level error handling semantics* in SQL within Calcite.
Specifically:

2.1 No abstraction beyond expression level

Error handling must be embedded into individual expressions:

    TRY_CAST(col AS INT)

This leads to:

- verbose queries
- duplicated logic
- lack of composability

------------------------------
2.2 No structured error propagation

There is no way to:

- capture error context
- classify errors
- propagate error metadata alongside query execution

------------------------------
2.3 No extensibility for downstream systems

Many systems built on Calcite (e.g., streaming engines, data processing frameworks) require more advanced error handling capabilities, but currently must implement them outside SQL.

------------------------------
3. Discussion Proposal

I would like to explore whether Calcite should support a more general abstraction for error handling at SQL level. Some possible directions:

------------------------------
Option A: Statement-level TRY semantics

    SELECT * FROM TRY(source_table)

Semantics:

- failed records are skipped or handled based on policy

------------------------------
Option B: Error handling clause (conceptual)

    SELECT * FROM source_table HANDLE ERRORS WITH <policy>

Where <policy> could define:

- ignore
- nullify
- propagate
- custom handling

------------------------------
Option C: Error-aware relational operator

Introduce a logical abstraction such as:

    ErrorHandlingRelNode

Which could:

- wrap existing RelNodes
- attach error metadata
- allow downstream systems to interpret error semantics

------------------------------
4. Key Questions

I would appreciate feedback on:

1. Should Calcite support error handling beyond expression-level?
2. Is there prior discussion or design work in this area?
3. Would a logical operator (instead of SQL syntax) be more appropriate?
4. How should this interact with relational algebra assumptions (single output, determinism)?
5. Should Calcite remain minimal and leave this entirely to downstream systems?

------------------------------
5. Motivation

The goal is not to introduce engine-specific features, but to explore whether Calcite should provide a *generic abstraction layer* for error handling that downstream systems can leverage.

------------------------------
Closing

I am interested in hearing thoughts on whether this direction aligns with Calcite's design goals, and whether such an extension would be considered in scope.

Thanks!
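
To make the policy names under Option B a bit more concrete, here is one possible reading of "ignore", "nullify" and "propagate" as row-level behaviour. This is only a language-neutral sketch written in Python, not Calcite code and not a proposed API: `Policy`, `apply_with_policy` and the choice of which exceptions count as row errors are all hypothetical, and "custom handling" is left out.

```python
from enum import Enum
from typing import Any, Callable, Iterable, Iterator


class Policy(Enum):
    """Hypothetical policies, named after the list under Option B."""
    IGNORE = "ignore"        # drop the row whose evaluation failed
    NULLIFY = "nullify"      # keep the row, emit None (standing in for SQL NULL)
    PROPAGATE = "propagate"  # re-raise, so the whole statement fails


def apply_with_policy(
    rows: Iterable[Any],
    expr: Callable[[Any], Any],
    policy: Policy,
) -> Iterator[Any]:
    """Evaluate expr on every row, handling per-row failures according to policy.

    Assumption: only ValueError and ZeroDivisionError are treated as row errors.
    """
    for row in rows:
        try:
            yield expr(row)
        except (ValueError, ZeroDivisionError):
            if policy is Policy.IGNORE:
                continue
            if policy is Policy.NULLIFY:
                yield None
                continue
            raise  # Policy.PROPAGATE


# One malformed value ("abc") in otherwise castable input.
rows = ["1", "2", "abc", "4"]
ignored = list(apply_with_policy(rows, int, Policy.IGNORE))
nullified = list(apply_with_policy(rows, int, Policy.NULLIFY))
```

Under this reading, "nullify" applied to a cast behaves like wrapping the expression in TRY_CAST, while "ignore" changes the number of output rows, which is where the question about relational algebra assumptions comes in.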
