Darpan Lunagariya (e6data computing) created CALCITE-7631:
-------------------------------------------------------------

             Summary: Introduce a composable RexImplementorTable SPI for 
operator code generation
                 Key: CALCITE-7631
                 URL: https://issues.apache.org/jira/browse/CALCITE-7631
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.42.0
            Reporter: Darpan Lunagariya (e6data computing)


h2. Problem

Enumerable code generation resolves the implementor for every operator through 
the {{RexImpTable.INSTANCE}} singleton. There is exactly one extension hook: 
{{get(SqlOperator)}} consults {{ImplementableFunction}} when the operator is a 
{{SqlUserDefinedFunction}} (a function registered in a schema). For any *other* 
operator (a custom or dialect operator that an adapter registers through a 
{{SqlOperatorTable}}), there is no way to supply a code-generation implementor:

* the backing maps are {{private final ImmutableMap}};
* the {{Builder}} / {{AbstractBuilder}} and their {{define*}} methods are 
private;
* every consumer references {{RexImpTable.INSTANCE}} directly, most importantly 
{{RexToLixTranslator.visitCall}}, but also {{RexExecutorImpl}}, 
{{EnumerableAggregate}}, {{EnumerableMatch}}, {{EnumerableTableFunctionScan}}.

Practical consequences for such an operator:

* it throws {{"cannot translate call"}} during code generation; and
* it cannot be constant-folded, because {{ReduceExpressionsRule}} runs through 
{{RexExecutorImpl}}, which compiles the whole batch and fails if a single 
operator has no implementor.

h2. Why this matters

{{RexImpTable}} is the only registry of its kind that is a hard, non-composable 
singleton. Every other piece of pluggable behaviour in Calcite is an interface 
(or builder) with a default, obtained or composed at configuration time:

* operators — {{SqlOperatorTable}} + {{SqlOperatorTables.chain(...)}}
* type system — {{RelDataTypeSystem}} (+ {{RelDataTypeSystem.DEFAULT}})
* metadata — {{RelMetadataProvider}} / {{ChainedRelMetadataProvider}}
* cost — {{RelOptCostFactory}}
* constant executor — {{RexExecutor}} (set on the planner)

So an adapter can already _define_ its operators (compose a 
{{SqlOperatorTable}}) and _validate_ them, but it cannot _generate code_ for 
them. The validation half of "defining a function" is open; the code-generation 
half is sealed. Closing that asymmetry is the goal.

h2. Proposal

Make the implementor table a first-class, composable SPI, the code-generation 
counterpart of {{SqlOperatorTable}}, *without changing default behaviour*.

# Extract an interface {{RexImplementorTable}} with the existing lookups 
({{get}} for scalar / aggregate / match / windowed-table-function operators). 
{{RexImpTable}} becomes its default implementation; {{RexImpTable.INSTANCE}} 
remains, and a new {{RexImpTable.instance()}} accessor returns the same default.
# Add {{RexImplementorTables.chain(...)}} (mirroring 
{{SqlOperatorTables.chain}}): consult each table in turn and return the first 
non-null result, so a table placed earlier in the chain overrides later ones.
# Thread an injectable {{RexImplementorTable}} (defaulting to the built-ins) 
through the code-generation entry points — {{RexToLixTranslator}} (new 
overloads of {{translateProjects}} / {{translateCondition}}) and 
{{RexExecutorImpl}} (for constant folding) — sourced the same way 
{{conformance}} already travels into {{EnumerableRelImplementor}}.

An adapter then supplies implementors for its own operators by composing 
{{RexImplementorTables.chain(myTable, RexImpTable.instance())}} — exactly 
parallel to how it composes its {{SqlOperatorTable}} today.
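
The chaining semantics can be illustrated with a toy model. This is only a sketch of "first non-null wins", not the proposed Calcite code: {{ChainSketch}}, its nested {{ImplementorTable}} interface, the {{of}} helper, and the use of plain strings for operators and implementors are all hypothetical stand-ins for {{RexImplementorTable}} and {{RexImplementorTables.chain}}.

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Toy model of the proposed chaining; none of these types are Calcite's. */
public class ChainSketch {
  /** Hypothetical stand-in for RexImplementorTable: returns null on a miss. */
  interface ImplementorTable {
    String get(String operatorName);
  }

  /** Hypothetical stand-in for RexImplementorTables.chain. */
  static ImplementorTable chain(ImplementorTable... tables) {
    final List<ImplementorTable> list = Arrays.asList(tables);
    return operatorName -> {
      for (ImplementorTable table : list) {
        String implementor = table.get(operatorName);
        if (implementor != null) {
          return implementor; // first non-null result wins
        }
      }
      return null; // no table in the chain knows this operator
    };
  }

  /** Helper (hypothetical): a table backed by a map. */
  static ImplementorTable of(Map<String, String> map) {
    return map::get;
  }

  public static void main(String[] args) {
    ImplementorTable builtIns =
        of(Map.of("PLUS", "built-in PLUS", "UPPER", "built-in UPPER"));
    ImplementorTable adapter =
        of(Map.of("MY_OP", "adapter MY_OP", "UPPER", "adapter UPPER"));

    // The adapter table comes first, so it can add and override operators.
    ImplementorTable table = chain(adapter, builtIns);

    System.out.println(table.get("MY_OP"));   // adapter MY_OP
    System.out.println(table.get("UPPER"));   // adapter UPPER (override)
    System.out.println(table.get("PLUS"));    // built-in PLUS (fall-through)
    System.out.println(table.get("UNKNOWN")); // null
  }
}
{code}

Placing the built-ins last keeps default resolution intact for every operator the adapter does not mention.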

h3. Backward compatibility

* {{RexImpTable.INSTANCE}} and all existing public methods remain; the default 
resolution path is unchanged.
* New table-carrying overloads are added; the older overloads are deprecated 
and delegate to them.
* The match / windowed-table-function lookups change from "throw on miss" to 
"return {{null}} on miss" so a chained table can fall through; call sites that 
require an implementor preserve the same failure via an explicit check.
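
The "return {{null}} on miss, check at the call site" pattern from the last bullet might look roughly like the following. It is a self-contained sketch: {{RequireImplementorSketch}}, {{requireImplementor}}, the string-based {{ImplementorTable}}, and the exception type and message are assumptions made for illustration, not the actual Calcite call sites or error text.

{code:java}
/** Toy model of a call site that still fails when no implementor exists. */
public class RequireImplementorSketch {
  /** Hypothetical stand-in for RexImplementorTable: returns null on a miss. */
  interface ImplementorTable {
    String get(String operatorName);
  }

  /**
   * The lookup no longer throws, so a call site that needs an implementor
   * performs the check itself. Exception type and message are assumptions.
   */
  static String requireImplementor(ImplementorTable table, String operatorName) {
    String implementor = table.get(operatorName);
    if (implementor == null) {
      throw new IllegalStateException(
          "no implementor for operator " + operatorName);
    }
    return implementor;
  }

  public static void main(String[] args) {
    ImplementorTable table = name -> "MY_OP".equals(name) ? "adapter MY_OP" : null;

    System.out.println(requireImplementor(table, "MY_OP")); // adapter MY_OP
    try {
      requireImplementor(table, "OTHER_OP");
    } catch (IllegalStateException e) {
      // Same observable failure as the old "throw on miss" lookup.
      System.out.println(e.getMessage()); // no implementor for operator OTHER_OP
    }
  }
}
{code}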

h3. Example

{code:java}
RexImplementorTable table =
    RexImplementorTables.chain(myAdapterImplementors, RexImpTable.instance());

// constant folding
planner.setExecutor(new RexExecutorImpl(dataContext, table));
{code}

h2. Scope / non-goals

* Enumerable-engine _execution_ of custom *aggregates* additionally needs the 
table at planning time (the {{EnumerableAggregate}} constructor pre-checks 
operator support), which can be a follow-up; the SPI itself already covers 
aggregate implementors.
* The operator *catalog* ({{SqlLibrary}} / {{SqlLibraryOperators}}) is 
unchanged. That is first-party content behind the already-open 
{{SqlOperatorTable}} and is intentionally out of scope.



