Mihai Budiu created CALCITE-7092:
------------------------------------

             Summary: DPhyp implementation assertion failure
                 Key: CALCITE-7092
                 URL: https://issues.apache.org/jira/browse/CALCITE-7092
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.40.0
            Reporter: Mihai Budiu

This is about the hypergraph-based join optimization algorithm introduced in [CALCITE-6846] and used in the optimization rule HYPER_GRAPH_OPTIMIZE.

I have encountered the following error:

{code}
set type is RecordType(DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0, BIGINT NOT NULL EXPR$1) NOT NULL
expression type is RecordType(BIGINT NOT NULL EXPR$1, DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0) NOT NULL
set is rel#212:HyperGraph.(input#0=HepRelVertex#216,input#1=HepRelVertex#206,edges={0}——[INNER, true]——{1})
expression is LogicalJoin(condition=[true], joinType=[inner])
  LogicalProject(EXPR$1=[$0])
    LogicalAggregate(group=[{}], EXPR$1=[COUNT($0)])
      LogicalAggregate(group=[{0}])
        LogicalProject(id=[$1])
          LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
            LogicalTableScan(table=[[schema, t]])
  LogicalProject(EXPR$2=[$0], EXPR$0=[$1])
    HyperGraph(edges=[{0}——[INNER, true]——{1}])
      LogicalAggregate(group=[{}], EXPR$2=[SUM($2)])
        LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
          LogicalTableScan(table=[[schema, t]])
      LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
        LogicalAggregate(group=[{0}])
          LogicalProject($f0=[$0])
            LogicalProject($f0=[CASE(=($1, 1), $0, null:INTEGER)], id=[$0], $f2=[CAST(CASE(AND(=($1, 0), $4), CAST(ROUND($2, 0)):DECIMAL(5, 2), 0.00:DECIMAL(5, 2))):DECIMAL(5, 2)])
              LogicalTableScan(table=[[schema, t]])
Type mismatch:
rowtype of original rel: RecordType(DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0, BIGINT NOT NULL EXPR$1) NOT NULL
rowtype of new rel: RecordType(BIGINT NOT NULL EXPR$1, DECIMAL(38, 2) EXPR$2, BIGINT NOT NULL EXPR$0) NOT NULL
Difference:
EXPR$2: DECIMAL(38, 2) -> BIGINT NOT NULL
EXPR$0: BIGINT NOT NULL -> DECIMAL(38, 2)
{code}

This happens in our compiler after the query has gone through several other optimizations, and with a different cost model for DPhyp, so it may be tricky to post an exact reproduction. A query that can trigger it is:

{code:sql}
CREATE TABLE T(id INT, od INT, val DECIMAL(5, 2), ct INT, e BOOLEAN);

SELECT
  COUNT(DISTINCT CASE WHEN od = 1 THEN id END),
  COUNT(DISTINCT id),
  SUM(CASE WHEN (od = 0 AND e) THEN ROUND(val, 0) ELSE 0.0 END)
FROM T
{code}

It looks like the code that permutes the fields of the result to restore the original output order is incorrect: the new expression produces the fields as (EXPR$1, EXPR$2, EXPR$0), whereas the original rel produces (EXPR$2, EXPR$0, EXPR$1).
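
To make the expected behavior concrete, here is a minimal Java sketch of the permutation that would restore the original field order for the row types in the error above. This is not Calcite's code: the class {{FieldOrderSketch}} and its methods are hypothetical, and matching fields by name is an assumption made only for illustration (it relies on field names being unique).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not part of Calcite. */
final class FieldOrderSketch {

  /**
   * For each field of the original row type, finds its position in the
   * reordered row type. Entry i of the result is the index of the reordered
   * field that must be read to produce output field i.
   * Assumes field names are unique within each list.
   */
  static int[] restoreOrder(List<String> original, List<String> reordered) {
    if (original.size() != reordered.size()) {
      throw new IllegalArgumentException("field counts differ");
    }
    int[] projection = new int[original.size()];
    for (int i = 0; i < original.size(); i++) {
      int source = reordered.indexOf(original.get(i));
      if (source < 0) {
        throw new IllegalArgumentException("missing field: " + original.get(i));
      }
      projection[i] = source;
    }
    return projection;
  }

  /** Applies the projection: output position i reads row.get(projection[i]). */
  static <T> List<T> apply(int[] projection, List<T> row) {
    List<T> out = new ArrayList<>(projection.length);
    for (int source : projection) {
      out.add(row.get(source));
    }
    return out;
  }

  public static void main(String[] args) {
    // Field orders taken from the "Type mismatch" message above.
    List<String> original = List.of("EXPR$2", "EXPR$0", "EXPR$1");
    List<String> reordered = List.of("EXPR$1", "EXPR$2", "EXPR$0");

    int[] projection = restoreOrder(original, reordered);
    System.out.println(Arrays.toString(projection));   // [1, 2, 0]
    System.out.println(apply(projection, reordered));  // [EXPR$2, EXPR$0, EXPR$1]
  }
}
```

Under these assumptions, a projection of fields [1, 2, 0] on top of the new join would bring its row type back in line with the original rel.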