[
https://issues.apache.org/jira/browse/CALCITE-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404193#comment-17404193
]
Vinayakumar B commented on CALCITE-4733:
----------------------------------------
We too have faced the same issue, while rewriting the query using materialized
views.
In our case, it was for TPCDS query #4, where one of the column values was
literal and it was exposed as a column with name *sale_type.*
MV Definition : With name *q04mvss*.
{noformat}
select c_customer_id customer_id
,c_first_name customer_first_name
,c_last_name customer_last_name
,c_preferred_cust_flag customer_preferred_cust_flag
,c_birth_country customer_birth_country
,c_login customer_login
,c_email_address customer_email_address
,d_year dyear
,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2)
year_total
,'s' sale_type
from customer
,store_sales
,date_dim
where c_customer_sk = ss_customer_sk
and ss_sold_date_sk = d_date_sk
group by c_customer_id
,c_first_name
,c_last_name
,c_preferred_cust_flag
,c_birth_country
,c_login
,c_email_address
,d_year;
{noformat}
*Expected after rewrite:*
-----------------------
{noformat}
SELECT "q04mvss"."customer_id", "q04mvss"."customer_first_name",
"q04mvss"."customer_last_name", "q04mvss"."customer_preferred_cust_flag",
"q04mvss"."customer_birth_country", "q04mvss"."customer_login",
"q04mvss"."customer_email_address", "q04mvss"."dyear",
CAST("q04mvss"."year_total" AS DECIMAL(19, 11)) AS "year_total", 's' AS
"sale_type"
FROM "tpcdsmv"."q04mvss" AS "q04mvss"
{noformat}
*Actual after rewrite*
--------------------
{noformat}
SELECT "q04mvss"."customer_id", "q04mvss"."customer_first_name",
"q04mvss"."customer_last_name", "q04mvss"."customer_preferred_cust_flag",
"q04mvss"."customer_birth_country", "q04mvss"."customer_login",
"q04mvss"."customer_email_address", "q04mvss"."dyear",
CAST("q04mvss"."year_total" AS DECIMAL(19, 11)) AS "year_total", 's' AS "$f9"
FROM "tpcdsmv"."q04mvss" AS "q04mvss"
{noformat}
i.e. Rewritten as _*{{'s' AS "$f9"}}*_ instead of _*{{'s' AS "sale_type"}}*_
As mentioned above, we could control the filed names in the root of the tree,
only if the rewrite happens at the root itself. But, in current case, above MV
is for one of WITH clauses of TPCDS Q04. Rewrite happens in part of the entire
query. Field names coming from MV itself are propogated properly, but the
literal column added (_'s' as `sale_type`_) will not retail the field name, and
rewritten as _*'$f9'*_.
Actual problem arised is because, references of this columns in conditions,
which are in non-rewritten-part of the query, still refers to name *sale_type*.
So while executing query fails saying, column cannot be resolved.
As [~julianhyde] mentioned, adding extra project just to rename columns would
be overhead. But, while adding a current project itself, fieldnames can be
passed, which are derived from expected rowType as below. This has solved
problem for us.
{noformat}
return relBuilder
- .project(rewrittenExprs)
+ .project(rewrittenExprs, topRowType.getFieldNames())
.convert(topRowType, false)
.build();
}
{noformat}
This will not add one-more project, but it names it in the current project
itself.
[~julianhyde] please take a look.
Thanks.
> MaterializedViewAggregateRule should not rewrite top level name alias in
> project.
> ---------------------------------------------------------------------------------
>
> Key: CALCITE-4733
> URL: https://issues.apache.org/jira/browse/CALCITE-4733
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.27.0
> Reporter: Enze Liu
> Priority: Minor
> Attachments: image-2021-08-13-14-06-36-712.png
>
>
> We use *MaterializedViewAggregateRule* to do optimize.
> the original sql is :
> {code:java}
> SELECT (intDiv(ts/1000, 60) * 60) * 1000 as t, source_idc, dest_idc,
> sum(jitter200)/sum(total) as `jitter200_rate` FROM metricsdb.test_table WHERE
> ts/1000 >= 1627874961 AND source_idc in ('xxx') AND dest_idc IN ('xxx') GROUP
> BY t,source_idc,dest_idc ORDER BY t
> {code}
> after optimize, the sql is :
> {code:java}
> SELECT `intDiv`(`ts` / 1000, 60) * 60 * 1000 AS `$f14`, `source_idc`,
> `dest_idc`, SUM(`jitter200`) / SUM(`total`) AS `$f3` FROM
> `metricsdb`.`s2s_idc2idc_jitter_mv_all` WHERE `ts` / 1000 >= 1627874961 AND
> `source_idc` IN ('xxx') AND `dest_idc` IN ('xxx') GROUP BY `source_idc`,
> `dest_idc`, `intDiv`(`ts` / 1000, 60) * 60 * 1000 ORDER BY `intDiv`(`ts` /
> 1000, 60) * 60 * 1000
> {code}
> we expected it to be :
> {code:java}
> `intDiv`(`ts` / 1000, 60) * 60 * 1000 AS `t`{code}
> , not
> {code:java}
> `intDiv`(`ts` / 1000, 60) * 60 * 1000 AS `$f14`{code}
>
> After examine the code , we found in *rewriteView* of
> *MaterializedViewAggregateRule*
> {code:java}
> return relBuilder
> .project(rewrittenExprs)
> .convert(topRowType, false) // this means do not compare the names.
> .build();
> {code}
> when change it to true, everything goes as expected.
>
> I think the rewrite rule should maintain then same type and name , not just
> type.
>
> If it's true, we can come up with some modification and tests.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)