Hello,
We've found an issue with running distinct aggregation queries with the Calcite
query engine on Ignite 2.15. Queries like the following fail when the number of
columns exceeds roughly 10-12:
select id, count(distinct col1) VRT_col1, count(distinct col2) VRT_col2, ... ,
count(distinct col20) VRT_col20
from WideRecord group by id order by id;
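To make this easier to reproduce for different column counts, here is a minimal sketch of a script that generates the query text. The table name WideRecord and the col1..colN / VRT_colN naming come from the query above; the function name build_distinct_query is just a hypothetical helper for illustration, and it assumes the table really has columns named that way.

```python
def build_distinct_query(num_columns: int, table: str = "WideRecord") -> str:
    """Build a query with one count(distinct ...) per column col1..colN.

    Assumes the table has an `id` column and columns named col1..colN.
    """
    if num_columns < 1:
        raise ValueError("num_columns must be at least 1")
    # One distinct aggregate per column, aliased VRT_col<i> as in the report.
    aggregates = ", ".join(
        f"count(distinct col{i}) VRT_col{i}" for i in range(1, num_columns + 1)
    )
    return f"select id, {aggregates} from {table} group by id order by id"


if __name__ == "__main__":
    # Print the 20-column variant of the failing query.
    print(build_distinct_query(20))
```

Varying num_columns should help pin down the exact column count at which planning starts to fail.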
The error is either "There are not enough rules to produce a node with desired
properties: convention=IGNITE, sort=[175 ASC-nulls-first], distr=single,
rewindability=one-way" or, for somewhat larger datasets, "Volcano planning
timed out...". I believe both mean essentially the same thing: the planner
cannot build a query execution plan.
Whether the query fails does not depend on data volume, only on the number of
columns. The same queries worked fine with the H2 engine, even with 100 columns.
My guess is that this may be a side effect of the optimization described here:
https://www.querifylabs.com/blog/distinct-aggregation-optimization-in-apache-calcite-and-trino
Is it possible to control this optimization, or perhaps simply turn it off?
I hope this is useful for improving such a great product!
Best regards,
Dmytro Androshchuk
CITI, GFT-IT FRM AQUA
[email protected]
Mississauga, Ontario
Citi Canada