Jackie-Jiang opened a new pull request, #13208: URL: https://github.com/apache/pinot/pull/13208
When a column is partitioned across servers (i.e. the same value always shows up on the same server), the following queries can be optimized by asking the server to directly return the final aggregate result instead of the intermediate aggregate result:

1. `SELECT DISTINCT_COUNT(partitionedCol) FROM myTable`
2. `SELECT DISTINCT_COUNT(partitionedCol) FROM myTable GROUP BY col`
3. `SELECT AGG(col) FROM myTable GROUP BY partitionedCol`

For all 3 queries, we can ask the server to return the final aggregate result, but 2 and 3 differ in how many groups the server must keep:

- For 2, the server can return the final aggregate result, but should still keep enough groups, because the aggregate result is not the global final result, only the final result for a partition.
- For 3, the server only needs to keep `LIMIT` groups, because the aggregate result is the global final result for the group.

In this PR, the user can `SET serverReturnFinalResult = true;` to accelerate 1 and 3, and `SET serverReturnFinalResultKeyUnpartitioned = true;` to accelerate 2.
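To illustrate why returning final results is safe when the column is partitioned, here is a minimal Python sketch. It is not Pinot code: the data, the modulo-based server assignment, and all function names are hypothetical. It models the three query shapes with plain dictionaries and sets, and assumes query 3 is a `COUNT(*)` ordered by the group key so that the result is deterministic.

```python
from collections import defaultdict

# Hypothetical data: rows of (partitioned_col, col). Each row is duplicated so
# that distinct counting actually has something to deduplicate.
NUM_SERVERS = 3
rows = [(user, f"c{user % 4}") for user in range(20)] * 2

# Assumed partitioning scheme: the same partitioned_col value always lands on
# the same server.
servers = defaultdict(list)
for partitioned_col, col in rows:
    servers[partitioned_col % NUM_SERVERS].append((partitioned_col, col))


def distinct_count_intermediate(servers):
    """Query 1 without the optimization: servers ship value sets, broker merges."""
    merged = set()
    for server_rows in servers.values():
        merged |= {p for p, _ in server_rows}
    return len(merged)


def distinct_count_final(servers):
    """Query 1 with final results: servers ship counts, broker sums them.
    Correct only because no value appears on more than one server."""
    return sum(len({p for p, _ in server_rows}) for server_rows in servers.values())


def grouped_distinct_count_final(servers):
    """Query 2: each server returns a final count per group, but a group can
    exist on several servers, so the broker still sums per group and servers
    cannot trim down to LIMIT groups."""
    totals = defaultdict(int)
    for server_rows in servers.values():
        per_group = defaultdict(set)
        for p, col in server_rows:
            per_group[col].add(p)
        for col, values in per_group.items():
            totals[col] += len(values)
    return dict(totals)


def count_by_partitioned_col(servers, limit):
    """Query 3 (modeled as COUNT(*) ordered by the group key): every group
    lives on exactly one server, so each server can trim to `limit` groups."""
    candidates = []
    for server_rows in servers.values():
        counts = defaultdict(int)
        for p, _ in server_rows:
            counts[p] += 1
        candidates.extend(sorted(counts.items())[:limit])
    return sorted(candidates)[:limit]
```

In this sketch, summing per-server counts matches the set-merge result only because the partitioning guarantees that value sets on different servers are disjoint. On an unpartitioned column the sum would overcount.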
