Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21698 )
Change subject: IMPALA-13186: Tag query option scope for tuple cache ...................................................................... Patch Set 11: (6 comments) > Patch Set 11: > > (1 comment) > > Let's say that we have low cache hit rates. Is there a way we could tell > whether it is related to different query options? Do we want a way to tune > this at startup (i.e. a list of additional query options to exempt)? > > If we wanted an approximation of how many distinct query option hashes there > are, we could maintain an HLL sketch. I don't think it makes sense as a startup option, that could lead to selecting options that aren't actually safe. I don't see a good way to track whether low hit rate is related to query options without grabbing a bunch of queries. Other options would require tracking the query options cache key as a separate component and making it part of the cache key, with a metric we update based on misses. And it would make the cache more complicated to identify that a miss was specifically due to the query options cache key. I think reviewing profiles will be sufficient. http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@77 PS9, Line 77: QUERY_OPT_FN(mem_limit, MEM_LIMIT, TQueryOptionLevel::REGULAR) \ > I could see this one being different across different queries or changing o Done http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@83 PS9, Line 83: QUERY_OPT_FN(request_pool, REQUEST_POOL, TQueryOptionLevel::REGULAR) \ > I think we want to share results across different request pools. I think we Done http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@194 PS9, Line 194: QUERY_OPT_FN(statement_expression_limit, STATEMENT_EXPRESSION_LIMIT, \ : TQueryOptionLevel::REGULAR) \ > Let's make this exempt. It applies a limit, but it otherwise has no impact Done http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@211 PS9, Line 211: QUERY_OPT_FN(mem_limit_executors, MEM_LIMIT_EXECUTORS, TQueryOptionLevel::ADVANCED) \ > Same as mem_limit Done http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@326 PS9, Line 326: QUERY_OPT_FN(mem_limit_coordinators, MEM_LIMIT_COORDINATORS, \ : TQueryOptionLevel::ADVANCED) > I think we should exempt this Done http://gerrit.cloudera.org:8080/#/c/21698/11/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/21698/11/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1388 PS11, Line 1388: tupleCacheInfo_.hashThrift(queryOptsHash); > Nit: What do you think about incorporating it once (i.e. at leaf nodes) and Done -- To view, visit http://gerrit.cloudera.org:8080/21698 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1f4802ad9548749cd43df8848b6f46dca3739ae7 Gerrit-Change-Number: 21698 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Mon, 07 Oct 2024 18:38:33 +0000 Gerrit-HasComments: Yes
