Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21698 )

Change subject: IMPALA-13186: Tag query option scope for tuple cache
......................................................................


Patch Set 11:

(6 comments)

> Patch Set 11:
>
> (1 comment)
>
> Let's say that we have low cache hit rates. Is there a way we could tell 
> whether it is related to different query options? Do we want a way to tune 
> this at startup (i.e. a list of additional query options to exempt)?
>
> If we wanted an approximation of how many distinct query option hashes there 
> are, we could maintain an HLL sketch.

I don't think it makes sense as a startup option, that could lead to selecting 
options that aren't actually safe.

I don't see a good way to track whether low hit rate is related to query 
options without grabbing a bunch of queries. Other options would require 
tracking the query options cache key as a separate component and making it part 
of the cache key, with a metric we update based on misses. And it would make 
the cache more complicated to identify that a miss was specifically due to the 
query options cache key. I think reviewing profiles will be sufficient.

http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@77
PS9, Line 77:   QUERY_OPT_FN(mem_limit, MEM_LIMIT, TQueryOptionLevel::REGULAR)  
                       \
> I could see this one being different across different queries or changing o
Done


http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@83
PS9, Line 83:   QUERY_OPT_FN(request_pool, REQUEST_POOL, 
TQueryOptionLevel::REGULAR)                   \
> I think we want to share results across different request pools. I think we
Done


http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@194
PS9, Line 194:   QUERY_OPT_FN(statement_expression_limit, 
STATEMENT_EXPRESSION_LIMIT,                   \
             :       TQueryOptionLevel::REGULAR)                                
                        \
> Let's make this exempt. It applies a limit, but it otherwise has no impact
Done


http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@211
PS9, Line 211:   QUERY_OPT_FN(mem_limit_executors, MEM_LIMIT_EXECUTORS, 
TQueryOptionLevel::ADVANCED)    \
> Same as mem_limit
Done


http://gerrit.cloudera.org:8080/#/c/21698/9/be/src/service/query-options.h@326
PS9, Line 326:   QUERY_OPT_FN(mem_limit_coordinators, MEM_LIMIT_COORDINATORS,   
                        \
             :       TQueryOptionLevel::ADVANCED)
> I think we should exempt this
Done


http://gerrit.cloudera.org:8080/#/c/21698/11/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21698/11/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1388
PS11, Line 1388:     tupleCacheInfo_.hashThrift(queryOptsHash);
> Nit: What do you think about incorporating it once (i.e. at leaf nodes) and
Done



--
To view, visit http://gerrit.cloudera.org:8080/21698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1f4802ad9548749cd43df8848b6f46dca3739ae7
Gerrit-Change-Number: 21698
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Mon, 07 Oct 2024 18:38:33 +0000
Gerrit-HasComments: Yes

Reply via email to