With four +1's, the vote passes.

- A cluster-wide configuration (most likely as a "cc.conf" parameter) should 
definitely be added.
- Agreed that "compiler.querycache.clear" is not ideal; using the admin 
API is a great alternative.
- Agreed on more research into the memory footprint of the cache to set a 
better default value.

> On Dec 8, 2023, at 12:49, Ian Maxon <ima...@apache.org> wrote:
> 
> +1. This has been a long awaited feature.
> 
> On Dec 7, 2023 at 14:21:31, Glenn Justo Galvizo <ggalv...@uci.edu> wrote:
> 
>> Every time a query is issued to AsterixDB, the query must undergo
>> compilation. If the same query is run repeatedly, this query must be
>> recompiled each and every time. A query plan cache can help AsterixDB
>> achieve a lower floor on the end-to-end time by storing the job
>> specifications for previously compiled queries, ultimately skipping the AST
>> rewriting and Algebricks compilation of a previously executed query.
>> 
>> (APE copied from contributor Sushrut Borkor)
>> 
>> This APE is about adding a query plan cache to AsterixDB. More
>> specifically, this query plan cache is a hash table whose hits skip 1) the
>> AST rewriting, 2) the entire Algebricks pipeline, from plan translation
>> through optimization, and 3) the Hyracks job generation. Each key of this
>> hash table consists of:
>>   • AST String. We key on this instead of the original query string before
>> parsing because it is resilient to minor changes in the query, such as
>> adding spaces or empty lines.
>>   • SessionConfig. For example, if the user runs a query, changes part of
>> the session configuration (e.g. the preferred output format), and reruns
>> the query, this prevents the second query from being served from the cache.
>>   • Config, to capture the effects of any SET statements used.
>>   • Active Dataverse, e.g., as defined in a USE statement.
>>   • Result Set ID, which distinguishes among queries in multi-statement
>> requests.
>> 
>> Each hash table value consists of:
>>   • Hyracks Job Spec to be submitted to Hyracks.
>>   • Cached warnings. Since we skip compilation when serving queries from
>> the cache, we cannot detect compile time warnings. To get around this, we
>> cache warnings issued during rewriting and compilation, and then reissue
>> them for cache hits. As a result, line numbers in warnings may be incorrect
>> for queries answered using the cache.
>>   • Lock. Since the same job spec cannot be run by multiple threads at once,
>> we include a lock in the cache value. To use a cached job spec, a thread
>> must acquire this lock, and then release it after the job has finished
>> running. If the lock is held by another thread, we recompile the query
>> instead of blocking.
>> 
>> The proposed changes are the following:
>> 
>> Interface:
>> We introduce two new statements for controlling cache access:
>>   • “SET `compiler.querycache.bypass` "true";” forces the current query
>> to ignore the cache.
>>   • “SET `compiler.querycache.clear` "true";” clears all cache entries.
>> The current query may still insert into the cache.
>> We also add a boolean HTTP API parameter bypass_cache which does the same
>> thing as the first SET statement above. Finally, the parameter
>> query.cache.capacity can be configured in the [cc] section of the cc.conf
>> file to control the maximum cache size before replacement.
>> 
>> Changes:
>>   • Compilation logic is changed in the source code since we skip
>> rewriting and compilation for cache hits.
>>   • Hints are now included in the AST string, so that queries differing
>> only in their hints no longer map to the same cache entry.
>>   • A bug is fixed where the AST string of WINDOW expressions did not
>> include FROM LAST or IGNORE NULLS.
>> 
>> See
>> https://issues.apache.org/jira/projects/ASTERIXDB/issues/ASTERIXDB-3183
>> for the JIRA issue, as well as
>> https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+2%3A+Query+Plan+Cache
>> for more details.
>> 
>> Please vote on this APE. We will keep this open for 72 hours and pass with
>> either 3 votes or a majority of positive votes.

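For readers following the quoted proposal, here is a minimal Java sketch of
the key/value layout and the non-blocking lock behavior it describes.
Everything in it is hypothetical and is not taken from the AsterixDB code
base: the names PlanCacheSketch, Key, CachedPlan, tryAcquire, and release are
made up for illustration, the job spec and configurations are reduced to
strings, the lock is modeled as an AtomicBoolean flag, and least-recently-used
eviction is an assumption, since the proposal only mentions a maximum cache
size "before replacement".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of the proposed plan cache; not AsterixDB code. */
final class PlanCacheSketch {

    /** Cache key built from the five components named in the proposal (types simplified). */
    record Key(String astString, String sessionConfig, String config,
               String activeDataverse, int resultSetId) {}

    /** Cache value: job spec stand-in, warnings to reissue on a hit, and an in-use flag. */
    static final class CachedPlan {
        final String jobSpec;
        final List<String> warnings;
        private final AtomicBoolean inUse = new AtomicBoolean(false);

        CachedPlan(String jobSpec, List<String> warnings) {
            this.jobSpec = jobSpec;
            this.warnings = List.copyOf(warnings);
        }
    }

    private final Map<Key, CachedPlan> entries;

    PlanCacheSketch(int capacity) {
        // Assumed replacement policy: evict the least recently used entry once over capacity.
        this.entries = new LinkedHashMap<Key, CachedPlan>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Key, CachedPlan> eldest) {
                return size() > capacity;
            }
        };
    }

    synchronized void put(Key key, CachedPlan plan) {
        entries.put(key, plan);
    }

    synchronized int size() {
        return entries.size();
    }

    /** Returns the cached plan only if it is present and not in use; null means "recompile". */
    synchronized CachedPlan tryAcquire(Key key) {
        CachedPlan plan = entries.get(key);
        return (plan != null && plan.inUse.compareAndSet(false, true)) ? plan : null;
    }

    /** Marks the plan as free again once its job has finished running. */
    void release(CachedPlan plan) {
        plan.inUse.set(false);
    }

    public static void main(String[] args) {
        PlanCacheSketch cache = new PlanCacheSketch(2);
        Key key = new Key("<ast>", "format=json", "", "Default", 0);
        cache.put(key, new CachedPlan("<job spec>", List.of("<warning>")));

        CachedPlan first = cache.tryAcquire(key);   // hit: the plan is now in use
        CachedPlan second = cache.tryAcquire(key);  // busy: the caller would recompile
        System.out.println(first != null);   // true
        System.out.println(second == null);  // true
        cache.release(first);
    }
}
```

A caller that gets null from tryAcquire would compile the query as usual.
Whether the freshly compiled plan should then replace the cached one is left
open in this sketch.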