> My conclusion is that a query can update some internal states of HiveServer2, 
> affecting DAG generation for subsequent queries. 

Other than the automatic reoptimization feature, there are two other potential 
suspects.

The first is the in-memory stats cache's variance parameter, which might be 
triggering some residual effects.

hive.metastore.aggregate.stats.cache.max.variance

I set it to 0.0 when I suspect that feature is messing with the runtime plans, 
or just disable the cache entirely with:

set hive.metastore.aggregate.stats.cache.enabled=false;
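
A rough sketch of how I would try both options from a session. This assumes 
the session-level override actually reaches the metastore in your deployment; 
these are metastore-side settings, so with a remote metastore they may need to 
go into the metastore's own hive-site.xml instead.

```sql
-- Print the current values first, so you know what you are changing.
set hive.metastore.aggregate.stats.cache.enabled;
set hive.metastore.aggregate.stats.cache.max.variance;

-- Option 1: keep the cache, but allow no variance.
set hive.metastore.aggregate.stats.cache.max.variance=0.0;

-- Option 2: turn the cache off entirely.
set hive.metastore.aggregate.stats.cache.enabled=false;

-- Assumption: re-run the queries in the same order after each change and
-- compare the generated DAGs to see whether the behaviour goes away.
```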

Other than that, query24 is an interesting query. It is probably one of the 
corner cases where the predicate push-down is actually hurting the shared work 
optimizer.

Also cross-check whether you have accidentally loaded store_sales with 
ss_item_sk as an int while item's i_item_sk is a bigint (type mismatches will 
trigger a slow join algorithm, but without any consistency issues).
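
Something along these lines should show the column types. It assumes both 
tables live in the current database; the EXPLAIN query is only an illustrative 
join I made up, not query24 itself.

```sql
-- Show the declared type of each join key.
DESCRIBE store_sales ss_item_sk;
DESCRIBE item i_item_sk;

-- Illustrative join on the same keys: if the types differ, expect an
-- implicit cast on one side of the join key in the plan output.
EXPLAIN
SELECT count(*)
FROM store_sales
JOIN item ON ss_item_sk = i_item_sk;
```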

Cheers,
Gopal

