Thanks, Gopal. Setting that configuration significantly reduced the query runtime. I chose a value of 0.3f. Is there an empirical way to decide what value to set for this? It is not completely clear from the code how the value is used.
Mainak

> On Feb 6, 2019, at 7:44 PM, Gopal Vijayaraghavan <gop...@apache.org> wrote:
>
> Hi,
>
> That looks like the TopN hash optimization didn't kick in, that must be a
> settings issue in the install.
>
> | Reduce Output Operator |
> | key expressions: _col0 (type: string) |
> | sort order: + |
> | Statistics: Num rows: 1 Data size: 762813939712 Basic stats: PARTIAL Column stats: NONE |
>
> https://github.com/apache/hive/commit/265ae7b4f81ec7cf19c6f0b59a13a3e7dfb942e4#diff-ea752552821a2ae5f3a33c6db210ef0a
>
> I'd check if the configs for that are setup for you.
>
> Cheers,
> Gopal