Ashutosh Chauhan created HIVE-20260:
---------------------------------------
Summary: NDV of a column shouldn't be scaled when row count is
changed by filter on another column
Key: HIVE-20260
URL: https://issues.apache.org/jira/browse/HIVE-20260
Project: Hive
Issue Type: Improvement
Components: Statistics
Reporter: Ashutosh Chauhan
HIVE-17465 introduced progressive scaling of row counts in the presence of
multiple filters. HIVE-19500 improved on that by also scaling column stats
(NDV) in such scenarios. However, the scaling should take into account which
column each filter expression references, rather than scaling the NDV of every
column for every filter. For example, consider the filter a = 1 and b = 2: the
NDV of column b should not be scaled down by the row count change caused by
a = 1.
Another way to say this is that the NDV of a particular column should be
updated at the end of the row count computation for that operator.
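
To illustrate the difference, here is a rough Python sketch (not Hive code).
All function names and the numbers are hypothetical. It assumes an equality
predicate has selectivity 1/NDV, that conjuncts are independent, and, for the
deferred variant, that a column pinned by an equality predicate ends with
NDV 1 while other columns are capped at the final row count. Those update
rules are assumptions made for the example, not something this issue
specifies.

```python
def estimate_scaling_all(row_count, ndv, filter_columns):
    """Rescale every column's NDV after each conjunct (behaviour objected to)."""
    ndv = {col: float(n) for col, n in ndv.items()}
    rows = float(row_count)
    for col in filter_columns:
        new_rows = rows / ndv[col]  # assumed equality selectivity: 1 / NDV
        # Every column's NDV shrinks with the row count, including columns
        # that later conjuncts still need in order to compute selectivity.
        ndv = {c: max(1.0, n * new_rows / rows) for c, n in ndv.items()}
        rows = new_rows
    return rows, ndv


def estimate_deferred(row_count, ndv, filter_columns):
    """Compute the row count from the original NDVs, then update NDVs once."""
    rows = float(row_count)
    for col in filter_columns:
        rows = rows / ndv[col]  # always uses the unscaled NDV of that column
    rows = max(1.0, rows)
    # Assumed end-of-operator update: filtered columns collapse to one value,
    # all other columns cannot have more distinct values than rows.
    updated = {
        c: 1.0 if c in filter_columns else min(float(n), rows)
        for c, n in ndv.items()
    }
    return rows, updated


if __name__ == "__main__":
    stats = {"a": 10, "b": 50}  # hypothetical NDVs for a 1000-row table
    print(estimate_scaling_all(1000, stats, ["a", "b"]))
    print(estimate_deferred(1000, stats, ["a", "b"]))
```

With these made-up stats, the first variant scales the NDV of b from 50 down
to 5 after a = 1, so b = 2 is estimated to keep 1 row in 5 and the result is
20 rows. The deferred variant uses the original NDV of 50 for b and estimates
2 rows.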