caofangkun created HIVE-4561:
--------------------------------

             Summary: Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000 if all the column values are larger than 0.0 (or if all column values are smaller than 0.0)
                 Key: HIVE-4561
                 URL: https://issues.apache.org/jira/browse/HIVE-4561
             Project: Hive
          Issue Type: Bug
          Components: Statistics
    Affects Versions: 0.12.0
            Reporter: caofangkun
            Priority: Minor

If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0. Likewise, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.

To reproduce:

hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;

Then, in the metastore database:

mysql> select * from TAB_COL_STATS \G;
*************************** 1. row ***************************
                 CS_ID: 16
               DB_NAME: default
            TABLE_NAME: src_test
           COLUMN_NAME: price
           COLUMN_TYPE: double
                TBL_ID: 2586
        LONG_LOW_VALUE: 0
       LONG_HIGH_VALUE: 0
      DOUBLE_LOW_VALUE: 0.0000    # Wrong result! Expected is 1.0000
     DOUBLE_HIGH_VALUE: 3.0000
 BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
             NUM_NULLS: 0
         NUM_DISTINCTS: 1
           AVG_COL_LEN: 0.0000
           MAX_COL_LEN: 0
             NUM_TRUES: 0
            NUM_FALSES: 0
         LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
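
The report does not identify the cause. One pattern that would produce exactly this symptom is a running min/max whose bounds start at 0.0 instead of at values that any real input replaces. The sketch below is only an illustration of that assumption and is not taken from Hive's source; the class name ColumnRangeSketch and both method names are hypothetical.

```java
final class ColumnRangeSketch {

    // Assumed faulty pattern: both bounds start at 0.0, so 0.0 takes part in
    // every comparison as if it were one of the column values.
    static double[] zeroSeededRange(double[] values) {
        double low = 0.0;
        double high = 0.0;
        for (double v : values) {
            low = Math.min(low, v);
            high = Math.max(high, v);
        }
        return new double[] {low, high};
    }

    // Possible remedy: seed with the infinities so the first real value
    // replaces both bounds.
    static double[] infinitySeededRange(double[] values) {
        double low = Double.POSITIVE_INFINITY;
        double high = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            low = Math.min(low, v);
            high = Math.max(high, v);
        }
        return new double[] {low, high};
    }
}
```

With the three values from the report (1.0, 2.0, 3.0), zeroSeededRange yields a low of 0.0 and a high of 3.0, matching the DOUBLE_LOW_VALUE and DOUBLE_HIGH_VALUE shown above, while infinitySeededRange yields 1.0 and 3.0. For an all-negative column the zero-seeded version reports a high of 0.0 instead. Note that the infinity-seeded version returns the infinities for an empty input, which a real implementation would need to handle separately.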