David Main created HIVE-7578: -------------------------------- Summary: Sorting and max() against numeric partition column does not work Key: HIVE-7578 URL: https://issues.apache.org/jira/browse/HIVE-7578 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.10.0 Environment: Tested in a linux environment Reporter: David Main
Hive does not return the correct results when you run "max()" on a table that has been partitioned on a numeric column when more than 10 partitions are present and they are numbered 1 - 10. The same thing happens if the numbers are in the range of 1 - 100. The steps to recreate this issue are included below. First, put the file below into HDFS at '/tmp/hive_bug/people_flat/people.dat'. (Leave lines with leading dashes out) ----- File contents ----- 1,David 2,Jeff 3,Cindy 4,Prakash 5,Kate 6,Chung 7,Ginny 8,Huy 9,Brett 10,Jennifer 11,Dan 12,Shivani 13,Nate ----- EOF ----- Now run these statements in either Hive or beeline: drop table if exists people_flat; create external table people_flat ( pnum int, pname string ) row format delimited fields terminated by ',' location '/tmp/hive_bug/people_flat'; select * from people_flat order by plum; -- order of records should be correct select max(pnum) from people_flat; -- 13 (this is correct) create table people_partitioned ( pname string ) partitioned by (pnum int) row format delimited fields terminated by ','; SET hive.exec.max.dynamic.partitions=10000; SET hive.exec.max.dynamic.partitions.pernode=10000; SET hive.exec.dynamic.partition.mode=nonstrict; insert into table people_partitioned partition (pnum) select pname, pnum from people_flat; select max(pnum) from people_partitioned; -- result 9 is wrong! select * from people_partitioned order by pnum; -- records not sorted correctly in numeric order -- Although this is not a solution, the next query yield correct results select max(pnum_int) from ( select cast(pnum as int) pnum_int from people_partitioned ) a; -- This message was sent by Atlassian JIRA (v6.2#6252)