David Main created HIVE-7578:
--------------------------------

             Summary: Sorting and max() against numeric partition column does 
not work
                 Key: HIVE-7578
                 URL: https://issues.apache.org/jira/browse/HIVE-7578
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.12.0, 0.10.0
         Environment: Tested in a linux environment
            Reporter: David Main


Hive does not return the correct results when you run "max()" on a table that 
has been partitioned on a numeric column when more than 10 partitions are 
present and they are numbered 1 - 10.  The same thing happens if the numbers 
are in the range of 1 - 100.

The steps to recreate this issue are included below.

First, put the file below into HDFS at '/tmp/hive_bug/people_flat/people.dat'.  
(Leave lines with leading dashes out)

----- File contents -----
1,David
2,Jeff
3,Cindy
4,Prakash
5,Kate
6,Chung
7,Ginny
8,Huy
9,Brett
10,Jennifer
11,Dan
12,Shivani
13,Nate
----- EOF -----

Now run these statements in either Hive or beeline:

drop table if exists people_flat;

create external table people_flat (
    pnum int, 
    pname string
)
row format delimited fields terminated by ','
location '/tmp/hive_bug/people_flat';

select * from people_flat order by plum;
-- order of records should be correct

select max(pnum) from people_flat;
-- 13 (this is correct)

create table people_partitioned (
    pname string
)
partitioned by (pnum int)
row format delimited fields terminated by ',';

SET hive.exec.max.dynamic.partitions=10000;
SET hive.exec.max.dynamic.partitions.pernode=10000;
SET hive.exec.dynamic.partition.mode=nonstrict;

insert into table people_partitioned
partition (pnum)
select pname, 
    pnum
from people_flat;

select max(pnum) from people_partitioned;
-- result 9 is wrong!

select * from people_partitioned order by pnum;
-- records not sorted correctly in numeric order

-- Although this is not a solution, the next query yield correct results

select max(pnum_int)
from (
    select cast(pnum as int) pnum_int
    from people_partitioned
) a;





--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to