-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57009/
-----------------------------------------------------------

(Updated April 15, 2017, 11:52 a.m.)


Review request for hive and Aihua Xu.


Changes
-------

New patch that allows COLLECT_SET to take two arguments, so that the original 
behaviour is maintained.


Bugs: HIVE-16029
    https://issues.apache.org/jira/browse/HIVE-16029


Repository: hive-git


Description
-------

COLLECT_SET drops NULL values from its result. See the test case below:

{code}
0: jdbc:hive2://localhost:10000/default> select * from collect_set_test;
+---------------------+
| collect_set_test.a  |
+---------------------+
| 1                   |
| 2                   |
| NULL                |
| 4                   |
| NULL                |
+---------------------+

0: jdbc:hive2://localhost:10000/default> select collect_set(a) from collect_set_test;
+---------------+
|      _c0      |
+---------------+
| [1,2,4]       |
+---------------+

{code}

The correct result should be:

{code}
0: jdbc:hive2://localhost:10000/default> select collect_set(a) from collect_set_test;
+---------------+
|      _c0      |
+---------------+
| [1,2,null,4]  |
+---------------+
{code}
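
The difference between the two results above can be sketched in plain Java. This is only an illustration and not the Hive implementation: the class name CollectSetSketch, the method collectSet and the keepNulls flag are hypothetical names chosen for this example, and nothing here is taken from the patch. It assumes the set keeps first-seen order, which matches the output shown in this request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the code in GenericUDAFMkCollectionEvaluator.
public class CollectSetSketch {

    // Collects the distinct values of a column in first-seen order.
    // keepNulls is an assumed flag for this sketch only:
    //   false -> NULL inputs are skipped (the behaviour reported above)
    //   true  -> NULL is kept as one element of the set
    public static List<Integer> collectSet(List<Integer> column, boolean keepNulls) {
        // LinkedHashSet preserves insertion order and permits a single null element.
        Set<Integer> seen = new LinkedHashSet<>();
        for (Integer value : column) {
            if (value == null && !keepNulls) {
                continue; // drop NULL rows
            }
            seen.add(value); // duplicates, including a repeated null, are ignored
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Same rows as collect_set_test.a above.
        List<Integer> a = Arrays.asList(1, 2, null, 4, null);
        System.out.println(collectSet(a, false)); // [1, 2, 4]
        System.out.println(collectSet(a, true));  // [1, 2, null, 4]
    }
}
```

The second NULL row does not add a second element, because a set holds each distinct value, NULL included, at most once.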


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectList.java 156d19b 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java 0c2cf90 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java 2b5e6dd 


Diff: https://reviews.apache.org/r/57009/diff/2/

Changes: https://reviews.apache.org/r/57009/diff/1-2/


Testing
-------

Manually tested and confirmed that the result is correct:

{code}
0: jdbc:hive2://localhost:10000/default> select collect_set(a) from collect_set_test;
+---------------+
|      _c0      |
+---------------+
| [1,2,null,4]  |
+---------------+
{code}


Thanks,

Eric Lin
