Prasanth J created HIVE-8151:
--------------------------------

             Summary: Dynamic partition sort optimization inserts record 
wrongly to partition when used with GroupBy
                 Key: HIVE-8151
                 URL: https://issues.apache.org/jira/browse/HIVE-8151
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.13.1, 0.14.0
            Reporter: Prasanth J
            Assignee: Prasanth J
            Priority: Critical


HIVE-6455 added the dynamic partition sort optimization. It added a startGroup() 
method to the FileSink operator that watches for changes in the reduce key in 
order to create partition directories. This method, however, is unreliable 
because the key seen by startGroup() differs from the key seen by processOp(): 
startGroup() is called with the newly changed key, whereas processOp() is called 
with the previously aggregated key. As a result, processOp() writes the last row 
of the previous group as the first row of the next group. This happens only when 
the optimization is used with a group by operator.

The fix is to not rely on startGroup() and do the partition directory creation 
in processOp() itself.
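
To illustrate the mismatch, here is a minimal, self-contained sketch. It is not 
Hive code: the class DynPartSketch, the Sink class, the relyOnStartGroup flag, 
and the method signatures are all hypothetical stand-ins. It assumes a group by 
that counts rows per key and emits the aggregated row for the previous key only 
after the key change has been signalled downstream.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the problem; none of these names are Hive's real API.
final class DynPartSketch {

    /** Toy file sink: maps a partition directory to the rows written into it. */
    static final class Sink {
        final Map<String, List<String>> dirs = new LinkedHashMap<>();
        final boolean relyOnStartGroup;
        String currentDir;

        Sink(boolean relyOnStartGroup) {
            this.relyOnStartGroup = relyOnStartGroup;
        }

        /** Called when the reduce key changes; receives the NEW key. */
        void startGroup(String newKey) {
            if (relyOnStartGroup) {
                currentDir = "p=" + newKey; // old behaviour: switch directory here
            }
        }

        /** Called with an aggregated row; the row carries its own key. */
        void processOp(String rowKey, long count) {
            if (!relyOnStartGroup) {
                currentDir = "p=" + rowKey; // proposed fix: derive directory from the row
            }
            dirs.computeIfAbsent(currentDir, d -> new ArrayList<>())
                .add(rowKey + ":" + count);
        }
    }

    /** Toy streaming group by (count per key) over input already sorted by key. */
    static Map<String, List<String>> run(boolean relyOnStartGroup, String... sortedKeys) {
        Sink sink = new Sink(relyOnStartGroup);
        String prevKey = null;
        long count = 0;
        for (String key : sortedKeys) {
            if (!key.equals(prevKey)) {
                sink.startGroup(key);               // signalled with the new key
                if (prevKey != null) {
                    sink.processOp(prevKey, count); // flushes the previous group
                }
                count = 0;
                prevKey = key;
            }
            count++;
        }
        if (prevKey != null) {
            sink.processOp(prevKey, count);         // flush the final group
        }
        return sink.dirs;
    }

    public static void main(String[] args) {
        System.out.println("startGroup-based: " + run(true, "a", "a", "b"));
        System.out.println("processOp-based:  " + run(false, "a", "a", "b"));
    }
}
```

In this sketch, with relyOnStartGroup set to true the aggregated row for key "a" 
is written under the directory for "b", because the directory was switched before 
the previous group was flushed. With the flag set to false each row is written 
under the directory derived from its own key.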



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
