Vineet Garg created HIVE-21330:
----------------------------------

             Summary: Bucketing id varies b/w data loaded through streaming 
apis and regular query
                 Key: HIVE-21330
                 URL: https://issues.apache.org/jira/browse/HIVE-21330
             Project: Hive
          Issue Type: Bug
            Reporter: Vineet Garg


The test at 
[https://github.com/apache/hive/blob/master/hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java#L439]
 is meant to cover this case. It currently passes, but for the wrong reason. 
The test checks for an empty result set, and the result sets are empty because 
the prior INSERT fails to load data, not because the bucketing scheme is different.

The INSERT failure is fixed in https://github.com/apache/hive/pull/552. 
With that patch applied, the test fails because the underlying bucketing ids 
generated are different.

These tests run on MR instead of Tez, which could explain the different 
bucketing ids.
I don't know what the repercussions of differing bucketing ids are, or why the 
ids are expected to be the same, but since there is a test for this logic, the 
case is worth investigating.
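
As a rough illustration of why differing bucketing ids could matter, here is a 
minimal sketch. It is not Hive's code: the class name, both hash functions, and 
the sign-bit-mask-plus-modulo step are assumptions chosen only to show that two 
writers hashing the same key differently can place it in different buckets.

```java
import java.nio.charset.StandardCharsets;

public class BucketIdSketch {

    // Assumed mapping from hash code to bucket id: clear the sign bit,
    // then take the remainder by the number of buckets.
    static int bucketFor(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    // Hypothetical hash used by writer A: Java's String.hashCode().
    static int hashA(String key) {
        return key.hashCode();
    }

    // Hypothetical hash used by writer B: 32-bit FNV-1a over the UTF-8 bytes.
    static int hashB(String key) {
        int h = 0x811C9DC5;                 // FNV-1a 32-bit offset basis
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 16777619;                  // FNV 32-bit prime
        }
        return h;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        for (String key : new String[] {"a", "b", "c", "d"}) {
            int idA = bucketFor(hashA(key), numBuckets);
            int idB = bucketFor(hashB(key), numBuckets);
            // A reader that prunes by bucket id using scheme A would look in
            // the wrong file for any row written under scheme B when idA != idB.
            System.out.println(key + ": writer A -> bucket " + idA
                    + ", writer B -> bucket " + idB);
        }
    }
}
```

If readers assume one scheme while some rows were written under the other, 
anything that relies on bucket placement (bucket pruning, bucketed joins, 
sampling) could skip or mismatch those rows. Whether that applies to the 
streaming and regular query paths here is exactly what needs investigating.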



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
