[ 
https://issues.apache.org/jira/browse/HIVE-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592502#comment-14592502
 ] 

Soundararajan Velu commented on HIVE-11045:
-------------------------------------------

Vikram,

I see this issue only with Hive on Tez. My data is in JSON format and I use the
JsonSerde from https://github.com/rcongiu/Hive-JSON-Serde.
The same query runs perfectly fine on Hive without Tez; the failure occurs only with Tez.
The data set is huge, and I have no clue which records trigger this exception.

The query is below:
SELECT t1.return_id AS return_id,
       t1.approve_date AS approve_date,
       t1.approve_date_key AS approve_date_key,
       t1.cancel_date AS cancel_date,
       t1.cancel_date_key AS cancel_date_key,
       t1.complete_date AS complete_date,
       t1.complete_date_key AS complete_date_key,
       t1.init_cancellation_date AS init_cancellation_date,
       t1.init_cancellation_date_key AS init_cancellation_date_key,
       t1.reject_date AS reject_date,
       t1.reject_date_key AS reject_date_key,
       t1.unhold_date AS unhold_date,
       t1.unhold_date_key AS unhold_date_key,
       t1.request_service_date AS request_service_date,
       t1.request_service_date_key AS request_service_date_key,
       t1.service_approve_return_date AS service_approve_return_date,
       t1.service_approve_return_date_key AS service_approve_return_date_key,
       CASE
           WHEN t2.action_override_status_time IS NULL THEN 0
           ELSE 1
       END AS flag_action_override,
       CASE
           WHEN t2.action_override_status_time IS NULL THEN NULL
           ELSE t2.action_override_status_time
       END AS action_override_status_time,
       CASE
           WHEN t2.action_override_user_login IS NULL THEN 'NA'
           ELSE t2.action_override_user_login
       END AS action_override_user_login,
       CASE
           WHEN t2.action_override_change_reason IS NULL THEN 'NA'
           ELSE t2.action_override_change_reason
       END AS action_override_change_reason,
       CASE
           WHEN t2.action_override_change_sub_reason IS NULL THEN 'NA'
           ELSE t2.action_override_change_sub_reason
       END AS action_override_change_sub_reason,
       CASE
           WHEN t2.action_override_count IS NULL THEN cast(0 AS bigint)
           ELSE t2.action_override_count
       END AS action_override_count,
       CASE
           WHEN t2.action_change_data IS NULL THEN 'NA'
           ELSE t2.action_change_data
       END AS action_change_data,
       CASE
           WHEN t3.policy_override_status_time IS NULL THEN 0
           ELSE 1
       END AS flag_policy_override,
       CASE
           WHEN t3.policy_override_status_time IS NULL THEN NULL
           ELSE t3.policy_override_status_time
       END AS policy_override_status_time,
       CASE
           WHEN t3.policy_override_user_login IS NULL THEN 'NA'
           ELSE t3.policy_override_user_login
       END AS policy_override_user_login,
       CASE
           WHEN t3.policy_override_change_reason IS NULL THEN 'NA'
           ELSE t3.policy_override_change_reason
       END AS policy_override_change_reason,
       CASE
           WHEN t3.policy_override_change_sub_reason IS NULL THEN 'NA'
           ELSE t3.policy_override_change_sub_reason
       END AS policy_override_change_sub_reason,
       CASE
           WHEN t3.policy_override_count IS NULL THEN cast(0 AS bigint)
           ELSE t3.policy_override_count
       END AS policy_override_count,
       CASE
           WHEN t3.policy_change_data IS NULL THEN 'NA'
           ELSE t3.policy_change_data
       END AS policy_change_data,
       cast(0 AS bigint) AS temp_flag,
       CASE
           WHEN t3.policy_override_status_date_key IS NULL THEN 0
           ELSE t3.policy_override_status_date_key
       END AS policy_override_status_date_key,
       CASE
           WHEN t2.action_override_status_date_key IS NULL THEN 0
           ELSE t2.action_override_status_date_key
       END AS action_override_status_date_key,
       t1.user_approved_by AS user_approved_by,
       t1.user_rejected_by AS user_rejected_by,
       t1.user_cancelled_by AS user_cancelled_by,
       t1.reject_reason AS reject_reason,
       t1.reject_sub_reason AS reject_sub_reason,
       t1.reject_change_data AS reject_change_data
FROM
  (SELECT rh1.`data`.return_id,
          MIN(CASE WHEN rh1.`data`.event = 'approve'
                   THEN rh1.`data`.status_time ELSE NULL END) AS approve_date,
          MIN(CASE WHEN rh1.`data`.event = 'cancel'
                   THEN rh1.`data`.status_time ELSE NULL END) AS cancel_date,
          MIN(CASE WHEN rh1.`data`.event = 'complete'
                   THEN rh1.`data`.status_time ELSE NULL END) AS complete_date,
          MIN(CASE WHEN rh1.`data`.event = 'init_cancellation'
                   THEN rh1.`data`.status_time ELSE NULL END) AS init_cancellation_date,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN rh1.`data`.status_time ELSE NULL END) AS reject_date,
          MIN(CASE WHEN rh1.`data`.event = 'unhold'
                   THEN rh1.`data`.status_time ELSE NULL END) AS unhold_date,
          MIN(CASE WHEN rh1.`data`.event = 'request_service'
                   THEN rh1.`data`.status_time ELSE NULL END) AS request_service_date,
          MIN(CASE WHEN rh1.`data`.event = 'service_approve_return'
                   THEN rh1.`data`.status_time ELSE NULL END) AS service_approve_return_date,
          MIN(CASE WHEN rh1.`data`.event = 'approve'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS approve_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'cancel'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS cancel_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'complete'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS complete_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'init_cancellation'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS init_cancellation_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS reject_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'unhold'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS unhold_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'request_service'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS request_service_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'service_approve_return'
                   THEN lookup_date(rh1.`data`.status_time) ELSE NULL END) AS service_approve_return_date_key,
          MIN(CASE WHEN rh1.`data`.event = 'approve'
                   THEN rh1.`data`.user_login ELSE NULL END) AS user_approved_by,
          MIN(CASE WHEN rh1.`data`.event = 'cancel'
                   THEN rh1.`data`.user_login ELSE NULL END) AS user_cancelled_by,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN rh1.`data`.user_login ELSE NULL END) AS user_rejected_by,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN rh1.`data`.change_data ELSE NULL END) AS reject_change_data,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN rh1.`data`.change_reason ELSE NULL END) AS reject_reason,
          MIN(CASE WHEN rh1.`data`.event = 'reject'
                   THEN rh1.`data`.change_sub_reason ELSE NULL END) AS reject_sub_reason
   FROM dart_fkint_scp_rrr_return_history_1_0_view rh1
   GROUP BY rh1.`data`.return_id) t1
LEFT OUTER JOIN
  (SELECT rh2.`data`.return_id,
          max(rh2.`data`.status_time) AS action_override_status_time,
          max(lookup_date(rh2.`data`.status_time)) AS action_override_status_date_key,
          max(rh2.`data`.user_login) AS action_override_user_login,
          max(rh2.`data`.change_reason) AS action_override_change_reason,
          max(rh2.`data`.change_sub_reason) AS action_override_change_sub_reason,
          max(rh2.`data`.change_data) AS action_change_data,
          count(DISTINCT rh2.`data`.status_time) AS action_override_count
   FROM dart_fkint_scp_rrr_return_history_1_0_view rh2
   WHERE rh2.`data`.change_reason = 'action_override'
   GROUP BY rh2.`data`.return_id) t2 ON t1.return_id = t2.return_id
LEFT OUTER JOIN
  (SELECT rh3.`data`.return_id,
          max(rh3.`data`.status_time) AS policy_override_status_time,
          max(lookup_date(rh3.`data`.status_time)) AS policy_override_status_date_key,
          max(rh3.`data`.user_login) AS policy_override_user_login,
          max(rh3.`data`.change_reason) AS policy_override_change_reason,
          max(rh3.`data`.change_sub_reason) AS policy_override_change_sub_reason,
          max(rh3.`data`.change_data) AS policy_change_data,
          count(DISTINCT rh3.`data`.status_time) AS policy_override_count
   FROM dart_fkint_scp_rrr_return_history_1_0_view rh3
   WHERE rh3.`data`.change_reason = 'policy_override'
   GROUP BY rh3.`data`.return_id) t3 ON t1.return_id = t3.return_id;
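
Since the data set is too large to inspect by hand, one way to narrow this down might be to re-run a reduced form of one subquery against only the keys that appear in the quoted stack trace, once per execution engine. This is only a sketch under assumptions: it assumes the `_col0` key values in the trace (4457890 and 6417306) are `return_id` values, and the reduced query is hypothetical and may not reproduce the failure, because the trace points at the merge join feeding the group-by rather than at a single subquery.

```sql
-- Assumption: key _col0 in the trace corresponds to `data`.return_id.
-- Run once on Tez, then once on MapReduce, and compare the outcome.
SET hive.execution.engine=tez;

SELECT rh.`data`.return_id,
       max(rh.`data`.status_time) AS override_status_time,
       count(DISTINCT rh.`data`.status_time) AS override_count
FROM dart_fkint_scp_rrr_return_history_1_0_view rh
WHERE rh.`data`.return_id IN (4457890, 6417306)
GROUP BY rh.`data`.return_id;

-- Switch engines and repeat the SELECT above unchanged.
SET hive.execution.engine=mr;
```

If the reduced query succeeds on both engines, the joins from the full query can be added back one at a time to see which stage triggers the exception.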

> ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0
> ------------------------------------------------------------
>
>                 Key: HIVE-11045
>                 URL: https://issues.apache.org/jira/browse/HIVE-11045
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.0
>         Environment: Hive 1.2.0, HDP 2.2, Hadoop 2.6, Tez 0.7.0
>            Reporter: Soundararajan Velu
>
>  TaskAttempt 3 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>         at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>         ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>         ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=1) 
> {"key":{"_col0":6417306,"_col1":{0:{"_col0":"2014-08-01 
> 02:14:02"}}},"value":{"_col0":"2014-08-01 
> 02:14:02","_col1":20140801,"_col2":"sc_jarvis_b2c","_col3":"action_override","_col4":"WITHIN_GRACE_PERIOD","_col5":"policy_override"}}
>         at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:413)
>         at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:381)
>         at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:206)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>         at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
>         at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
>         at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
>         at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>         ... 17 more
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=1) 
> {"key":{"_col0":6417306,"_col1":{0:{"_col0":"2014-08-01 
> 02:14:02"}}},"value":{"_col0":"2014-08-01 
> 02:14:02","_col1":20140801,"_col2":"sc_jarvis_b2c","_col3":"action_override","_col4":"WITHIN_GRACE_PERIOD","_col5":"policy_override"}}
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
>         at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:405)
>         ... 25 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=1) 
> {"key":{"_col0":6417306,"_col1":{0:{"_col0":"2014-08-01 
> 02:14:02"}}},"value":{"_col0":"2014-08-01 
> 02:14:02","_col1":20140801,"_col2":"sc_jarvis_b2c","_col3":"action_override","_col4":"WITHIN_GRACE_PERIOD","_col5":"policy_override"}}
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
>         ... 26 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>         at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:708)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
>         ... 27 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
