Venkatasubrahmanian Narayanan created HADOOP-19091:
------------------------------------------------------

             Summary: Add support for Tez to MagicS3GuardCommitter
                 Key: HADOOP-19091
                 URL: https://issues.apache.org/jira/browse/HADOOP-19091
             Project: Hadoop Common
          Issue Type: Bug
          Components: tools
    Affects Versions: 3.3.3
         Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6.12.0
            Reporter: Venkatasubrahmanian Narayanan


The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
that of the job's application master when writing/reading the .pendingset file. 
This assumption is not valid when running with Tez, which creates slightly 
different JobIDs for tasks and the application master.
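
To illustrate the mismatch, here is a minimal sketch in Python rather than 
the committer's actual code. It assumes the .pendingset file is saved under a 
directory named after the JobID; the directory layout, helper names and JobID 
values below are all hypothetical and only model the behaviour described above.

```python
# Hypothetical model of the problem: NOT the real MagicS3GuardCommitter logic.
# Assumption: pendingset files live under <dest>/__magic/<jobId>/.

def pendingset_path(dest, job_id, task_attempt):
    """Path a task would write its .pendingset file to (assumed layout)."""
    return f"{dest}/__magic/{job_id}/{task_attempt}.pendingset"


def list_pendingsets(store, dest, job_id):
    """Files the application master would find when it lists its own job dir."""
    prefix = f"{dest}/__magic/{job_id}/"
    return sorted(p for p in store
                  if p.startswith(prefix) and p.endswith(".pendingset"))


def am_visible_pendingsets(task_job_id, am_job_id, dest="s3a://bucket/out"):
    """Simulate one task commit followed by the job commit in the AM."""
    store = set()  # stands in for the object store
    store.add(pendingset_path(dest, task_job_id, "attempt_0"))
    return list_pendingsets(store, dest, am_job_id)


# Same JobID on both sides (the MR case): the AM finds the task's file.
print(am_visible_pendingsets("job_1700000000000_0001",
                             "job_1700000000000_0001"))

# Slightly different JobIDs (made-up values standing in for the Tez case):
# the AM lists a different directory and finds nothing to commit.
print(am_visible_pendingsets("job_1700000000000_0001_1",
                             "job_1700000000000_0001"))
```

In this model the second call returns an empty list, so the job commit would 
have nothing to complete even though the task wrote its file successfully.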

While the MagicS3GuardCommitter is intended only for MRv2, it mostly works fine 
through an MRv1 wrapper when Hive/Pig (with some minor changes to Hive) run in 
MR mode. The issue only shows up when queries run on the Tez execution 
engine. I can upload a patch for Hive 3.1 that reproduces the error on EMR if 
needed.

Fixing this will probably require work in both Tez and Hadoop, so I wanted to 
start a discussion here on how exactly we should go about it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
