[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168814#comment-14168814 ]
Hive QA commented on HIVE-8394:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674344/HIVE-8394.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4137 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1236/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1236/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1236/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674344

> HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
> -------------------------------------------------------------
>
>                 Key: HIVE-8394
>                 URL: https://issues.apache.org/jira/browse/HIVE-8394
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.12.0, 0.14.0, 0.13.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>            Priority: Critical
>         Attachments: HIVE-8394.1.patch
>
>
> We've found situations in production where Pig queries using {{HCatStorer}},
> dynamic partitioning and {{opt.multiquery=true}} produce partitions in the
> output table, but the corresponding directories have no data files (in
> spite of Pig reporting non-zero records written to HDFS). I don't yet have a
> distilled test-case for this.
> Here's the code from FileOutputCommitterContainer after HIVE-7803:
> {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#FFFFCE}
>   @Override
>   public void commitTask(TaskAttemptContext context) throws IOException {
>     String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO);
>     if (!dynamicPartitioningUsed) {
>       //See HCATALOG-499
>       FileOutputFormatContainer.setWorkOutputPath(context);
>       getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context));
>     } else if (jobInfoStr != null) {
>       ArrayList<String> jobInfoList = (ArrayList<String>)HCatUtil.deserialize(jobInfoStr);
>       org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context);
>       for (String jobStr : jobInfoList) {
>         OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr);
>         FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext);
>         committer.commitTask(currTaskContext);
>       }
>     }
>   }
> {code}
> The serialized jobInfoList can't be retrieved, and hence the commit never
> completes. This is because Pig's MapReducePOStoreImpl deliberately clones
> both the TaskAttemptContext and the contained Configuration instance, thus
> separating the Configuration instances passed to
> {{FileOutputCommitterContainer::commitTask()}} and
> {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is
> unavailable to the Committer.
> One approach would have been to store state in the FileOutputFormatContainer.
> But that won't work since this is constructed via reflection in
> HCatOutputFormat (itself constructed via reflection by PigOutputFormat via
> HCatStorer). There's no guarantee that the instance is preserved.
> My only recourse seems to be to use a Singleton to store shared state. I'm
> loath to indulge in this brand of shenanigans.
> (Statics and container-reuse in Tez might not play well together, for
> instance.) It might work if we're careful about tearing down the singleton.
> Any other ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
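
For anyone following the singleton idea in the description, here is a minimal, self-contained sketch of the failure mode and of the proposed workaround. It is not HCatalog code and not the attached patch. {{FakeConf}} is only a stand-in for a copied Hadoop Configuration, and {{DynamicPartitionRegistry}}, the {{sketch.dyn.jobinfo}} key, the paths and the choice of the task-attempt ID as the lookup key are all hypothetical names and assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for a Configuration: a key/value bag whose copies are independent. */
final class FakeConf {
    private final Map<String, String> props = new HashMap<>();

    FakeConf() {}

    /** Copy constructor: snapshots the other instance, as a clone would. */
    FakeConf(FakeConf other) {
        props.putAll(other.props);
    }

    void set(String key, String value) {
        props.put(key, value);
    }

    String get(String key) {
        return props.get(key);
    }
}

/** Hypothetical static registry shared by writer and committer in one JVM. */
final class DynamicPartitionRegistry {
    private static final ConcurrentHashMap<String, List<String>> LOCATIONS =
            new ConcurrentHashMap<>();

    private DynamicPartitionRegistry() {}

    /** Called by the writer side for each partition location it produced. */
    static void record(String taskAttemptId, String location) {
        LOCATIONS
                .computeIfAbsent(taskAttemptId,
                        k -> Collections.synchronizedList(new ArrayList<String>()))
                .add(location);
    }

    /**
     * Called by the committer side. Removes the entry while reading it, so a
     * reused container never sees state left over from an earlier attempt.
     */
    static List<String> drain(String taskAttemptId) {
        List<String> locations = LOCATIONS.remove(taskAttemptId);
        return locations == null ? Collections.<String>emptyList() : locations;
    }
}

final class SharedStateSketch {
    // Hypothetical key, standing in for FileRecordWriterContainer.DYN_JOBINFO.
    static final String DYN_JOBINFO = "sketch.dyn.jobinfo";

    public static void main(String[] args) {
        // The writer and the committer each receive their own copy.
        FakeConf original = new FakeConf();
        FakeConf writerConf = new FakeConf(original);
        FakeConf committerConf = new FakeConf(original);

        // A value set on the writer's copy never reaches the committer's copy.
        writerConf.set(DYN_JOBINFO, "/warehouse/t/p=1");
        System.out.println("via conf: " + committerConf.get(DYN_JOBINFO));

        // The static registry is visible to both sides within the same JVM.
        String attemptId = "attempt_0001_m_000000_0";
        DynamicPartitionRegistry.record(attemptId, "/warehouse/t/p=1");
        DynamicPartitionRegistry.record(attemptId, "/warehouse/t/p=2");
        System.out.println("via registry: " + DynamicPartitionRegistry.drain(attemptId));
        System.out.println("after teardown: " + DynamicPartitionRegistry.drain(attemptId));
    }
}
```

Keying by attempt and removing the entry on read is one way to be "careful about tearing down the singleton". It still assumes the writer and the committer for a given attempt run in the same JVM, and it does nothing for an attempt that dies before its commit, so some cleanup on abort would probably be needed as well.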