In the end, I was able to get my MetaHook class's commitInsertTable() method
called properly by Tez. However, it appears to be called on a different
instance of that class, which therefore doesn't share the same
Configuration object as the one that was initialized at the beginning of
the job. So I'm not sure there's a way to pass information between the two
instances.

Perhaps one way around this issue would be to create a temporary file named
after the query ID (the "hive.query.id" conf)? The first instance would dump
some information in that file, then the other instance would load it on the
other end to perform the commit.
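
A minimal sketch of that idea, assuming the state fits in simple key/value
pairs. The class name, method names and file naming scheme are hypothetical,
and it uses plain java.nio on a local directory only to stay self-contained.
On a real cluster the two instances may run on different hosts, so the file
would probably have to live on a shared filesystem instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/**
 * Hypothetical helper (not part of Hive or Tez): passes state between two
 * hook instances through a file named after the query ID.
 */
public class QueryStateFile {

    /** Builds the file path from a base directory and the "hive.query.id" value. */
    public static Path pathFor(Path baseDir, String queryId) {
        // Only accept IDs made of safe characters, so the ID cannot
        // introduce path separators into the file name.
        if (queryId == null || !queryId.matches("[A-Za-z0-9_.-]+")) {
            throw new IllegalArgumentException("Unexpected query ID: " + queryId);
        }
        return baseDir.resolve("commit-state-" + queryId + ".properties");
    }

    /** First instance: dump the information needed later for the commit. */
    public static void save(Path baseDir, String queryId, Properties state)
            throws IOException {
        Files.createDirectories(baseDir);
        try (OutputStream out = Files.newOutputStream(pathFor(baseDir, queryId))) {
            state.store(out, "commit state for query " + queryId);
        }
    }

    /** Second instance: load the information back, then remove the file. */
    public static Properties loadAndDelete(Path baseDir, String queryId)
            throws IOException {
        Path file = pathFor(baseDir, queryId);
        Properties state = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            state.load(in);
        }
        Files.deleteIfExists(file);
        return state;
    }
}
```

The first instance would call save() with the query ID read from its
Configuration, and commitInsertTable() would call loadAndDelete() with the
same ID, assuming both instances see the same "hive.query.id" value.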

Julien

On 2022/04/29 21:06:49 Julien Phalip wrote:
> Hi Peter,
>
> Looking at https://issues.apache.org/jira/browse/TEZ-4279, it seems that
> the fix might have been applied to 0.9.3. Is that correct? If so, do you
> think that just upgrading Tez to that version might be enough to allow
> "setupJob()", "commitJob()" and "abortJob()" to be called appropriately?
>
> I'm curious if the Hive changes that you've referenced are also needed or
> not. Would you mind clarifying what those Hive changes specifically
> achieve?
>
> Also, to answer your question, I'm currently working on a rewrite of the
> Hive-BigQuery connector (
> https://github.com/GoogleCloudDataproc/hive-bigquery-storage-handler).
> I'll be happy to post a quick update here once I complete all the changes
> that I'm working on, hopefully some time soon.
>
> Thanks,
>
> Julien
>
> On 2022/04/28 07:40:44 Peter Vary wrote:
> > Hi Julien,
> >
> > Hive 3.1.2 is dependent on 0.9 Tez, and I seem to remember having issues
> > running Hive 3.1.2 with Tez 0.10.
> > OTOH you might get away with patching 0.9 Tez with the appropriate
> > changes. I would ask this on the Tez mailing list.
> >
> > Are you trying out Hive-Iceberg integration, or is it another custom
> > SerDe?
> >
> > Thanks,
> > Peter
> >
> > > On 2022. Apr 27., at 19:12, Julien Phalip <jp...@gmail.com> wrote:
> > >
> > > Thanks Peter.
> > >
> > > By chance could I get things to work by keeping my current version of
> > > Hive (3.1.2) and only upgrading Tez? Which version(s) should I use?
> > >
> > > Thank you,
> > >
> > > Julien
> > >
> > > On 2022/04/27 08:59:08 Peter Vary wrote:
> > > > We had the same issue with the IcebergOutputCommitter.
> > > >
> > > > The first solution was this:
> > > > https://issues.apache.org/jira/browse/HIVE-25006
> > > > It needed https://issues.apache.org/jira/browse/TEZ-4279
> > > >
> > > > Later we ended up with this final solution:
> > > > https://issues.apache.org/jira/browse/HIVE-25208
> > > >
> > > > I hope this helps,
> > > > Peter
> > > >
> > > > > On 2022. Apr 27., at 1:46, Julien Phalip <jp...@gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I'm working on a custom storage handler. My custom output committer
> > > > > class gets called normally when using the "mr" engine. However, it
> > > > > seems to be entirely ignored when using the "tez" engine.
> > > > >
> > > > > I'm setting the JobConf's "mapred.output.committer.class" key to my
> > > > > fully-qualified output committer class name in the handler's
> > > > > configureJobConf() method. I've also tried the
> > > > > "hive.tez.mapreduce.output.committer.class" key and also tried setting
> > > > > those keys in the job properties in the configureOutputJobProperties()
> > > > > method. But that didn't work either.
> > > > >
> > > > > By the way, I'm using Hive 3.1.2 and Tez 0.9.1.
> > > > >
> > > > > Do you know what I might be missing or doing wrong?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Julien
