Hello,

Last year, we started looking into integrating Iceberg with Hive and
working on a proof-of-concept. Unfortunately, the project was paused a few
months later, but we hope to resume our work this year, ideally in Q1.

We'll keep you posted.

Cheers,
Adrien

On Wed, Jan 8, 2020 at 10:43 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Thanks for the interest in Hive integration! I haven't heard about
> progress here lately, so it's good that you bring it up. Hopefully the
> other people that are interested can jump in with their current status.
>
> I think you're right that the MR input and output formats are a good place
> to start, but if I remember correctly, Hive ignores the output
> format's committer. That means we will need to plug in at the catalog level
> at some point. For that, Owen O'Malley has pointed us to the `RawStore` API,
> which backs metastore interaction.
>
> On Wed, Jan 8, 2020 at 6:28 AM Elliot West <tea...@gmail.com> wrote:
>
>> Hello,
>>
>> We're considering working on an integration of Iceberg with Apache Hive,
>> initially so that the latest snapshot of Iceberg tables can be queried via
>> Hive, but later to allow the writing of data using the Iceberg table format.
>>
>> I wanted to first check for the existence and status of any similar
>> efforts so that we do not find ourselves duplicating work unnecessarily.
>> I've checked both the Iceberg and Hive projects and can find no issues that
>> suggest that such an integration is underway or planned (only HIVE-19457
>> <https://issues.apache.org/jira/browse/HIVE-19457> which was raised by
>> myself and remains open).
>>
>> If one or more efforts are underway we'd certainly be open to
>> contributing. If not, we'd be keen to capture any thoughts from the
>> community on preferred or recommended technical approaches.
>>
>> I see that some work occurred on MR In/Out formats
>> <https://github.com/guilload/incubator-iceberg/pull/1> which might serve
>> as a foundation, so we'll certainly be investigating those further.
>>
>> Thanks,
>>
>> Elliot.
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
Adrien Guillo
Data Infrastructure
San Francisco
