Hello,

Last year, we started looking into integrating Iceberg with Hive and began working on a proof-of-concept. Unfortunately, the project was paused a few months later, but we hope to resume our work this year, ideally in Q1.
We'll keep you posted.

Cheers,

Adrien

On Wed, Jan 8, 2020 at 10:43 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Thanks for the interest in Hive integration! I haven't heard about
> progress here lately, so it's good that you bring it up. Hopefully the
> other people that are interested can jump in with their current status.
>
> I think you're right that the MR input and output formats are a good place
> to start, but if I remember correctly, Hive ignores the output
> format's committer. That means we will need to plug in at the catalog level
> at some point. Owen O'Malley has pointed us to the `RawStore` API that is
> what backs metastore interaction for that.
>
> On Wed, Jan 8, 2020 at 6:28 AM Elliot West <tea...@gmail.com> wrote:
>
>> Hello,
>>
>> We're considering working on an integration of Iceberg with Apache Hive,
>> initially so that the latest snapshot of Iceberg tables can be queried via
>> Hive, but later to allow the writing of data using the Iceberg table format.
>>
>> I wanted to first check for the existence and status of any similar
>> efforts so that we do not find ourselves duplicating work unnecessarily.
>> I've checked both the Iceberg and Hive projects and can find no issues that
>> suggest that such an integration is underway or planned (only HIVE-19457
>> <https://issues.apache.org/jira/browse/HIVE-19457> which was raised by
>> myself and remains open).
>>
>> If one or more efforts is underway we'd certainly be open to
>> contributing. If not, we'd be keen to capture any thoughts from the
>> community on preferred or recommended technical approaches.
>>
>> I see that some work occurred on MR In/Out formats
>> <https://github.com/guilload/incubator-iceberg/pull/1> which might serve
>> as a foundation, so we'll certainly be investigating those further.
>>
>> Thanks,
>>
>> Elliot.
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

--
Adrien Guillo
Data Infrastructure
San Francisco