Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-08-08 Thread Ryan Blue
> With an Iceberg raw store, I suspect that you might not need a storage handler and could go straight to an input/output format. You would probably need an input and output format for each of the storage formats: Iceberg{Orc,Parquet,Avro}{Input,Output}Format. I don't think that would work because

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-08-07 Thread Owen O'Malley
> On Jul 24, 2019, at 22:52, Adrien Guillo wrote: Hi Iceberg folks, In the last few months, we (the data infrastructure team at Airbnb) have been closely following the project. We are currently evaluating potential strategies to migrate our data warehouse to Iceberg. However,

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-07-29 Thread Daniel Weeks
Owen or Carl, Do you have any thoughts on this approach? We had previously discussed this but now that we've looked into it more closely there are a few areas that are unclear. HiveMetaHook looks like a good entry point for DDL (though as Adrien pointed out, it doesn't cover all operations). Ho

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-07-25 Thread RD
Hi Adrien, We at LinkedIn went through a similar thought process, but given that our Hive deployment is not that large, we are considering deprecating Hive and asking our users to move to Spark [since Spark supports HiveQL]. I'm guessing we'd have to invest in Spark's catalog AFAI

[DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-07-24 Thread Adrien Guillo
Hi Iceberg folks, In the last few months, we (the data infrastructure team at Airbnb) have been closely following the project. We are currently evaluating potential strategies to migrate our data warehouse to Iceberg. However, we have a very large Hive deployment, which means we can’t really do so