Re: Iceberg in Spark 3.0.0

2019-11-22 Thread Ryan Blue
I agree, let's create a spark-3.0 branch to start with. We've been building vectorization this way using the vectorized-reads branch. In the long term, we may want to split Spark into separate modules for 2.x and 3.x in the same branch, but for now we can at least get everything working with a 3.0

Re: Query about the semantics of "overwrite" in Iceberg

2019-11-22 Thread Ryan Blue
Saisai, Iceberg's behavior matches Hive's and Spark's behavior when using dynamic overwrite mode. Spark does not specify the correct behavior -- it varies by source. In addition, it isn't possible for a v2 source in 2.4 to implement the static overwrite mode that is Spark's default. The problem i
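To illustrate the distinction between the two overwrite modes mentioned above, here is a minimal sketch in plain Python. It is a toy model, not Spark or Iceberg code: the function names, the `day` partition column, and the dict-of-partitions table layout are all assumptions made for illustration, and "static" here models an overwrite issued without any partition clause.

```python
from collections import defaultdict


def partition_rows(rows, key):
    """Group rows (dicts) by the value of the partition column `key`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


def dynamic_overwrite(table, new_rows, key):
    """Replace only the partitions that appear in the incoming rows.

    Partitions with no incoming data are left untouched.
    """
    result = dict(table)
    result.update(partition_rows(new_rows, key))
    return result


def static_overwrite(table, new_rows, key):
    """Replace the whole table with the incoming rows.

    Models a static overwrite with no partition clause: every existing
    partition is dropped, whether or not new data was written to it.
    """
    return partition_rows(new_rows, key)


# Hypothetical table with two daily partitions.
table = {
    "2019-11-21": [{"day": "2019-11-21", "id": 1}],
    "2019-11-22": [{"day": "2019-11-22", "id": 2}],
}
incoming = [{"day": "2019-11-22", "id": 3}]

after_dynamic = dynamic_overwrite(table, incoming, "day")
after_static = static_overwrite(table, incoming, "day")
```

In this model, `after_dynamic` still contains the `2019-11-21` partition, while `after_static` contains only the partition that was written.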

Re: Iceberg in Spark 3.0.0

2019-11-22 Thread John Zhuge
+1 for Iceberg branch. Thanks for the contribution from you and your team! On Fri, Nov 22, 2019 at 8:29 AM Anton Okolnychyi wrote: > +1 on having a branch in Iceberg as we have for vectorized reads. > > - Anton > > On 22 Nov 2019, at 02:26, Saisai Shao wrote: > > Hi Ryan and team, > > Thanks a

Re: Iceberg in Spark 3.0.0

2019-11-22 Thread Anton Okolnychyi
+1 on having a branch in Iceberg as we have for vectorized reads. - Anton > On 22 Nov 2019, at 02:26, Saisai Shao wrote: > > Hi Ryan and team, > > Thanks a lot for your response. I was wondering how do we share our branch, > one possible way is that we maintain a forked Iceberg repo with Spark