Re: [DISCUSS] Dropping Spark 2.4 support

2023-04-14 Thread Anjali Norwood
Hi Fokko, Ryan,

Netflix is still on Spark-2.4.4 with Iceberg-0.9. We are actively migrating
to Spark-3.x and Iceberg 1.1 (or later). I do not anticipate us using
Spark-2.4.4 with newer versions of Iceberg (>0.9).
If the plan is to not support Spark-2.4.4 with Iceberg >= 1.X, that should
be ok.
@John Zhuge  can you please chime in?

thanks,
Anjali


On Fri, Apr 14, 2023 at 10:56 AM Ryan Blue  wrote:

> Overall I'm +1, but could be convinced otherwise.
>
> Spark 2.4 is old and doesn't really function properly because the Spark
> Catalog API was missing at the time. And people can still use older
> versions of Iceberg that support Spark 2.4 if they need it because the
> Iceberg spec guarantees forward compatibility.
>
> That said, I'd love to hear from more people on this. I think it would be
> great to drop support, but I don't know how many people still use it. Is
> upgrading Hadoop a good reason to drop support for an engine? Hadoop seems
> like a minor concern to me unless it is blocking something.
>
> Ryan
>
> On Thu, Apr 13, 2023 at 12:54 PM Jack Ye  wrote:
>
>> +1 for dropping 2.4 support
>>
>> Best,
>> Jack Ye
>>
>> On Thu, Apr 13, 2023 at 10:59 AM Fokko Driesprong 
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm working on moving to Hadoop 3.x, and one blocker is that it seems to
>>> be incompatible with Spark 2.4. I wanted to ask if people are still on
>>> Spark 2.4 and what we think of dropping that support. The last release,
>>> Spark 2.4.8, was on 2021-05-17, and the 2.4 branch on the Spark GitHub
>>> repository looks stale, so I don't expect any further releases.
>>>
>>> Before creating a PR I would like to check on the mailing list whether
>>> anyone has any objections. If so, please let us know.
>>>
>>> Thanks,
>>> Fokko Driesprong
>>>
>>
>
> --
> Ryan Blue
> Tabular
>


Re: Approaching Vectorized Reading in Iceberg ..

2019-07-31 Thread Anjali Norwood
Hi Gautam,

You wrote: '- The filters are not being applied in columnar fashion; they
are being applied row by row, as in Iceberg each filter visitor is stateless
and applied separately on each row's column.' This should not be a
problem for this particular benchmark, as
IcebergSourceFlatParquetDataReadBenchmark does not apply filters.

-Anjali.

On Wed, Jul 31, 2019 at 1:44 PM Gautam  wrote:

> Hey Samarth,
>   Sorry about the delay. I ran into some bottlenecks for which
> I had to add more code to be able to run benchmarks. I've checked in my
> latest changes to my fork's *vectorized-read* branch [0].
>
> Here are the early numbers from the initial implementation...
>
> *Benchmark Data:*
> - 10 files
> - 9 MB each
> - 1 million rows (1 row group)
>
> Ran the benchmark using the JMH benchmark tool within
> incubator-iceberg/spark/src/jmh
> using different batch sizes and compared it to Spark's vectorized and
> non-vectorized readers.
>
> *Command: *
> ./gradlew clean   :iceberg-spark:jmh
>  -PjmhIncludeRegex=IcebergSourceFlatParquetDataReadBenchmark
> -PjmhOutputPath=benchmark/iceberg-source-flat-parquet-data-read-benchmark-result.txt
>
>
>
> Benchmark                                                                            Mode  Cnt   Score   Error  Units
> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceNonVectorized                 ss     5  16.172 ± 0.750   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceVectorized                    ss     5   6.430 ± 0.136   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readIceberg                                 ss     5  15.287 ± 0.212   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized100k                   ss     5  18.310 ± 0.498   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized10k                    ss     5  18.020 ± 0.378   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized5k                     ss     5  17.769 ± 0.412   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionFileSourceNonVectorized   ss     5   2.794 ± 0.141   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionFileSourceVectorized      ss     5   1.063 ± 0.140   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionIceberg                   ss     5   2.966 ± 0.133   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionIcebergVectorized100k     ss     5   2.015 ± 0.261   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionIcebergVectorized10k      ss     5   1.972 ± 0.105   s/op
> IcebergSourceFlatParquetDataReadBenchmark.readWithProjectionIcebergVectorized5k       ss     5   2.065 ± 0.079   s/op
>
>
>
> So it seems vectorization is not adding any improvement over the
> non-vectorized read path. I'm currently trying to profile where the time
> is being spent.
>
> *Here is my initial speculation of why this is slow:*
>  - There's too much overhead that seems to come from creating the batches.
> I'm creating a new instance of ColumnarBatch on each read [1]. This should
> probably be re-used (see the sketch after this list).
>  - Although I am reusing the *FieldVector* across batched reads [2], I
> wrap them in new *ArrowColumnVector*s [3] on each read call. I didn't
> think this would be a big deal, but maybe it is.
>  - The filters are not being applied in columnar fashion; they are being
> applied row by row, as in Iceberg each filter visitor is stateless and
> applied separately on each row's column.
>  - I'm trying to re-use the BufferAllocator that Arrow provides [4].
> I don't know if there are other strategies for using this. Will look more into it.
>  - I'm batching until the row group ends and restricting the last batch to
> the row group boundary. I should probably spill over to the next row group to
> fill that batch. I don't know if this would help, as from what I can tell I don't
> think *VectorizedParquetRecordReader* does this.
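>
> To make the first two points concrete, here is a minimal sketch of reusing
> the batch wrapper across read calls. This is only an illustration, not the
> code in the branch: it assumes Spark 2.4's public ColumnarBatch and
> ArrowColumnVector constructors, and BatchReader/process are hypothetical
> stand-ins.
>
>   import java.util.List;
>   import org.apache.arrow.vector.FieldVector;
>   import org.apache.spark.sql.vectorized.ArrowColumnVector;
>   import org.apache.spark.sql.vectorized.ColumnarBatch;
>
>   static void readAll(List<FieldVector> vectors, BatchReader reader) {
>     // Wrap each Arrow FieldVector once, up front, instead of on every read call.
>     ArrowColumnVector[] columns = new ArrowColumnVector[vectors.size()];
>     for (int i = 0; i < vectors.size(); i++) {
>       columns[i] = new ArrowColumnVector(vectors.get(i));
>     }
>
>     // Reuse a single ColumnarBatch; only the row count changes per batch.
>     ColumnarBatch batch = new ColumnarBatch(columns);
>     while (reader.hasNext()) {
>       int rowsRead = reader.nextBatch();  // hypothetical: refills the field vectors in place
>       batch.setNumRows(rowsRead);
>       process(batch);                     // hypothetical downstream consumer
>     }
>   }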
>
> I'll try to provide more insights once I improve my code. But if folks have
> other insights on where we can improve things, I'd gladly try
> them.
>
> Cheers,
> - Gautam.
>
> [0] - https://github.com/prodeezy/incubator-iceberg/tree/vectorized-read
> [1] -
> https://github.com/prodeezy/incubator-iceberg/blob/vectorized-read/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L655
> [2] -
> https://github.com/prodeezy/incubator-iceberg/blob/vectorized-read/spark/src/main/java/org/apache/iceberg/spark/data/vector/VectorizedParquetValueReaders.java#L108
> [3] -
> https://github.com/prodeezy/incubator-iceberg/blob/vectorized-read/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L651
> [4] -
> https://github.com/prodeezy/incubator-iceberg/blob/vectorized-read/spark/src/main/java/org/apache/iceberg/spark/data/vector/VectorizedSparkParquetReaders.java#L92
>
>
> On Tue, Jul 30, 2019 at 5:13 PM Samarth Jain 
> wrote:
>
>> Hey Gautam,
>>
>> Wanted to check back with you and see if you had any success running the
>> benchmark and if you have any numbers to share.
>>
>>
>>
>

Re: Encouraging performance results for Vectorized Iceberg code

2019-08-08 Thread Anjali Norwood
Good suggestion Ryan. Added dev@iceberg now.

Dev: Please see the early vectorized Iceberg performance results a couple of
emails down. This is WIP.

thanks,
Anjali.

On Thu, Aug 8, 2019 at 10:39 AM Ryan Blue  wrote:

> Hi everyone,
>
> Is it possible to copy the Iceberg dev list when sending these emails?
> There are other people in the community that are interested, like Palantir.
> If there isn't anything sensitive then let's try to be more inclusive.
> Thanks!
>
> rb
>
> On Wed, Aug 7, 2019 at 10:34 PM Anjali Norwood 
> wrote:
>
>> Hi Gautam, Padma,
>> We wanted to update you before Gautam takes off for vacation.
>>
>> Samarth and I profiled the code and found the following:
>> Profiling the IcebergSourceFlatParquetDataReadBenchmark (10 files, 10M
>> rows, a single long column) using VisualVM shows two places where CPU time
>> can be optimized:
>> 1) Iterator abstractions (triple iterators, page iterators etc) seem to
>> take up quite a bit of time. Not using these iterators or making them
>> 'batched' iterators and moving the reading of the data close to the file
>> should help ameliorate this problem.
>> 2) Current code goes back and forth between definition levels and value
>> reads through the levels of iterators. Quite a bit of CPU time is spent
>> here. Reading a batch of primitive values at once after consulting the
>> definition level should help improve performance.
>>
>> So, we prototyped the code to walk over the definition levels and read
>> corresponding values in batches (read values till we hit a null, then read
>> nulls till we hit values and so on) and made the iterators batched
>> iterators. Here are the results:
>>
>> Benchmark                                                                Mode  Cnt   Score   Error  Units
>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceNonVectorized     ss     5  10.247 ± 0.202   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceVectorized        ss     5   3.747 ± 0.206   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIceberg                     ss     5  11.286 ± 0.457   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized100k       ss     5   6.088 ± 0.324   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized10k        ss     5   5.875 ± 0.378   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized1k         ss     5   6.029 ± 0.387   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized5k         ss     5   6.106 ± 0.497   s/op
>>
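>> To illustrate the batched walk described above (read a run of values until
>> a null, then a run of nulls until values resume), here is a hypothetical
>> sketch; DefLevelReader, LongValuesReader and LongVector are stand-in names
>> for illustration, not the prototype's actual classes:
>>
>>   // Walk definition levels in runs and do one bulk read/fill per run.
>>   static void readBatch(DefLevelReader defLevels, LongValuesReader values,
>>                         LongVector vector, int batchSize, int maxDefLevel) {
>>     int row = 0;
>>     while (row < batchSize) {
>>       boolean nonNullRun = defLevels.currentLevel() == maxDefLevel;
>>       int run = defLevels.runLength();       // consecutive rows in the same state
>>       if (nonNullRun) {
>>         values.readLongs(vector, row, run);  // bulk-read the run of values
>>       } else {
>>         vector.setNulls(row, run);           // bulk-fill the run of nulls
>>       }
>>       defLevels.advance(run);
>>       row += run;
>>     }
>>   }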
>>
>> Moreover, as I mentioned to Gautam on chat, we prototyped reading the
>> string column as a byte array without decoding it into UTF-8 (the above
>> changes were not made at the time) and saw significant performance
>> improvements there (21.18 secs before vs 13.031 secs with the change). When
>> used along with batched iterators, these numbers should get better.
>>
>> Note that we haven't tightened/profiled the new code yet (we will start
>> on that next). Just wanted to share some early positive results.
>>
>> regards,
>> Anjali.
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: Encouraging performance results for Vectorized Iceberg code

2019-08-08 Thread Anjali Norwood
Yes, will do so early next week if not sooner.

thanks,
Anjali.

On Thu, Aug 8, 2019 at 4:45 PM Gautam Kowshik 
wrote:

> Thanks Anjali and Samarth,
>These look good! Great progress.  Can you push your changes to the
> vectorized-read branch please?
>
> Sent from my iPhone
>
> On Aug 8, 2019, at 11:56 AM, Anjali Norwood  wrote:
>
> Good suggestion Ryan. Added dev@iceberg now.
>
> Dev: Please see the early vectorized Iceberg performance results a couple of
> emails down. This is WIP.
>
> thanks,
> Anjali.
>
> On Thu, Aug 8, 2019 at 10:39 AM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> Is it possible to copy the Iceberg dev list when sending these emails?
>> There are other people in the community that are interested, like Palantir.
>> If there isn't anything sensitive then let's try to be more inclusive.
>> Thanks!
>>
>> rb
>>
>> On Wed, Aug 7, 2019 at 10:34 PM Anjali Norwood 
>> wrote:
>>
>>> Hi Gautam, Padma,
>>> We wanted to update you before Gautam takes off for vacation.
>>>
>>> Samarth and I profiled the code and found the following:
>>> Profiling the IcebergSourceFlatParquetDataReadBenchmark (10 files, 10M
>>> rows, a single long column) using visualVM shows two places where CPU time
>>> can be optimized:
>>> 1) Iterator abstractions (triple iterators, page iterators etc) seem to
>>> take up quite a bit of time. Not using these iterators or making them
>>> 'batched' iterators and moving the reading of the data close to the file
>>> should help ameliorate this problem.
>>> 2) Current code goes back and forth between definition levels and value
>>> reads through the levels of iterators. Quite a bit of CPU time is spent
>>> here. Reading a batch of primitive values at once after consulting the
>>> definition level should help improve performance.
>>>
>>> So, we prototyped the code to walk over the definition levels and read
>>> corresponding values in batches (read values till we hit a null, then read
>>> nulls till we hit values and so on) and made the iterators batched
>>> iterators. Here are the results:
>>>
>>> Benchmark                                                                Mode  Cnt   Score   Error  Units
>>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceNonVectorized     ss     5  10.247 ± 0.202   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceVectorized        ss     5   3.747 ± 0.206   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readIceberg                     ss     5  11.286 ± 0.457   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized100k       ss     5   6.088 ± 0.324   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized10k        ss     5   5.875 ± 0.378   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized1k         ss     5   6.029 ± 0.387   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized5k         ss     5   6.106 ± 0.497   s/op
>>>
>>>
>>> Moreover, as I mentioned to Gautam on chat, we prototyped reading the
>>> string column as a byte array without decoding it into UTF8 (above changes
>>> were not made at the time) and we saw significant performance improvements
>>> there (21.18 secs before Vs 13.031 secs with the change). When used along
>>> with batched iterators, these numbers should get better.
>>>
>>> Note that we haven't tightened/profiled the new code yet (we will start
>>> on that next). Just wanted to share some early positive results.
>>>
>>> regards,
>>> Anjali.
>>>
>>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>


Re: Encouraging performance results for Vectorized Iceberg code(Internet mail)

2019-08-12 Thread Anjali Norwood
Hi Padma, Gautam, All,

Our (Samarth's and my) WIP vectorized code is here:
https://github.com/anjalinorwood/incubator-iceberg/pull/1.
Dan, can you please merge it to the 'vectorized-read' branch when you get a
chance? Thanks!

regards,
Anjali.




On Mon, Aug 12, 2019 at 10:49 AM Ryan Blue 
wrote:

> Li,
>
> You're right that the 10k and similar numbers indicate the batch size.
>
> Scores can be interpreted using the "units" column at the end. In this
> case, seconds per operation, so lower is better.
>
> Error is the measurement error. This indicates confidence that the actual
> rate of execution is, for example, within 0.378 of the average 5.875
> seconds per operation, so between around 5.50 and 6.25 seconds per op.
>
> On Sun, Aug 11, 2019 at 7:11 PM timmycheng(程力) 
> wrote:
>
>> Thanks for broadcasting! Just have a few questions to better understand
>> the awesome work.
>>
>>
>>
>> Could you give a little more detail on the score and error columns? Does
>> error mean every time the query hits a null?
>>
>> Shall I assume 5k/10k means the number of rows? What do we learn from
>> comparing to IcebergSourceFlatParquetDataReadBenchmark.readIceberg? Or
>> rather, what numbers are we comparing to?
>>
>>
>>
>> -Li
>>
>>
>>
>> *From**: *Anjali Norwood 
>> *Reply-To**: *"dev@iceberg.apache.org" 
>> *Date**: *Saturday, August 10, 2019, 4:47 AM
>> *To**: *Ryan Blue , "dev@iceberg.apache.org" <
>> dev@iceberg.apache.org>
>> *Cc**: *Gautam , "ppa...@apache.org" <
>> ppa...@apache.org>, Samarth Jain , Daniel Weeks <
>> dwe...@netflix.com>
>> *Subject**: *Re: Encouraging performance results for Vectorized Iceberg
>> code(Internet mail)
>>
>>
>>
>> Good suggestion Ryan. Added dev@iceberg now.
>>
>>
>>
>> Dev: Please see the early vectorized Iceberg performance results a couple of
>> emails down. This is WIP.
>>
>>
>>
>> thanks,
>>
>> Anjali.
>>
>>
>>
>> On Thu, Aug 8, 2019 at 10:39 AM Ryan Blue  wrote:
>>
>> Hi everyone,
>>
>>
>>
>> Is it possible to copy the Iceberg dev list when sending these emails?
>> There are other people in the community that are interested, like Palantir.
>> If there isn't anything sensitive then let's try to be more inclusive.
>> Thanks!
>>
>>
>>
>> rb
>>
>>
>>
>> On Wed, Aug 7, 2019 at 10:34 PM Anjali Norwood 
>> wrote:
>>
>> Hi Gautam, Padma,
>> We wanted to update you before Gautam takes off for vacation.
>>
>> Samarth and I profiled the code and found the following:
>> Profiling the IcebergSourceFlatParquetDataReadBenchmark (10 files, 10M
>> rows, a single long column) using visualVM shows two places where CPU time
>> can be optimized:
>> 1) Iterator abstractions (triple iterators, page iterators etc) seem to
>> take up quite a bit of time. Not using these iterators or making them
>> 'batched' iterators and moving the reading of the data close to the file
>> should help ameliorate this problem.
>> 2) Current code goes back and forth between definition levels and value
>> reads through the levels of iterators. Quite a bit of CPU time is spent
>> here. Reading a batch of primitive values at once after consulting the
>> definition level should help improve performance.
>>
>> So, we prototyped the code to walk over the definition levels and read
>> corresponding values in batches (read values till we hit a null, then read
>> nulls till we hit values and so on) and made the iterators batched
>> iterators. Here are the results:
>>
>> Benchmark                                                                Mode  Cnt   Score   Error  Units
>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceNonVectorized     ss     5  10.247 ± 0.202   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceVectorized        ss     5   3.747 ± 0.206   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIceberg                     ss     5  11.286 ± 0.457   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized100k       ss     5   6.088 ± 0.324   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized10k        ss     5   5.875 ± 0.378   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized1k         ss     5   6.029 ± 0.387   s/op
>> IcebergSourceFlatParquetDataReadBenchmark.readIcebergVectorized5k         ss     5   6.106 ± 0.497   s/op
>>
>>
>>
>> Moreover, as I mentioned to Gautam on chat, we prototyped reading the
>> string column as a byte array without decoding it into UTF8 (above changes
>> were not made at the time) and we saw significant performance improvements
>> there (21.18 secs before Vs 13.031 secs with the change). When used along
>> with batched iterators, these numbers should get better.
>>
>>
>>
>> Note that we haven't tightened/profiled the new code yet (we will start
>> on that next). Just wanted to share some early positive results.
>>
>>
>>
>> regards,
>>
>> Anjali.
>>
>>
>>
>>
>>
>>
>> --
>>
>> Ryan Blue
>>
>> Software Engineer
>>
>> Netflix
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: Encouraging performance results for Vectorized Iceberg code(Internet mail)

2019-08-13 Thread Anjali Norwood
It is now merged to the 'vectorized-read' branch. Thanks Ryan.

-Anjali.

On Mon, Aug 12, 2019 at 6:12 PM Anjali Norwood  wrote:

> Hi Padma, Gautam, All,
>
> Our (Samarth's and mine) wip vectorized code is here:
> https://github.com/anjalinorwood/incubator-iceberg/pull/1.
> Dan, can you please merge it to 'vectorized-read' branch when you get a
> chance? Thanks!
>
> regards,
> Anjali.
>
>
>
>
> On Mon, Aug 12, 2019 at 10:49 AM Ryan Blue 
> wrote:
>
>> Li,
>>
>> You're right that the 10k and similar numbers indicate the batch size.
>>
>> Scores can be interpreted using the "units" column at the end. In this
>> case, seconds per operation, so lower is better.
>>
>> Error is the measurement error. This indicates confidence that the actual
>> rate of execution is, for example, within 0.378 of the average 5.875
>> seconds per operation, so between around 5.50 and 6.25 second per op.
>>
>> On Sun, Aug 11, 2019 at 7:11 PM timmycheng(程力) 
>> wrote:
>>
>>> Thanks for broadcasting! Just have a few questions to better understand
>>> the awesome work.
>>>
>>>
>>>
>>> Could you give a little more details on the score and error columns?
>>> Does error mean every time the query hits a null?
>>>
>>> Shall I assume 5k/10k means the number of rows? What do we learn from
>>> compare to IcebergSourceFlatParquetDataReadBenchmark.readIceberg? Or
>>> rather, what numbers are we comparing to?
>>>
>>>
>>>
>>> -Li
>>>
>>>
>>>
>>> *From**: *Anjali Norwood 
>>> *Reply-To**: *"dev@iceberg.apache.org" 
>>> *Date**: *Saturday, August 10, 2019, 4:47 AM
>>> *To**: *Ryan Blue , "dev@iceberg.apache.org" <
>>> dev@iceberg.apache.org>
>>> *Cc**: *Gautam , "ppa...@apache.org" <
>>> ppa...@apache.org>, Samarth Jain , Daniel Weeks <
>>> dwe...@netflix.com>
>>> *Subject**: *Re: Encouraging performance results for Vectorized Iceberg
>>> code(Internet mail)
>>>
>>>
>>>
>>> Good suggestion Ryan. Added dev@iceberg now.
>>>
>>>
>>>
>>> Dev: Please see the early vectorized Iceberg performance results a couple of
>>> emails down. This is WIP.
>>>
>>>
>>>
>>> thanks,
>>>
>>> Anjali.
>>>
>>>
>>>
>>> On Thu, Aug 8, 2019 at 10:39 AM Ryan Blue  wrote:
>>>
>>> Hi everyone,
>>>
>>>
>>>
>>> Is it possible to copy the Iceberg dev list when sending these emails?
>>> There are other people in the community that are interested, like Palantir.
>>> If there isn't anything sensitive then let's try to be more inclusive.
>>> Thanks!
>>>
>>>
>>>
>>> rb
>>>
>>>
>>>
>>> On Wed, Aug 7, 2019 at 10:34 PM Anjali Norwood 
>>> wrote:
>>>
>>> Hi Gautam, Padma,
>>> We wanted to update you before Gautam takes off for vacation.
>>>
>>> Samarth and I profiled the code and found the following:
>>> Profiling the IcebergSourceFlatParquetDataReadBenchmark (10 files, 10M
>>> rows, a single long column) using visualVM shows two places where CPU time
>>> can be optimized:
>>> 1) Iterator abstractions (triple iterators, page iterators etc) seem to
>>> take up quite a bit of time. Not using these iterators or making them
>>> 'batched' iterators and moving the reading of the data close to the file
>>> should help ameliorate this problem.
>>> 2) Current code goes back and forth between definition levels and value
>>> reads through the levels of iterators. Quite a bit of CPU time is spent
>>> here. Reading a batch of primitive values at once after consulting the
>>> definition level should help improve performance.
>>>
>>> So, we prototyped the code to walk over the definition levels and read
>>> corresponding values in batches (read values till we hit a null, then read
>>> nulls till we hit values and so on) and made the iterators batched
>>> iterators. Here are the results:
>>>
>>> Benchmark                                                                Mode  Cnt   Score   Error  Units
>>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceNonVectorized     ss     5  10.247 ± 0.202   s/op
>>> IcebergSourceFlatParquetDataReadBenchmark.readFileSourceVectorized        ss     5   3.747 ± 0.206   s/op
>>>
>>>

Re: New committer and PPMC member, Anton Okolnychyi

2019-09-03 Thread Anjali Norwood
Congratulations Anton!!

Regards
Anjali

On Tue, Sep 3, 2019 at 2:11 AM Anton Okolnychyi
 wrote:

> Thanks everyone!
>
> I am really excited to see how the community evolves over time and that we
> have more and more great companies and folks joining the project.
> I hope we will continue to build a strong and healthy community!
>
> - Anton
>
> > On 3 Sep 2019, at 09:04, 陈俊杰  wrote:
> >
> > Congratulations Anton!
> >
> >
> > Gautam  wrote on Tue, Sep 3, 2019 at 1:26 PM:
> >>
> >> Way to go Anton! Appreciate all the work and guidance.
> >>
> >> On Tue, Sep 3, 2019 at 9:33 AM John Zhuge  wrote:
> >>>
> >>> Congratulations Anton!
> >>>
> >>> On Mon, Sep 2, 2019 at 8:45 PM Mouli Mukherjee <
> moulimukher...@gmail.com> wrote:
> 
>  Congratulations Anton!
> 
>  On Mon, Sep 2, 2019, 8:38 PM Saisai Shao 
> wrote:
> >
> > Congrats Anton!
> >
> > Best regards,
> > Saisai
> >
> > Daniel Weeks  wrote on Tue, Sep 3, 2019 at 7:48 AM:
> >>
> >> Congrats Anton!
> >>
> >> On Fri, Aug 30, 2019 at 1:54 PM Edgar Rodriguez <
> edgar.rodrig...@airbnb.com.invalid> wrote:
> >>>
> >>> Nice! Congratulations, Anton!
> >>>
> >>> Cheers,
> >>>
> >>> On Fri, Aug 30, 2019 at 1:42 PM Dongjoon Hyun <
> dongjoon.h...@gmail.com> wrote:
> 
>  Congratulations, Anton! :D
> 
>  Bests,
>  Dongjoon.
> 
>  On Fri, Aug 30, 2019 at 10:06 AM Ryan Blue 
> wrote:
> >
> > I'd like to congratulate Anton Okolnychyi, who was just invited
> to join the Iceberg committers and PPMC!
> >
> > Thanks for all your contributions, Anton!
> >
> > rb
> >
> > --
> > Ryan Blue
> >>>
> >>>
> >>>
> >>> --
> >>> Edgar Rodriguez
> >>>
> >>>
> >>>
> >>> --
> >>> John Zhuge
>
>


Open issues against Vectorized Iceberg Read milestone

2019-10-08 Thread Anjali Norwood
Thank you Gautam for the summary of the discussion.

Hello Devs,

The follow-up vectorized Iceberg tasks are now captured as issues against
the milestone.
They are listed below for convenience.
https://github.com/apache/incubator-iceberg/issues/518
https://github.com/apache/incubator-iceberg/issues/519
https://github.com/apache/incubator-iceberg/issues/520
https://github.com/apache/incubator-iceberg/issues/521
https://github.com/apache/incubator-iceberg/issues/522

thanks,
Anjali.


On Mon, Oct 7, 2019 at 12:41 PM Gautam  wrote:

> Hello Devs,
> We met to discuss progress and next steps on Vectorized
> read path in Iceberg. Here are my notes from the sync. Feel free to reply
> with clarifications in case I mis-quoted or missed anything.
>
> *Attendees*:
>
> Anjali Norwood
> Padma Pennumarthy
> Ryan Blue
> Samarth Jain
> Gautam Kowshik
>
> *Topics *
> - Progress on Arrow Based Vectorization Reads
> - Features being worked on and possible improvements
> - Pending bottlenecks
> - Identify things to collaborate on going forward.
> - Next steps
>
> Arrow Vectorized Reader
>
>   Samarth/Anjali:
>
>- Working on Arrow based vectorization [1]
>- At performance parity between Spark and Iceberg on primitive types
>except strings.
>- Planning to do dictionary encoding on strings
>- New Arrow version gives a boost in performance and fixes issues
>- Vectorized batched reading of definition levels improves performance
>- Some checks had to be turned off in Arrow to push performance
>further, viz. null checks and unsafe memory access
>- Implemented prefetching of Parquet pages; this improves perf on
>primitives beyond vanilla Spark
>
>
>Ryan:
>
>
>- The Arrow version should not be tied to Spark; it should have an
>Iceberg-specific implementation binding so it will work with any reader, not just Spark.
>- Add DatasourceV2Strategy to handle nested pruning into Spark
>upstream. Will coordinate with Apple folks to add their work into Spark.
>- Need the ability to fall back to row-based reads for cases where
>columnar isn't possible. A config option, maybe.
>- Can add options where columnar batches are read into InternalRow and
>returned to the Datasource (see the sketch below).
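>
> For the last point above, one possible shape (ColumnarBatch.rowIterator() is
> an existing Spark 2.4 API; everything else here is a hypothetical sketch,
> not an agreed design):
>
>   import java.util.Iterator;
>   import org.apache.spark.sql.catalyst.InternalRow;
>   import org.apache.spark.sql.vectorized.ColumnarBatch;
>
>   // Expose a columnar batch to the data source as rows: each InternalRow is a
>   // mutable view over the current position in the batch.
>   static Iterator<InternalRow> rowsFromBatch(ColumnarBatch batch) {
>     return batch.rowIterator();
>   }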
>
>   Padma:
>
>- Possibly contribute the Arrow work back to the Arrow project. (Can punt
>on this for now to move forward faster on current work.)
>- Was looking into complex type support for Arrow based reads.
>
>
> V1 Vectorized Read Path [2]
>
> Gautam:
>
>- Been working on the V1 vectorized short-circuit read path [3]. (This is
>probably not as useful once we have full-featured support for Arrow based
>reads.)
>- Will work on getting schema evolution parts working with this reader
>by getting projection unit/integration tests working. (This can be
>contributed back into the Iceberg repo to unblock this path if we want to have
>that option until the Arrow based read is fully capable.)
>
>
>
> *Next steps:*
>
>- Unit tests for the current Arrow based work.
>- Provide options to perform vectorized batch reads, row-oriented
>reads and InternalRow-over-batch reads.
>- Separate the Arrow work in Iceberg into its own sub-module.
>- Dictionary encoding support for strings in Arrow.
>- Complex type support for Arrow.
>- File issues for the above and identify how to distribute work
>between us.
>
>
>
>
> [1]  https://github.com/apache/incubator-iceberg/tree/vectorized-read
>
> [2]  https://github.com/apache/incubator-iceberg/pull/462
>
> [3]
> https://github.com/prodeezy/incubator-iceberg/commits/v1-vectorized-reader
>
>
>


Re: [ANNOUNCE] Apache Iceberg release 0.7.0-incubating

2019-10-28 Thread Anjali Norwood
Congratulations!!

Regards
Anjali

On Sun, Oct 27, 2019 at 2:56 PM Ryan Blue  wrote:

> Here's the release announcement that I just sent out. Thanks to everyone
> that contributed to this release!
>
> -- Forwarded message -
> From: Ryan Blue 
> Date: Sun, Oct 27, 2019 at 2:55 PM
> Subject: [ANNOUNCE] Apache Iceberg release 0.7.0-incubating
> To: 
>
>
> I'm pleased to announce the release of Apache Iceberg 0.7.0-incubating!
>
> Apache Iceberg is an open table format for huge analytic datasets. Iceberg
> delivers high query performance for tables with tens of petabytes of data,
> as well as with atomic commits with concurrent writers and side-effect free
> schema evolution.
>
> The source release can be downloaded from:
> https://www.apache.org/dyn/closer.cgi/incubator/iceberg/apache-iceberg-0.7.0-incubating/apache-iceberg-0.7.0-incubating.tar.gz
>  –
> signature
> 
> – sha512
> 
>
> Java artifacts are available from Maven Central, including an all-in-one Spark
> 2.4 runtime Jar
> .
> To use Iceberg in Spark 2.4, add the runtime Jar to the jars folder of your
> Spark install.
>
> Additional information is available at http://iceberg.apache.org/releases/
>
> Thanks to everyone that contributed to this release! This is the first
> Apache release of Iceberg!
>
>
> --
> Ryan Blue
>
>
> --
> Ryan Blue
>


Re: Welcome new committer and PPMC member Ratandeep Ratti

2020-02-17 Thread Anjali Norwood
Congratulations Ratandeep!!

regards.
Anjali.

On Mon, Feb 17, 2020 at 12:19 AM Manish Malhotra <
manish.malhotra.w...@gmail.com> wrote:

> Congratulations 🎉!!
>
> On Sun, Feb 16, 2020 at 8:37 PM RD  wrote:
>
>> Thanks everyone!
>>
>> -Best,
>> R.
>>
>> On Sun, Feb 16, 2020 at 7:39 PM David Christle
>>  wrote:
>>
>>> Congrats!!!
>>>
>>>
>>>
>>> *From: *Jacques Nadeau 
>>> *Reply-To: *"dev@iceberg.apache.org" 
>>> *Date: *Sunday, February 16, 2020 at 7:20 PM
>>> *To: *Iceberg Dev List 
>>> *Subject: *Re: Welcome new committer and PPMC member Ratandeep Ratti
>>>
>>>
>>>
>>> Congrats!
>>>
>>>
>>>
>>> On Sun, Feb 16, 2020, 7:06 PM xiaokun ding 
>>> wrote:
>>>
>>> CONGRATULATIONS
>>>
>>>
>>>
>>> 李响  wrote on Mon, Feb 17, 2020 at 11:05 AM:
>>>
>>> CONGRATULATIONS!!!
>>>
>>>
>>>
>>> On Mon, Feb 17, 2020 at 9:50 AM Junjie Chen 
>>> wrote:
>>>
>>> Congratulations!
>>>
>>>
>>>
>>> On Mon, Feb 17, 2020 at 5:48 AM Ryan Blue  wrote:
>>>
>>> Hi everyone,
>>>
>>>
>>>
>>> I'd like to congratulate Ratandeep Ratti, who was just invited to join
>>> the Iceberg committers and PPMC!
>>>
>>>
>>>
>>> Thanks for your contributions and reviews, Ratandeep!
>>>
>>>
>>>
>>> rb
>>>
>>>
>>>
>>> --
>>>
>>> Ryan Blue
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Best Regards
>>>
>>>
>>>
>>>
>>> --
>>>
>>>
>>>李响 Xiang Li
>>>
>>> cellphone: +86-136-8113-8972
>>> e-mail: wate...@gmail.com
>>>
>>>


Re: Slack invite

2021-07-08 Thread Anjali Norwood
I had the same experience. Please let us know here in the dev group when this
is fixed; a bunch of us would like to join.

thanks,
Anjali.

On Thu, Jul 8, 2021 at 3:52 PM Puneet Zaroo 
wrote:

> I tried joining with a guest account and got an error message: "The email
> address must match one of the domains listed below. Please try another
> email."
>
> https://infra.apache.org/slack.html mentions that "Note: When Slack does
> an update, this URL occasionally stops functioning as it should. If the
> potential participant uses it and the submission form asks for an Apache
> email address, let Infra know so we can update the URL."
>
> Perhaps it is time to update the URL.
>
> thanks,
> - Puneet
>
>
> On Wed, Jul 7, 2021 at 12:24 PM Weston Pace  wrote:
>
>> I believe you can join with a guest account using this link:
>> https://s.apache.org/slack-invite
>>
>> More details: https://infra.apache.org/slack.html
>>
>> On Wed, Jul 7, 2021 at 9:15 AM Michiel De Smet  wrote:
>> >
>> > HI,
>> >
>> > I would like a slack invite, it seems only apache.org addresses are
>> allowed.
>> >
>> > Thanks,
>> >
>> > Michiel De Smet
>>
>


Re: Slack invite

2021-07-09 Thread Anjali Norwood
Thank you for the update, Weston. (I was added by a friend.)

regards,
Anjali.

On Thu, Jul 8, 2021 at 9:49 PM Weston Pace  wrote:

> INFRA-21214[1] was just updated.  The invite link is indefinitely disabled
> due to recent spam accounts. Guests will need to be invited by existing
> Slack users.  I'm not on the slack so I cannot help here but I'm sure
> someone can.
>
> [1] https://issues.apache.org/jira/browse/INFRA-21214
>
> On Thu, Jul 8, 2021, 3:15 PM Weston Pace  wrote:
>
>> I added a comment to https://issues.apache.org/jira/browse/INFRA-21214
>> and then I realized that someone has also opened
>> https://issues.apache.org/jira/browse/INFRA-22088 yesterday.  I'm
>> happy to watch the link and add a reply to this thread when it appears
>> to be working again.
>>
>> On Thu, Jul 8, 2021 at 1:45 PM Anjali Norwood
>>  wrote:
>> >
>> > I had the same experience. Please let us know here in dev group when
>> this is fixed .. a bunch of us would like to join.
>> >
>> > thanks,
>> > Anjali.
>> >
>> > On Thu, Jul 8, 2021 at 3:52 PM Puneet Zaroo 
>> wrote:
>> >>
>> >> I tried joining with a guest account and got an error message: "The
>> email address must match one of the domains listed below. Please try
>> another email."
>> >>
>> >> https://infra.apache.org/slack.html mentions that "Note: When Slack
>> does an update, this URL occasionally stops functioning as it should. If
>> the potential participant uses it and the submission form asks for an
>> Apache email address, let Infra know so we can update the URL."
>> >>
>> >> Perhaps it is time to update the URL.
>> >>
>> >> thanks,
>> >> - Puneet
>> >>
>> >>
>> >> On Wed, Jul 7, 2021 at 12:24 PM Weston Pace 
>> wrote:
>> >>>
>> >>> I believe you can join with a guest account using this link:
>> >>> https://s.apache.org/slack-invite
>> >>>
>> >>> More details: https://infra.apache.org/slack.html
>> >>>
>> >>> On Wed, Jul 7, 2021 at 9:15 AM Michiel De Smet 
>> wrote:
>> >>> >
>> >>> > HI,
>> >>> >
>> >>> > I would like a slack invite, it seems only apache.org addresses
>> are allowed.
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > Michiel De Smet
>>
>


Re: View definitions

2021-07-13 Thread Anjali Norwood
Hi Ryan,

John Zhuge and I are working on a proposal. We hope to send out a draft for
discussion soon.

Regards
Anjali

On Tue, Jul 13, 2021 at 2:29 AM Ryan Murray  wrote:

> Hi All,
>
> I was curious if anyone was working on bringing a proposal for Views to
> the community? I know extensive work at Netflix has already been done[1]
> and it looks like a proposal could be relatively easy to extract from
> there. If no one else is currently working on it I can volunteer to take
> the work on as we are hoping to be able to handle views via Iceberg
> relatively soon.
>
> Best,
> Ryan
>
> [1]
> https://github.com/Netflix/iceberg/tree/netflix-spark-2.4/view/src/main/java/com/netflix/bdp/view
>


Proposal: Support for views in Iceberg

2021-07-19 Thread Anjali Norwood
Hello,

John Zhuge and I would like to propose the following spec for storing view
metadata in Iceberg. The proposal has been implemented [1] and has been in
production at Netflix for over 15 months.

https://docs.google.com/document/d/1wQt57EWylluNFdnxVxaCkCSvnWlI8vVtwfnPjQ6Y7aw/edit?usp=sharing

[1]
https://github.com/Netflix/iceberg/tree/netflix-spark-2.4/view/src/main/java/com/netflix/bdp/view

Please let us know your thoughts by adding comments to the doc.

Thanks,
Anjali.


Re: Proposal: Support for views in Iceberg

2021-07-20 Thread Anjali Norwood
Thank you Ryan (M), Piotr and Vivekanand for the comments. I have and will
continue to address them in the doc.
Great to know about Trino views, Piotr!

Thanks to everybody who has offered help with the implementation. The spec as
proposed in the doc has been implemented and is in use at Netflix
(currently on Iceberg 0.9). Once we close the spec, we will rebase our code
to Iceberg 0.12, incorporate changes to the format and other feedback from
the community, and should be able to make this MVP implementation available
quickly as a PR.

A few areas that we have not yet worked on and would love for the community
to help are:
1. Time travel on views: Be able to access the view as of a version or time
2. History table: A system table implementation for $versions similar to
the $snapshots table in order to display the history of a view
3. Rollback to a version: A way to rollback a view to a previous version
4. Engine agnostic SQL: more below.

One comment that is worth a broader discussion is the dialect of the SQL
stored in the view metadata. The purpose of the spec is to provide a
storage format for view metadata and APIs to access that metadata. The
dialect of the SQL stored is an orthogonal question and is outside the
scope of this spec.

Nonetheless, it is an important concern, so compiling a few suggestions
that came up in the comments to continue the discussion:
1. Allow only ANSI-compliant SQL and anything that is truly common across
engines in the view definition (this is how currently Netflix uses these
'common' views across Spark and Trino)
2. Add a field to the view metadata to identify the dialect of the SQL.
This allows for any desired dialect, but no improved cross-engine
operability
3. Store AST produced by Calcite in the view metadata and translate back
and forth between engine-supported SQL and AST
4. Intermediate structured language of our own. (What additional
functionality does it provide over Calcite?)

Given that the view metadata is json, it is easily extendable to
incorporate any new fields needed to make the SQL truly compatible across
engines.

What do you think?

regards,
Anjali



On Tue, Jul 20, 2021 at 3:09 AM Piotr Findeisen 
wrote:

> Hi,
>
> FWIW, in Trino we just added Trino views support.
> https://github.com/trinodb/trino/pull/8540
> Of course, this is by no means usable by other query engines.
>
> Anjali, your document does not talk much about compatibility between query
> engines.
> How do you plan to address that?
>
> For example, I am familiar with Coral, and I appreciate its powers for
> dealing with legacy stuff like views defined by Hive.
> I treat it as a great technology supporting transitioning from a query
> engine to a better one.
> However, I would not base a design of some new system for storing
> cross-engine compatible views on that.
>
> Is there something else we can use?
> Maybe the view definition should use some intermediate structured language
> that's not SQL?
> For example, it could represent logical structure of operations in
> semantics manner.
> This would eliminate need for cross-engine compatible parsing and analysis.
>
> Best
> PF
>
>
>
> On Tue, Jul 20, 2021 at 11:04 AM Ryan Murray  wrote:
>
>> Thanks Anjali!
>>
>> I have left some comments on the document. I unfortunately have to miss
>> the community meetup tomorrow but would love to chat more/help w/
>> implementation.
>>
>> Best,
>> Ryan
>>
>> On Tue, Jul 20, 2021 at 7:42 AM Anjali Norwood
>>  wrote:
>>
>>> Hello,
>>>
>>> John Zhuge and I would like to propose the following spec for storing
>>> view metadata in Iceberg. The proposal has been implemented [1] and is in
>>> production at Netflix for over 15 months.
>>>
>>>
>>> https://docs.google.com/document/d/1wQt57EWylluNFdnxVxaCkCSvnWlI8vVtwfnPjQ6Y7aw/edit?usp=sharing
>>>
>>> [1]
>>> https://github.com/Netflix/iceberg/tree/netflix-spark-2.4/view/src/main/java/com/netflix/bdp/view
>>>
>>> Please let us know your thoughts by adding comments to the doc.
>>>
>>> Thanks,
>>> Anjali.
>>>
>>


Re: Proposal: Support for views in Iceberg

2021-07-29 Thread Anjali Norwood
amiliar enough with Calcite.
>>>> However, with new IR being focused on compatible representation, and
>>>> not being tied to anything are actually good things.
>>>> For example, we need to focus on JSON representation, but we don't need
>>>> to deal with tree traversal or anything, so the code for this could be
>>>> pretty simple.
>>>>
>>>> >  Allow only ANSI-compliant SQL and anything that is truly common
>>>> across engines in the view definition (this is how currently Netflix uses
>>>> these 'common' views across Spark and Trino)
>>>>
>>>> it's interesting. Anjali, do  you have means to enforce that, or is
>>>> this just a convention?
>>>>
>>>> What are the common building blocks (relational operations, constructs
>>>> and functions) that you found sufficient for expressing your views?
>>>> Being able to enumerate them could help validate various approaches
>>>> considered here, including feasibility of dedicated representation.
>>>>
>>>>
>>>> Best,
>>>> PF
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 22, 2021 at 2:28 PM Ryan Murray  wrote:
>>>>
>>>>> Hey Anjali,
>>>>>
>>>>> I am definitely happy to help with implementing 1-3 in your first list
>>>>> once the spec has been approved by the community. My hope is that the 
>>>>> final
>>>>> version of the view spec will make it easy to re-use existing 
>>>>> rollback/time
>>>>> travel/metadata etc functionalities.
>>>>>
>>>>> Regarding SQL dialects, my personal opinion is: enforcing
>>>>> ANSI-compliant SQL across all engines is hard and probably not
>>>>> desirable, while storing Calcite makes it hard for e.g. Python to use views.
>>>>> A
>>>>> project to make a cross language and cross engine IR for sql views and the
>>>>> relevant transpilers is imho outside the scope of this spec and probably
>>>>> deserving an apache project of its own. A smaller IR like Piotr suggested
>>>>> is possible but I feel it will likely quickly snowball into a larger
>>>>> project and slow down adoption of the view spec in iceberg. So I think the
>>>>> most reasonable way forward is to add a dialect field and a warning to
>>>>> engines that views are not (yet) cross compatible. This is at odds with 
>>>>> the
>>>>> original spirit of iceberg tables and I wonder how the border community
>>>>> feels about it? I would hope that we can make the view spec engine-free
>>>>> over time and eventually deprecate the dialect field.
>>>>>
>>>>> Best,
>>>>> Ryan
>>>>>
>>>>> PS if anyone is interested in collaborating on engine agnostic views
>>>>> please reach out. I am keen on exploring this topic.
>>>>>
>>>>> On Tue, Jul 20, 2021 at 10:51 PM Anjali Norwood
>>>>>  wrote:
>>>>>
>>>>>> Thank you Ryan (M), Piotr and Vivekanand for the comments. I have and
>>>>>> will continue to address them in the doc.
>>>>>> Great to know about Trino views, Piotr!
>>>>>>
>>>>>> Thanks to everybody who has offered help with implementation. The
>>>>>> spec as it is proposed in the doc has been implemented and is in use at
>>>>>> Netflix (currently on Iceberg 0.9). Once we close the spec, we will 
>>>>>> rebase
>>>>>> our code to Iceberg-0.12 and incorporate changes to format and
>>>>>> other feedback from the community and should be able to make this MVP
>>>>>> implementation available quickly as a PR.
>>>>>>
>>>>>> A few areas that we have not yet worked on and would love for the
>>>>>> community to help are:
>>>>>> 1. Time travel on views: Be able to access the view as of a version
>>>>>> or time
>>>>>> 2. History table: A system table implementation for $versions similar
>>>>>> to the $snapshots table in order to display the history of a view
>>>>>> 3. Rollback to a version: A way to rollback a view to a previous
>>>>>> version
>>>>>> 4. Engine agnostic SQL: more below.
>>>>>>
>>

Re: Iceberg disaster recovery and relative path sync-up

2021-08-12 Thread Anjali Norwood
Thanks for the summary, Yufei.
Sorry if this was already discussed; I missed the meeting yesterday.
Is there anything in the design that would prevent multiple roots from
being in different AWS regions? For disaster recovery in the case of an
entire AWS region being down or slow, is the metastore still a point of
failure, or can the metastore be stood up in a different region and select a
different root?

regards,
Anjali.

On Thu, Aug 12, 2021 at 11:35 AM Yufei Gu  wrote:

> Here is a summary of yesterday's community sync-up.
>
>
> Yufei gave a brief update on disaster recovery requirements and the
> current progress of relative path approach.
>
>
> Ryan: We all agreed that relative path is the way for disaster recovery.
>
>
> *Multiple roots for the relative path*
>
> Ryan proposed an idea to enable multiple roots for a table, basically, we
> can add a list of roots in table metadata, and use a selector to choose
> different roots when we move the table from one place to another. The
> selector reads a property to decide which root to use. The property could
> be either from catalog or the table metadata, which is yet to be decided.
>
>
> Here is an example I'd imagine:
>
>1. Root1: hdfs://nn:8020/path/to/the/table
>2. Root2: s3://bucket1/path/to/the/table
>3. Root3: s3://bucket2/path/to/the/table
>
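> To make the selector idea concrete, here is a purely hypothetical sketch;
> no such interface exists in Iceberg today, and the class names and the
> "root-id" property key are made up for illustration only:
>
>   import java.util.List;
>   import java.util.Map;
>
>   // Picks one of the table's declared roots based on a catalog or table property.
>   interface RootSelector {
>     String select(List<String> roots, Map<String, String> properties);
>   }
>
>   class PropertyRootSelector implements RootSelector {
>     @Override
>     public String select(List<String> roots, Map<String, String> properties) {
>       // e.g. "root-id" = "1" would pick Root2 from the list above
>       int index = Integer.parseInt(properties.getOrDefault("root-id", "0"));
>       return roots.get(index);
>     }
>   }
>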
> *Relative path use case*
>
> We brainstormed use cases for relative paths. Please let us know if there
> are any other use cases.
>
>1. Disaster Recovery
>2. Jack: AWS s3 bucket alias
>3. Ryan: fall-back use case. In case root1 doesn't work, the
>table falls back to root2, then root3. As Russell mentioned, it is
>challenging to do snapshot expiration and other table maintenance actions.
>
>
> *Timeline*
>
> In terms of timeline, relative path could be a feature in Spec V3, since
> Spec V1 and V2 assume absolute path in metadata.
>
>
> *Misc*
>
>1. Miao: How is the relative path compatible with the absolute path?
>2. How do we migrate an existing table? Build a tool for that.
>
> Please let us know if you have any ideas, questions, or concerns.
>
>
>
> Yufei
>


Re: Iceberg disaster recovery and relative path sync-up

2021-08-13 Thread Anjali Norwood
Perfect, thank you Yufei.

Regards
Anjali

On Thu, Aug 12, 2021 at 9:58 PM Yufei Gu  wrote:

> Hi Anjali,
>
> Inline...
> On Thu, Aug 12, 2021 at 5:31 PM Anjali Norwood
>  wrote:
>
>> Thanks for the summary Yufei.
>> Sorry, if this was already discussed, I missed the meeting yesterday.
>> Is there anything in the design that would prevent multiple roots from
>> being in different aws regions?
>>
> No. DR is the major use case of relative paths, if not the only one. So,
> it will support roots in different regions.
>
> For disaster recovery in the case of an entire aws region down or slow, is
>> metastore still a point of failure or can metastore be stood up in a
>> different region and could select a different root?
>>
> Normally, DR also requires a backup metastore, besides the storage(s3
> bucket). In that case, the backup metastore will be in a different region
> along with the table files. For example, the primary table is located in
> region A as well as its metastore, the backup table is located in region B
> as well as its metastore. The primary table root points to a path in region
> A, while backup table root points to a path in region B.
>
>
>> regards,
>> Anjali.
>>
>> On Thu, Aug 12, 2021 at 11:35 AM Yufei Gu  wrote:
>>
>>> Here is a summary of yesterday's community sync-up.
>>>
>>>
>>> Yufei gave a brief update on disaster recovery requirements and the
>>> current progress of relative path approach.
>>>
>>>
>>> Ryan: We all agreed that relative path is the way for disaster recovery.
>>>
>>>
>>> *Multiple roots for the relative path*
>>>
>>> Ryan proposed an idea to enable multiple roots for a table, basically,
>>> we can add a list of roots in table metadata, and use a selector to choose
>>> different roots when we move the table from one place to another. The
>>> selector reads a property to decide which root to use. The property could
>>> be either from catalog or the table metadata, which is yet to be decided.
>>>
>>>
>>> Here is an example I’d image:
>>>
>>>1. Root1: hdfs://nn:8020/path/to/the/table
>>>2. Root2: s3://bucket1/path/to/the/table
>>>3. Root3: s3://bucket2/path/to/the/table
>>>
>>> *Relative path use case*
>>>
>>> We brainstormed use cases for relative paths. Please let us know if
>>> there are any other use cases.
>>>
>>>1. Disaster Recovery
>>>2. Jack: AWS s3 bucket alias
>>>3. Ryan: fall-back use case. In case that the root1 doesn’t work,
>>>the table falls back to root2, then root3. As Russell mentioned, it is
>>>challenging to do snapshot expiration and other table maintenance 
>>> actions.
>>>
>>>
>>> *Timeline*
>>>
>>> In terms of timeline, relative path could be a feature in Spec V3, since
>>> Spec V1 and V2 assume absolute path in metadata.
>>>
>>>
>>> *Misc*
>>>
>>>1. Miao: How is the relative path compatible with the absolute path?
>>>2. How do we migrate an existing table? Build a tool for that.
>>>
>>> Please let us know if you have any ideas, questions, or concerns.
>>>
>>>
>>>
>>> Yufei
>>>
>>


Re: Identify watermark in the iceberg table properties

2021-08-16 Thread Anjali Norwood
+ Sundaram, as he may have some input.

regards,
Anjali.

On Fri, Aug 13, 2021 at 2:41 AM 1  wrote:

> Hi,all:
>
>   I need to embed the Iceberg table, which is regarded as a real-time
> table, into our workflow. That is to say, Flink writes data into the Iceberg
> table in real time, and I need something to indicate data completeness on
> the ingestion path so that downstream batch consumer jobs can be triggered
> when data is complete for a window (like hourly). Like
> https://github.com/apache/iceberg/pull/2109.
>
>   I see that Netflix has done related work on this. Is there any doc or
> patch for the implementation?
>
>   Thx
>
> liubo07199
>
>
> 
>
>


Re: Iceberg disaster recovery and relative path sync-up

2021-08-20 Thread Anjali Norwood
Hi,

This thread is about disaster recovery and relative paths, but I wanted to
ask an orthogonal but related question.
Do we see disaster recovery as the only (or main) use case for
multi-region?
Is a data residency requirement a use case for anybody? Is it possible to
shard an Iceberg table across regions? How is the location managed in that
case?

thanks,
Anjali.

On Fri, Aug 20, 2021 at 12:20 AM Peter Vary 
wrote:

> Sadly, I have missed the meeting :(
>
> Quick question:
> Was table rename / location change discussed for tables with relative
> paths?
>
> AFAIK when a table rename happens then we do not move old data / metadata
> files, we just change the root location of the new data / metadata files.
> If I am correct about this then we might need to handle this differently
> for tables with relative paths.
>
> Thanks, Peter
>
> On Fri, 13 Aug 2021, 15:12 Anjali Norwood, 
> wrote:
>
>> Perfect, thank you Yufei.
>>
>> Regards
>> Anjali
>>
>> On Thu, Aug 12, 2021 at 9:58 PM Yufei Gu  wrote:
>>
>>> Hi Anjali,
>>>
>>> Inline...
>>> On Thu, Aug 12, 2021 at 5:31 PM Anjali Norwood
>>>  wrote:
>>>
>>>> Thanks for the summary Yufei.
>>>> Sorry, if this was already discussed, I missed the meeting yesterday.
>>>> Is there anything in the design that would prevent multiple roots from
>>>> being in different aws regions?
>>>>
>>> No. DR is the major use case of relative paths, if not the only one. So,
>>> it will support roots in different regions.
>>>
>>> For disaster recovery in the case of an entire aws region down or slow,
>>>> is metastore still a point of failure or can metastore be stood up in a
>>>> different region and could select a different root?
>>>>
>>> Normally, DR also requires a backup metastore, besides the storage(s3
>>> bucket). In that case, the backup metastore will be in a different region
>>> along with the table files. For example, the primary table is located in
>>> region A as well as its metastore, the backup table is located in region B
>>> as well as its metastore. The primary table root points to a path in region
>>> A, while backup table root points to a path in region B.
>>>
>>>
>>>> regards,
>>>> Anjali.
>>>>
>>>> On Thu, Aug 12, 2021 at 11:35 AM Yufei Gu  wrote:
>>>>
>>>>> Here is a summary of yesterday's community sync-up.
>>>>>
>>>>>
>>>>> Yufei gave a brief update on disaster recovery requirements and the
>>>>> current progress of relative path approach.
>>>>>
>>>>>
>>>>> Ryan: We all agreed that relative path is the way for disaster
>>>>> recovery.
>>>>>
>>>>>
>>>>> *Multiple roots for the relative path*
>>>>>
>>>>> Ryan proposed an idea to enable multiple roots for a table, basically,
>>>>> we can add a list of roots in table metadata, and use a selector to choose
>>>>> different roots when we move the table from one place to another. The
>>>>> selector reads a property to decide which root to use. The property could
>>>>> be either from catalog or the table metadata, which is yet to be decided.
>>>>>
>>>>>
>>>>> Here is an example I’d image:
>>>>>
>>>>>1. Root1: hdfs://nn:8020/path/to/the/table
>>>>>2. Root2: s3://bucket1/path/to/the/table
>>>>>3. Root3: s3://bucket2/path/to/the/table
>>>>>
>>>>> *Relative path use case*
>>>>>
>>>>> We brainstormed use cases for relative paths. Please let us know if
>>>>> there are any other use cases.
>>>>>
>>>>>1. Disaster Recovery
>>>>>2. Jack: AWS s3 bucket alias
>>>>>3. Ryan: fall-back use case. In case that the root1 doesn’t work,
>>>>>the table falls back to root2, then root3. As Russell mentioned, it is
>>>>>challenging to do snapshot expiration and other table maintenance 
>>>>> actions.
>>>>>
>>>>>
>>>>> *Timeline*
>>>>>
>>>>> In terms of timeline, relative path could be a feature in Spec V3,
>>>>> since Spec V1 and V2 assume absolute path in metadata.
>>>>>
>>>>>
>>>>> *Misc*
>>>>>
>>>>>1. Miao: How is the relative path compatible with the absolute
>>>>>path?
>>>>>2. How do we migrate an existing table? Build a tool for that.
>>>>>
>>>>> Please let us know if you have any ideas, questions, or concerns.
>>>>>
>>>>>
>>>>>
>>>>> Yufei
>>>>>
>>>>


Re: Iceberg disaster recovery and relative path sync-up

2021-08-23 Thread Anjali Norwood
Hi Ryan, All,

*"The more I think about this, the more I like the solution to add multiple
table roots to metadata, rather than removing table roots. Adding a way to
plug in a root selector makes a lot of sense to me and it ensures that the
metadata is complete (table location is set in metadata) and that multiple
locations can be used. Are there any objections or arguments against doing
it that way?"*

In the context of your comment on multiple locations above, I am thinking
about the following scenarios:
1) Disaster recovery or low-latency use case where clients connect to the
region geographically closest to them: In this case, multiple table roots
represent a copy of the table, the copies may or may not be in sync. (Most
likely active-active replication would be set up in this case and the
copies are near-identical). A root level selector works/makes sense.
2) Data residency requirement: data must not leave a country/region. In
this case, federation of data from multiple roots constitutes the entirety
of the table.
3) One can also imagine combinations of 1 and 2 above where some locations
need to be federated and some locations have data replicated from other
locations.

Curious how the 2nd and 3rd scenarios would be supported with this design.

regards,
Anjali.



On Sun, Aug 22, 2021 at 11:22 PM Peter Vary 
wrote:

>
> @Ryan: If I understand correctly, currently there is a possibility to
> change the root location of the table, and it will not change/move the old
> data/metadata files created before the change, only the new data/metadata
> files will be created in the new location.
>
> Are we planning to make sure that the tables with relative paths will
> always contain every data/metadata file in single root folder?
>
> Here is a few scenarios which I am thinking about:
> 1. Table T1 is created with relative path in HadoopCatalog/HiveCatalog
> where the root location is L1, and this location is generated from the
> TableIdentifier (at least in creation time)
> 2. Data inserted to the table, so data files are created under L1, and the
> metadata files contain R1 relative path to the current L1.
> 3a. Table location is updated to L2 (Hive: ALTER TABLE T1 SET LOCATION L2)
> 3b. Table renamed to T2 (Hive: ALTER TABLE T1 RENAME TO T2) - Table
> location is updated for Hive tables if the old location was the default
> location.
> 4. When we try to read this table we should read the old data/metadata
> files as well.
>
> So in both cases we have to move the old data/metadata files around like
> Hive does for the native tables, and for the tables with relative paths we
> do not have to change the metadata other than the root path? Will we do the
> same thing with other engines as well?
>
> Thanks,
> Peter
>
> On Mon, 23 Aug 2021, 06:38 Yufei Gu,  wrote:
>
>> Steven, here is my understanding. It depends on whether you want to move
>> the data. In the DR case, we do move the data, we expect data to be
>> identical from time to time, but not always be. In the case of S3 aliases,
>> different roots actually point to the same location, there is no data move,
>> and data is identical for sure.
>>
>> On Sun, Aug 22, 2021 at 8:19 PM Steven Wu  wrote:
>>
>>> For the multiple table roots, do we expect or ensure that the data are
>>> identical across the different roots? or this is best-effort background
>>> synchronization across the different roots?
>>>
>>> On Sun, Aug 22, 2021 at 11:53 AM Ryan Blue  wrote:
>>>
>>>> Peter, I think that this feature would be useful when moving tables
>>>> between root locations or when you want to maintain multiple root
>>>> locations. Renames are orthogonal because a rename doesn't change the table
>>>> location. You may want to move the table after a rename, and this would
>>>> help in that case. But actually moving data is optional. That's why we put
>>>> the table location in metadata.
>>>>
>>>> Anjali, DR is a big use case, but we also talked about directing
>>>> accesses through other URLs, like S3 access points, table migration (like
>>>> the rename case), and background data migration (e.g. lifting files between
>>>> S3 regions). There are a few uses for it.
>>>>
>>>> The more I think about this, the more I like the solution to add
>>>> multiple table roots to metadata, rather than removing table roots. Adding
>>>> a way to plug in a root selector makes a lot of sense to me and it ensures
>>>> that the metadata is complete (table location is set in metadata) and that
>>>> multiple locations can be used. Are there any objections or arguments
>>>> against doing it that way?

Re: Proposal: Support for views in Iceberg

2021-08-25 Thread Anjali Norwood
mpatible approach as expressive as full power of
>>> SQL, some views that are possible to create in v1 will not be possible to
>>> create in v2.
>>> Thus, if v1  is "some SQL" and v2 is "something awesomely compatible",
>>> we may not be able to roll it out.
>>>
>>> > the convention of common SQL has been working for a majority of users.
>>> SQL features commonly used are column projections, simple filter
>>> application, joins, grouping and common aggregate and scalar function. A
>>> few users occasionally would like to use Trino or Spark specific functions
>>> but are sometimes able to find a way to use a function that is common to
>>> both the engines.
>>>
>>>
>>> It's an awesome summary of what constructs are necessary to be able to
>>> define useful views, while also keeping them portable.
>>>
>>> To be able to express column projections, simple filter application,
>>> joins, grouping and common aggregate and scalar function in a structured
>>> IR, how much effort do you think would be required?
>>> We didn't really talk about the downsides of a structured approach, other
>>> than that it looks complex.
>>> If we indeed estimate it as a multi-year effort, I wouldn't argue for
>>> that. Maybe I was overly optimistic, though.
>>>
>>>
>>> As Jack mentioned, for engine-specific approach that's not supposed to
>>> be consumed by multiple engines, we may be better served with approach
>>> that's outside of Iceberg spec, like
>>> https://github.com/trinodb/trino/pull/8540.
>>>
>>>
>>> Best,
>>> PF
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jul 29, 2021 at 12:33 PM Anjali Norwood
>>>  wrote:
>>>
>>>> Hi,
>>>>
>>>> Thank you for all the comments. I will try to address them all here
>>>> together.
>>>>
>>>>
>>>> - @all Cross engine compatibility of view definition: Multiple options
>>>>   such as engine agnostic SQL or IR of some form have been mentioned. We
>>>>   can all agree that all of these options are non-trivial to
>>>>   design/implement (perhaps a multi-year effort based on the option
>>>>   chosen) and merit further discussion. I would like to suggest that we
>>>>   continue this discussion but target this work for the future (v2?). In
>>>>   v1, we can add an optional dialect field and an optional
>>>>   expanded/resolved SQL field that can be interpreted by engines as they
>>>>   see fit. V1 can unlock many use cases where the views are either
>>>>   accessed by a single engine, or multi-engine use cases where a (common)
>>>>   subset of SQL is supported. This proposal allows for desirable features
>>>>   such as versioning of views and a common format of storing view
>>>>   metadata, while allowing extensibility in the future. *Does anyone feel
>>>>   strongly otherwise?* (A rough sketch of such a view record follows this
>>>>   list.)
>>>> - @Piotr As for common views at Netflix, the restrictions on SQL are not
>>>>   enforced, but are advised as best practices. The convention of common
>>>>   SQL has been working for a majority of users. SQL features commonly
>>>>   used are column projections, simple filter application, joins, grouping
>>>>   and common aggregate and scalar function. A few users occasionally
>>>>   would like to use Trino or Spark specific functions but are sometimes
>>>>   able to find a way to use a function that is common to both the
>>>>   engines.
>>>> - @Jacques and @Jack Iceberg data types are engine agnostic and hence
>>>>   were picked for storing view schema. Thinking further, the schema field
>>>>   should be made 'optional', since not all engines require it. (e.g.
>>>>   Spark does not need it and Trino uses it only for validation.)
>>>> - @Jacques Table references in the views can be arbitrary objects such as
>>>>   tables from other catalogs or elasticsearch tables etc. I will clarify
>>>>   it in the spec.
>>>>
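>>>> As a point of reference only (the field names below are purely
>>>> illustrative and not the proposed spec), a v1 view version record could
>>>> look roughly like:
>>>>
>>>> // Illustrative sketch only, not the spec.
>>>> class ViewVersion {
>>>>   int versionId;        // supports versioning of view definitions
>>>>   String sql;           // the view text as written
>>>>   String dialect;       // optional: e.g. "spark" or "trino"
>>>>   String expandedSql;   // optional: fully expanded/resolved view text
>>>>   String schemaJson;    // optional: Iceberg schema, e.g. for validation
>>>> }
>>>>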
>>>> I will work on incorporating all the comments in the spec and make the
>>>> next revision available for review soon.
>>>>
>>>> Regards,

Re: Proposal: Support for views in Iceberg

2021-09-28 Thread Anjali Norwood
Hi All,

Please see the spec in markdown format at the PR here, to facilitate
adding/responding to comments. Please review.

thanks,
Anjali

On Tue, Sep 7, 2021 at 9:31 PM Jack Ye  wrote:

> Hi everyone,
>
> I have been thinking about the view support over the weekend, and I
> realized there is a conflict: Trino today already claims to support
> Iceberg views through the Hive metastore.
>
> I believe we need to figure out a path forward around this issue before
> voting to pass the current proposal, to avoid confusion for end users. I
> have summarized the issue here with a few different potential solutions:
>
>
> https://docs.google.com/document/d/1uupI7JJHEZIkHufo7sU4Enpwgg-ODCVBE6ocFUVD9oQ/edit?usp=sharing
>
> Please let me know what you think.
>
> Best,
> Jack Ye
>
> On Thu, Aug 26, 2021 at 3:29 PM Phillip Cloud  wrote:
>
>> On Thu, Aug 26, 2021 at 6:07 PM Jacques Nadeau 
>> wrote:
>>
>>>
>>> On Thu, Aug 26, 2021 at 2:44 PM Ryan Blue  wrote:
>>>
 Would a physical plan be portable for the purpose of an engine-agnostic
 view?

>>>
>>> My goal is it would be. There may be optional "hints" that a particular
>>> engine could leverage and others wouldn't but I think the goal should be
>>> that the IR is entirely engine-agnostic. Even in the Arrow project proper,
>>> there are really two independent heavy-weight engines that have their own
>>> capabilities and trajectories (c++ vs rust).
>>>
>>>
 Physical plan details seem specific to an engine to me, but maybe I'm
 thinking too much about how Spark is implemented. My inclination would be
 to accept only logical IR, which could just mean accepting a subset of the
 standard.

>>>
>>> I think it is very likely that different consumers will only support a
>>> subset of plans. That being said, I'm not sure what you're specifically
>>> trying to mitigate or avoid. I'd be inclined to simply allow the full
>>> breadth of IR within Iceberg. If it is well specified, an engine can either
>>> choose to execute or not (same as the proposal wrt SQL syntax or if a
>>> function is missing on an engine). The engine may even have internal
>>> rewrites if it likes doing things a different way than what is requested.
>>>
>>
>> I also believe that consumers will not be expected to support all plans.
>> It will depend on the consumer, but many of the instantiations of Read/Write
>> relations won't be executable for many consumers, for example.
>>
>>
>>>
>>>
 The document that Micah linked to is interesting, but I'm not sure that
 our goals are aligned.

>>>
>>> I think there is much commonality here and I'd argue it would be best to
>>> really try to see if a unified set of goals works well. I think Arrow IR is
>>> young enough that it can still be shaped/adapted. It may be that there
>>> should be some give or take on each side. It's possible that the goals are
>>> too far apart to unify but my gut is that they are close enough that we
>>> should try since it would be a great force multiplier.
>>>
>>>
 For one thing, it seems to make assumptions about the IR being used for
 Arrow data (at least in Wes' proposal), when I think that it may be easier
 to be agnostic to vectorization.

>>>
>>> Other than using the Arrow schema/types, I'm not at all convinced that
>>> the IR should be Arrow centric. I've actually argued to some that Arrow IR
>>> should be independent of Arrow to be its best self. Let's try to review it
>>> and see if/where we can avoid a tight coupling between plans and arrow
>>> specific concepts.
>>>
>>
>> Just to echo Jacques's comments here, the only thing that is Arrow
>> specific right now is the use of its type system. Literals, for example,
>> are encoded entirely in flatbuffers.
>>
>> Would love feedback on the current PR [1]. I'm looking to merge the first
>> iteration soonish, so please review at your earliest convenience.
>>
>>
>>>
>>>
 It also delegates forward/backward compatibility to flatbuffers, when I
 think compatibility should be part of the semantics and not delegated to
 serialization. For example, if I have Join("inner", a.id, b.id) and I
 evolve that to allow additional predicates Join("inner", a.id, b.id,
 a.x < b.y) then just because I can deserialize it doesn't mean it is
 compatible.
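
 Purely to illustrate that point (hypothetical classes, not Arrow IR or
 Iceberg code), a consumer of the older shape can deserialize the evolved
 node, but the decision about whether it can execute it has to be semantic:

 import java.util.List;

 // Evolved join node: the extra predicate list was added in a later version.
 class JoinNode {
   String joinType;               // "inner"
   String leftKey;                // "a.id"
   String rightKey;               // "b.id"
   List<String> extraPredicates;  // e.g. ["a.x < b.y"], empty in older plans
 }

 // A consumer that only implements key-equality joins: deserialization
 // succeeds, but executing while ignoring extraPredicates would be wrong.
 class EquiJoinOnlyConsumer {
   boolean canExecute(JoinNode join) {
     return join.extraPredicates.isEmpty();
   }
 }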

>>>
>>> I don't think that flatbuffers alone can solve all compatibility
>>> problems. It can solve some and I'd expect that implementation libraries
>>> will have to solve others. Would love to hear if others disagree (and think
>>> flatbuffers can solve everything wrt compatibility).
>>>
>>
>> I agree, I think you need both to achieve sane versioning. The version
>> needs to be shipped along with the IR, and libraries need to be able to deal
>> with the different versions. I could be wrong, but I think it probably
>> makes more sense to start versioning the IR once the dust has settled a bit.
>>
>>
>>>
>>> J
>>>
>>
>> [1]: https