Hey Wing Yew,

I am planning to focus on this after we get partition stats readers/writers into main. I actually have ideas on how to implement changelog scans for V2 tables efficiently.

- Anton

On Mon, Feb 10, 2025 at 21:11, Wing Yew Poon <wyp...@cloudera.com.invalid> wrote:

> Hi Anton,
>
> Thank you for looking at https://github.com/apache/iceberg/pull/10935. I
> think we are in agreement on the behavior, but you have concerns about the
> performance of the scan, which I agree are justified. It has been some
> months now. Do you have any suggestions for improving the performance? How
> can we move forward with this? Can we get a working implementation in first
> and optimize it later?
>
> - Wing Yew
>
>
> On Sat, Oct 5, 2024 at 10:53 PM Anton Okolnychyi <aokolnyc...@gmail.com>
> wrote:
>
>> I will take a look next week!
>>
>> On Saturday, October 5, 2024, Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>
>>> Hi Team,
>>>
>>> Gentle reminder that the PR for the changelog planning (
>>> https://github.com/apache/iceberg/pull/10935) is still waiting for
>>> expert reviews.
>>>
>>> Thanks, Peter
>>>
>>> On Tue, Oct 1, 2024, 06:46 Yufei Gu <flyrain...@gmail.com> wrote:
>>>
>>>> Thanks, Peter and Wing Yew Poon, for tackling these! I’ve been eager to
>>>> review, but this week has been hectic. I plan to check out PR #10935 next
>>>> week, though I’d be happy if someone beats me to it.
>>>>
>>>> Yufei
>>>>
>>>>
>>>> On Mon, Sep 30, 2024 at 3:02 AM Péter Váry <peter.vary.apa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> The Changelog scan Java API interfaces were created a long time ago by
>>>>> Anton, but they have not been implemented yet. There is a
>>>>> Spark-specific SQL implementation of the feature, but the feature is
>>>>> not available through the Java API.
>>>>>
>>>>> The Flink CDC streaming read is one of the frequently requested
>>>>> features [1] [2]. Flink needs the Java API to provide streaming reads
>>>>> for tables with deletes.
>>>>>
>>>>> Wing Yew Poon implemented the Java API [3]. I did my best reviewing
>>>>> the PR, but I am not an expert on this part of the code. I would like to
>>>>> ask some of the planning experts (or anyone else for that matter) to take
>>>>> a look and validate it too.
>>>>>
>>>>> Thanks,
>>>>> Peter
>>>>>
>>>>> [1] - https://github.com/apache/iceberg/issues/5623
>>>>> [2] -
>>>>> https://github.com/apache/iceberg/issues/5803#issuecomment-1259759074
>>>>> [3] - https://github.com/apache/iceberg/pull/10935