Cool, +1 from me then.

Steven Wu <stevenz...@gmail.com> wrote (on Mon, Aug 12, 2024, 17:56):
> > My only concern is doing this only for Flink 1.20. If this is only a
> > single default value change, I'm fine with it.
>
> It is one config change plus Javadoc and @deprecated changes. It is very
> minimal.
>
> I don't see the benefit outweighing the state incompatibility of the
> switch if we also make the change for Flink 1.18 and 1.19 in the Iceberg
> 1.7 release. Hence, I would suggest only making the change for Flink 1.20.
>
> On Mon, Aug 12, 2024 at 4:38 AM Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> Thanks Steven for driving this!
>>
>> I'm very much for deprecating FlinkSource for IcebergSource.
>> My only concern is doing this only for Flink 1.20. If this is only a
>> single default value change, I'm fine with it. OTOH, having bigger
>> differences between the sources for the different Flink versions would
>> cause more maintenance headaches in the future for a minimal gain.
>>
>> I understand that Flink "natively" doesn't guarantee state compatibility
>> between major/minor versions. If needed, I suggest that we mirror this
>> with the Iceberg connector, and use documentation to highlight the change
>> for the users between Iceberg 1.6 and Iceberg 1.7.
>>
>> Thanks,
>> Peter
>>
>> Fokko Driesprong <fo...@apache.org> wrote (on Mon, Aug 12, 2024, 10:12):
>>
>>> Hey Steven,
>>>
>>> That sounds very exciting! I'm not a heavy Flink user, but I don't see
>>> any issues enabling it on Flink 1.20. We should make it explicit in the
>>> changelog, and if possible give some hints on how to drain the Flink
>>> jobs.
>>>
>>> Kind regards,
>>> Fokko
>>>
>>> On Mon, Aug 12, 2024 at 04:57, Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>> *What*
>>>>
>>>> In the next Iceberg 1.7 release with Flink 1.20 support [1], I
>>>> am proposing to make the following changes for *Flink 1.20 only*.
>>>>
>>>> 1. Mark the old `FlinkSource` as deprecated and redirect users to the
>>>> FLIP-27 `IcebergSource` in the Javadoc.
>>>>
>>>> 2. Make the FLIP-27 source the default for Flink SQL. Users can still
>>>> opt back to the old source via config if needed. Due to the change of
>>>> source implementation and checkpoint state, users won't be able to
>>>> restore from a checkpoint/savepoint when upgrading to Flink 1.20 and
>>>> Iceberg 1.7. As Flink doesn't guarantee state compatibility for new
>>>> major-minor Flink version upgrades, e.g. from 1.19 to 1.20 [12], this
>>>> should be acceptable to Flink SQL users. We should clearly call out the
>>>> change and state incompatibility in the release notes.
>>>>
>>>> *Why*
>>>>
>>>> FLIP-27 is the new source interface introduced by Flink in early 2021.
>>>> The new FLIP-27 `IcebergSource` implementation [2] was added to Iceberg
>>>> around mid-2022. It was initially added as @Experimental and requires
>>>> a code change to switch to the new API. For Flink SQL jobs, the default
>>>> is still the old `FlinkSource` implementation, and a config change is
>>>> required to opt in to the FLIP-27 `IcebergSource`.
>>>>
>>>> It has been two years since the initial introduction of the FLIP-27
>>>> source implementation in Iceberg. Now is probably a good time to switch
>>>> the default to the FLIP-27 source.
>>>>
>>>> 1. The community has continued to improve the FLIP-27 source, with a
>>>> JSON serializer for FileScanTask [3], split discovery throttling [4],
>>>> watermark alignment [5], split enumerator monitoring metrics [6],
>>>> metadata table reading [8], and speculative execution [9]. Those
>>>> improvements are not available in the old source implementation.
>>>> 2. We have recently closed the remaining gaps, like limit pushdown [10]
>>>> and inferring source parallelism [11] for batch execution, to achieve
>>>> feature parity between the old source and the new FLIP-27 source.
>>>> 3. The FLIP-27 source has been used by many users in production
>>>> environments for almost two years now. It has been battle tested.
>>>> 4. The old SourceFunction interface has been marked as deprecated since
>>>> Flink 1.18 in Aug 2023 [7].
>>>>
>>>> *References*
>>>> [1] https://github.com/apache/iceberg/pull/10881
>>>> [2] https://github.com/apache/iceberg/projects/23
>>>> [3] https://github.com/apache/iceberg/issues/1698
>>>> [4] https://github.com/apache/iceberg/pull/6299
>>>> [5] https://github.com/apache/iceberg/pull/8553
>>>> [6] https://github.com/apache/iceberg/pull/9524
>>>> [7] https://issues.apache.org/jira/browse/FLINK-28046
>>>> [8] https://github.com/apache/iceberg/pull/6222
>>>> [9] https://github.com/apache/iceberg/pull/10548
>>>> [10] https://github.com/apache/iceberg/pull/10748
>>>> [11] https://github.com/apache/iceberg/pull/10832
>>>> [12] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/upgrading/#table-api--sql
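
For readers skimming the thread, here is a rough sketch of the "default for Flink 1.20 only, opt back via config" behavior proposed above. It is not the Iceberg implementation: the class and both helper methods are hypothetical, the version strings are illustrative, and the option key `table.exec.iceberg.use-flip27-source` is the SQL toggle as I understand it from the Iceberg Flink docs, so please verify it against the release you run.

```java
import java.util.Map;

public class SourceDefaultSketch {
    // Option key believed to toggle the FLIP-27 source for Flink SQL reads
    // (assumption: check the Iceberg Flink configuration docs for your version).
    static final String USE_FLIP27_KEY = "table.exec.iceberg.use-flip27-source";

    // Hypothetical helper: under the proposal, only the Flink 1.20 module
    // flips the default to the FLIP-27 source; 1.18 and 1.19 keep the old one.
    static boolean defaultUseFlip27(String flinkMinorVersion) {
        return "1.20".equals(flinkMinorVersion);
    }

    // Hypothetical helper: an explicit user setting always wins over the
    // version-specific default, which is how users "opt back" to FlinkSource.
    static String resolveSource(String flinkMinorVersion, Map<String, String> conf) {
        String explicit = conf.get(USE_FLIP27_KEY);
        boolean useFlip27 = explicit != null
            ? Boolean.parseBoolean(explicit)
            : defaultUseFlip27(flinkMinorVersion);
        return useFlip27 ? "IcebergSource" : "FlinkSource";
    }

    public static void main(String[] args) {
        System.out.println(resolveSource("1.19", Map.of()));                        // FlinkSource
        System.out.println(resolveSource("1.20", Map.of()));                        // IcebergSource
        System.out.println(resolveSource("1.20", Map.of(USE_FLIP27_KEY, "false"))); // FlinkSource
    }
}
```

Either way, switching the source implementation for a running job changes the checkpoint state layout, so the job has to be drained and restarted without restoring from the old checkpoint/savepoint, as noted in the proposal.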