> My only concern is doing this only for Flink 1.20. If this is only a single default value change, I'm fine with it.
It is one config change plus Javadoc and @deprecated changes, so it is very minimal. I don't see the benefit outweighing the state incompatibility of the switch if we also make the change for Flink 1.18 and 1.19 in the Iceberg 1.7 release. Hence, I would suggest making the change only for Flink 1.20.

On Mon, Aug 12, 2024 at 4:38 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:

> Thanks Steven for driving this!
>
> I'm very much for deprecating FlinkSource in favor of IcebergSource.
> My only concern is doing this only for Flink 1.20. If this is only a
> single default value change, I'm fine with it. OTOH having bigger
> differences between the sources of the different Flink versions would
> cause more maintenance headache in the future for a minimal gain.
>
> I understand that Flink "natively" doesn't guarantee state compatibility
> between major/minor versions. If needed, I suggest that we mirror this
> with the Iceberg connector, and use documentation to highlight the change
> for the users between Iceberg 1.6 and Iceberg 1.7.
>
> Thanks,
> Peter
>
> Fokko Driesprong <fo...@apache.org> wrote (on Mon, Aug 12, 2024, 10:12):
>
>> Hey Steven,
>>
>> That sounds very exciting! I'm not a heavy Flink user, but I don't see
>> any issues enabling it on Flink 1.20. We should make it explicit in the
>> changelog, and if possible give some hints on how to drain the Flink jobs.
>>
>> Kind regards,
>> Fokko
>>
>> On Mon, Aug 12, 2024 at 04:57, Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> *What*
>>>
>>> In the next Iceberg 1.7 release with Flink 1.20 support [1], I
>>> am proposing to make the following changes for *Flink 1.20 only*.
>>>
>>> 1. Mark the old `FlinkSource` as deprecated and redirect users to the
>>> FLIP-27 `IcebergSource` in the Javadoc.
>>>
>>> 2. Make the FLIP-27 source the default for Flink SQL. Users can still
>>> opt back to the old source via config if needed.
>>> Due to the change of source implementation and checkpoint state, users
>>> won't be able to restore from a checkpoint/savepoint when upgrading to
>>> Flink 1.20 and Iceberg 1.7. As Flink doesn't guarantee state
>>> compatibility for new major-minor Flink version upgrades, e.g. from
>>> 1.19 to 1.20 [12], this should be acceptable to Flink SQL users. We
>>> should clearly call out the change and state incompatibility in the
>>> release notes.
>>>
>>> *Why*
>>>
>>> FLIP-27 is the new source interface introduced by Flink in early 2021.
>>> The new FLIP-27 `IcebergSource` implementation [2] was added to Iceberg
>>> around mid-2022. It was initially added as @Experimental and requires a
>>> code change to switch to the new API. For Flink SQL jobs, the default is
>>> still the old `FlinkSource` implementation, and a config change is
>>> required to opt in to the FLIP-27 `IcebergSource`.
>>>
>>> It has been two years since the initial introduction of the FLIP-27
>>> source implementation in Iceberg. Now is probably a good time to switch
>>> the default to the FLIP-27 source.
>>>
>>> 1. The community has continued to improve the FLIP-27 source, with a
>>> JSON serializer for FileScanTask [3], split discovery throttling [4],
>>> watermark alignment [5], split enumerator monitoring metrics [6],
>>> metadata table reading [8], and speculative execution [9]. Those
>>> improvements are not available in the old source implementation.
>>> 2. We have recently closed the remaining gaps, like limit pushdown [10]
>>> and inferring source parallelism [11] for batch execution, to achieve
>>> feature parity between the old source and the new FLIP-27 source.
>>> 3. The FLIP-27 source has been used by many users in production
>>> environments for almost two years now. It has been battle tested.
>>> 4. The old SourceFunction interface has been marked as deprecated since
>>> Flink 1.18 in Aug 2023 [7].
>>>
>>> *References*
>>> [1] https://github.com/apache/iceberg/pull/10881
>>> [2] https://github.com/apache/iceberg/projects/23
>>> [3] https://github.com/apache/iceberg/issues/1698
>>> [4] https://github.com/apache/iceberg/pull/6299
>>> [5] https://github.com/apache/iceberg/pull/8553
>>> [6] https://github.com/apache/iceberg/pull/9524
>>> [7] https://issues.apache.org/jira/browse/FLINK-28046
>>> [8] https://github.com/apache/iceberg/pull/6222
>>> [9] https://github.com/apache/iceberg/pull/10548
>>> [10] https://github.com/apache/iceberg/pull/10748
>>> [11] https://github.com/apache/iceberg/pull/10832
>>> [12] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/upgrading/#table-api--sql