Thanks Peter for the input. I understand the valid concern of potential large diff if we only delete the `FlinkSource` in the flink 2.0 module. It does make the diff across flink versions's modules unusable. For backporting, I thought you had a process that doesn't depend on the diff of directories (apply patches or sth). Anyway, I am open to delete `FlinkSource` across all versions for the concern that Peter brought up.
On Wed, Mar 12, 2025 at 11:55 PM Péter Váry <peter.vary.apa...@gmail.com> wrote: > I agree with the general sentiment, that we should remove FlinkSource, but > only deprecate FlinkSink. > > I understand Steven's approach to remove the deprecated classes only from > the latest Flink version, but keep them for the old versions, but I have > some concerns with this. > > My main concern is that if we remove all the code related to the old > FlinkSource from the latent Flink version (production code and test cleanup > too), then the different Flink versions will have very different code > bases. This will make the backports to older Flink versions painful until > the full deprecation. > > I see the following possibilities: > 1. Bite the bullet, and accept the backport complexity increase > 2. Remove only the main class, and remove other classes only at full > deprecation > 3. After surveying the userbase deprecate the code for every versions > > Since the FlinkSource is deprecated for a long time, I would aim for > conserving development resources and chose option 3, unless there are > objections from the userbase. > > On Wed, Mar 12, 2025, 18:45 Rodrigo Meneses <rmene...@gmail.com> wrote: > >> Once we deprecate FlinkSink, we should also upgrade IcebergSink from >> `Experimental` to `PublicEvolving` >> >> >> On Wed, Mar 12, 2025 at 10:01 AM Steven Wu <stevenz...@gmail.com> wrote: >> >>> Max, thanks for confirming the Flink 2.0 behavior. With the latest >>> information, here are my suggestions >>> >>> 1) I am fine with deleting the old `FlinkSource` in the upcoming flink >>> 2.0 module (but keeping them in flink 1.x modules). As discussed in a >>> previous thread, the new FLIP-27 source has been baked by users for 2 >>> years. We already switched SQL to the new FLIP-27 source in the Iceberg 1.7 >>> release. So far so good. >>> >>> 2) for FlinkSink, I would prefer to keep it around with flink 2.0 >>> module. Here are my reasons. >>> - new sink is still relatively young. Give it more time to bake and >>> mature. >>> - Flink sink (for streaming ingestion) is probably a lot more widely >>> used than Flink source (for batch or streaming read). Hence, it is valuable >>> to be more conservative in the deprecation plan here. >>> - we can stop adding new features to the old FlinkSink and only >>> contribute to the new IcebergSink. that should reduce the maintenance >>> burden. >>> - we can mark the FinkSink as deprecated for all flink versions once the >>> parity is achieved. >>> >>> >>> On Wed, Mar 12, 2025 at 7:09 AM Maximilian Michels <m...@apache.org> >>> wrote: >>> >>>> Hi JB, >>>> >>>> V3 is another topic, but it would be good to start looking into that. >>>> >>>> Hey Steven, >>>> >>>> Flink 2.0 actually did not remove the old interfaces, which was not >>>> what I expected when I started this conversation. Turns out, old >>>> interfaces have been moved to ".legacy" packages. I'm in the process >>>> of verifying they will continue to work as expected. >>>> >>>> It looks like we will still have the option to keep around the old >>>> source / sink implementations. However, I think it still makes sense >>>> to remove FlinkSource and FlinkSink in the Flink 2.0 code path. This >>>> will simplify the code base and avoid confusion for users in the long >>>> run. I don't see real value in bringing over legacy sources / sinks to >>>> a new Flink major release. >>>> >>>> -Max >>>> >>>> On Tue, Mar 11, 2025 at 10:46 PM Steven Wu <stevenz...@gmail.com> >>>> wrote: >>>> > >>>> > I assume Flink 2.0 will remove the old source and sink interfaces. >>>> > >>>> > With the above assumption (please correct me if I am wrong), here is >>>> what I would suggest for the next Iceberg release (1.9 or whichever version >>>> that lands the Flink 2.0 support). >>>> > - Flink 2.0 module - remove the old source and sink impls. we have no >>>> choice here. >>>> > - Flink 1.x modules - update the deprecation note for old >>>> `FlinkSource` that will be removed in Flink 2.0 support. add a similar >>>> deprecation node for old `FlinkSink`. >>>> > >>>> > We shouldn't remove the old `FlinkSource` and `FlinkSink` for the >>>> Flink 1.x modules. Allow users to continue to use them with Flink 1.x. Note >>>> that Iceberg supports 3 active Flink versions (currently 1.18, 1.19, 1.20). >>>> So old versions will be naturally retired over time. As Flink progresses >>>> with 2.0, 2.1, 2.2 (etc.), we may want to keep the support for the last >>>> Flink 1.x version a little longer based on the community feedback. >>>> > >>>> > Thanks, >>>> > Steven >>>> > >>>> > On Tue, Mar 11, 2025 at 8:48 AM Jean-Baptiste Onofré <j...@nanthrax.net> >>>> wrote: >>>> >> >>>> >> Hi Max >>>> >> >>>> >> Yes, it makes sense. Maybe I can help on that as I'm working on Spec >>>> >> V3 support in Flink (I did the default value, and I'm checking the >>>> new >>>> >> types now). >>>> >> >>>> >> Please let me know :) >>>> >> >>>> >> Thanks, >>>> >> Regards >>>> >> JB >>>> >> >>>> >> On Tue, Mar 11, 2025 at 11:23 AM Maximilian Michels <m...@apache.org> >>>> wrote: >>>> >> > >>>> >> > Hi JB, >>>> >> > >>>> >> > Yes, Flink 2.0 support ideally should land in Iceberg 1.9. >>>> >> > >>>> >> > Hi Rodrigo, >>>> >> > >>>> >> > +1 on closing the gap between the two sink implementations: >>>> FlinkSink >>>> >> > and IcebergSink. Thanks for helping to close that gap! >>>> >> > >>>> >> > Hi Steven, >>>> >> > >>>> >> > Good point on the "Remove by Iceberg 2.0" claim stated in >>>> FlinkSource. >>>> >> > We can keep it until Iceberg 2.0 in the Flink 1.X paths, but I >>>> would >>>> >> > propose to remove FlinkSource earlier than Iceberg 2.0 for the >>>> Flink >>>> >> > 2.X path. The reason is that it is not used by default for many >>>> >> > Iceberg versions. I think it is unlikely that users jumping on >>>> Flink >>>> >> > 2.0 will want to use it. >>>> >> > >>>> >> > As for removing FlinkSink, you're right that IcebergSink is >>>> relatively >>>> >> > new as a replacement. If we can keep it in the process of porting >>>> the >>>> >> > code, we may keep it around longer, but I suggest deprecating it >>>> and >>>> >> > scheduling it for removal in the Iceberg release following 1.9. >>>> >> > >>>> >> > To summarize, I'm proposing the following: >>>> >> > >>>> >> > Iceberg 1.9 >>>> >> > - Remove FlinkSource in Flink 2.0 code path >>>> >> > - Deprecate FlinkSink for all supported Flink versions >>>> >> > >>>> >> > Iceberg > 1.9 >>>> >> > - Remove FlinkSource for all supported Flink versions >>>> >> > - Remove FlinkSink for all supported Flink versions >>>> >> > >>>> >> > Does that sound right? >>>> >> > >>>> >> > Cheers, >>>> >> > Max >>>> >> > >>>> >> > On Thu, Mar 6, 2025 at 9:10 PM Steven Wu <stevenz...@gmail.com> >>>> wrote: >>>> >> > > >>>> >> > > > if Flink 2.0 release is going to release the old source and >>>> sink interfaces, t >>>> >> > > >>>> >> > > typo above: "release the old" -> "delete the old" >>>> >> > > >>>> >> > > On Thu, Mar 6, 2025 at 12:08 PM Steven Wu <stevenz...@gmail.com> >>>> wrote: >>>> >> > >> >>>> >> > >> There was a previous thread for the Flink source. The Javadoc >>>> deprecation note for the old `FlinkSource` currently says it will be >>>> removed in Iceberg 2.0 release >>>> >> > >> >>>> https://lists.apache.org/thread/27kcvo3p86pysk9wrggq4vphzo03sv3l >>>> >> > >> >>>> >> > >> Now regarding deprecating and removing the old `FlinkSink` in >>>> favor of the new `IcebergSink`, the new `IcebergSink` was added 6 months >>>> ago and released in Iceberg 1.7 (current release is 1.8). Not sure how many >>>> users have got a chance to use it. Ideally, I would like to have the new >>>> `IcebergSink` bake longer with more users running it in the production >>>> environments. There is also the parity problems that Rod mentioned. >>>> >> > >> >>>> >> > >> However, if Flink 2.0 release is going to release the old >>>> source and sink interfaces, then we will have no choice and remove the old >>>> source and sink implementations earlier than we originally >>>> planned/preferred. >>>> >> > >> >>>> >> > >> >>>> >> > >> On Thu, Mar 6, 2025 at 8:40 AM Rodrigo Meneses < >>>> rmene...@gmail.com> wrote: >>>> >> > >>> >>>> >> > >>> Hi Max, +1 on that too. IcebergSink has been hanging around >>>> for a while now. We want to make sure we have the same features, though: >>>> >> > >>> >>>> >> > >>> I’ve got https://github.com/apache/iceberg/pull/12071 that >>>> adds the Range Distribution Mode to IcebergSink. It still needs a few >>>> recent bug fixes and features backported, but it should be ready soon. >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> On Thu, Mar 6, 2025 at 8:18 AM Jean-Baptiste Onofré < >>>> j...@nanthrax.net> wrote: >>>> >> > >>>> >>>> >> > >>>> Hi Max, >>>> >> > >>>> >>>> >> > >>>> I guess you are proposing to remove FlinkSink and the >>>> corresponding >>>> >> > >>>> FlinkSource as well. >>>> >> > >>>> >>>> >> > >>>> It makes to me as, I saw in the code, that both FlinkSink and >>>> >> > >>>> FlinkSource are deprecated for a while. >>>> >> > >>>> >>>> >> > >>>> So, +1 to remove it. >>>> >> > >>>> >>>> >> > >>>> Are you planning this for 1.9.0 ? >>>> >> > >>>> >>>> >> > >>>> Regards >>>> >> > >>>> JB >>>> >> > >>>> >>>> >> > >>>> On Thu, Mar 6, 2025 at 4:21 PM Maximilian Michels < >>>> m...@apache.org> wrote: >>>> >> > >>>> > >>>> >> > >>>> > Hi, >>>> >> > >>>> > >>>> >> > >>>> > Today there are two Flink write connectors in Iceberg: >>>> >> > >>>> > >>>> >> > >>>> > 1. FlinkSink (original sink, based on Flink legacy >>>> interfaces) >>>> >> > >>>> > 2. IcebergSink (newer version, based on modern Flink API) >>>> >> > >>>> > >>>> >> > >>>> > In terms of features, (1) is a subset of (2). >>>> >> > >>>> > >>>> >> > >>>> > I'm in the process of adding support for Flink 2.0. The >>>> interfaces >>>> >> > >>>> > used for (1) have been deprecated for several Flink >>>> versions and are >>>> >> > >>>> > removed / discouraged in Flink 2.0. >>>> >> > >>>> > >>>> >> > >>>> > Therefore, I would like to propose to remove FlinkSink for >>>> the Flink >>>> >> > >>>> > 2.0 Iceberg module. We have already deprecated FlinkSink >>>> for a while. >>>> >> > >>>> > >>>> >> > >>>> > Any objections? >>>> >> > >>>> > >>>> >> > >>>> > -Max >>>> >>>