> My only concern is doing this only for Flink 1.20. If this is only a
> single default value change, I'm fine with it.

It is one config change plus a Javadoc and @deprecated change. It is very
minimal.

I don't see the benefit outweighing the state incompatibility of the switch
if we also make the change for Flink 1.18 and 1.19 in the Iceberg 1.7
release. Hence, I would suggest only making the change for Flink 1.20.



On Mon, Aug 12, 2024 at 4:38 AM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Thanks Steven for driving this!
>
> I'm very much for deprecating FlinkSource in favor of IcebergSource.
> My only concern is doing this only for Flink 1.20. If this is only a
> single default value change, I'm fine with it. OTOH having bigger
> differences between the source of the different Flink versions would cause
> more maintenance headache in the future for a minimal gain.
>
> I understand that Flink "natively" doesn't guarantee state compatibility
> between major/minor versions. If needed, I suggest that we mirror this with
> the Iceberg connector, and use documentation to highlight the change for
> the users between Iceberg 1.6 and Iceberg 1.7.
>
> Thanks,
> Peter
>
> On Mon, Aug 12, 2024 at 10:12 AM Fokko Driesprong <fo...@apache.org>
> wrote:
>
>> Hey Steven,
>>
>> That sounds very exciting! I'm not a heavy Flink user, but I don't see
>> any issues enabling it on Flink 1.20. We should make it explicit in the
>> changelog, and if possible give some hints on how to drain the Flink jobs.
>>
>> Kind regards,
>> Fokko
>>
>> On Mon, Aug 12, 2024 at 4:57 AM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>>
>>> *What*
>>>
>>> In the next Iceberg 1.7 release with Flink 1.20 support [1], I
>>> am proposing to make the following changes for *Flink 1.20 only*.
>>>
>>> 1. Mark the old `FlinkSource` as deprecated and redirect users to the
>>> FLIP-27 `IcebergSource` in the Javadoc.
>>>
>>> 2. Make the FLIP-27 source the default for Flink SQL. Users can still
>>> switch back to the old source via config if needed. Because the source
>>> implementation and checkpoint state change, users won't be able to restore
>>> from a checkpoint/savepoint when upgrading to Flink 1.20 and Iceberg 1.7. As
>>> Flink doesn't guarantee state compatibility across major/minor Flink
>>> version upgrades, e.g. from 1.19 to 1.20 [12], this should be acceptable
>>> to Flink SQL users. We should clearly call out the change and state
>>> incompatibility in the release notes.
>>>
>>> *Why*
>>>
>>> FLIP-27 is the new source interface introduced by Flink in early 2021.
>>> The new FLIP-27 `IcebergSource` implementation [2] was added to Iceberg
>>> around mid-2022. It was initially added as @Experimental and requires a
>>> code change to switch to the new API. For Flink SQL jobs, the default is
>>> still the old `FlinkSource` implementation, and a config change is required
>>> to opt in to the FLIP-27 `IcebergSource`.
>>>
>>> It has been two years since the initial introduction of the FLIP-27 source
>>> implementation in Iceberg. Now is probably a good time to switch the
>>> default to the FLIP-27 source.
>>>
>>> 1. The community has continued to improve the FLIP-27 source, such as a JSON
>>> serializer for FileScanTask [3], split discovery throttling [4], watermark
>>> alignment [5], split enumerator monitoring metrics [6], metadata table
>>> reading [8], and speculative execution [9]. Those improvements are not
>>> available in the old source implementation.
>>> 2. We have recently closed the remaining gaps, such as limit pushdown [10]
>>> and inferring source parallelism [11] for batch execution, to achieve
>>> feature parity between the old source and the new FLIP-27 source.
>>> 3. The FLIP-27 source has been used by many users in production
>>> environments for almost two years now. It has been battle tested.
>>> 4. The old SourceFunction interface has been marked as deprecated since
>>> Flink 1.18 in Aug 2023 [7].
>>>
>>>
>>> *References*
>>> [1] https://github.com/apache/iceberg/pull/10881
>>> [2] https://github.com/apache/iceberg/projects/23
>>> [3] https://github.com/apache/iceberg/issues/1698
>>> [4] https://github.com/apache/iceberg/pull/6299
>>> [5] https://github.com/apache/iceberg/pull/8553
>>> [6] https://github.com/apache/iceberg/pull/9524
>>> [7] https://issues.apache.org/jira/browse/FLINK-28046
>>> [8] https://github.com/apache/iceberg/pull/6222
>>> [9] https://github.com/apache/iceberg/pull/10548
>>> [10] https://github.com/apache/iceberg/pull/10748
>>> [11] https://github.com/apache/iceberg/pull/10832
>>> [12]
>>> https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/upgrading/#table-api--sql
>>>
>>>
