Cool +1 from me then.

On Mon, Aug 12, 2024 at 17:56, Steven Wu <stevenz...@gmail.com> wrote:

> > My only concern is doing this only for Flink 1.20. If this is only a
> single default value change, I'm fine with it.
>
> It is one config change plus a Javadoc and @deprecated change. It is very
> minimal.
>
> I don't see the benefit outweighing the state incompatibility of the
> switch if we also make the change for Flink 1.18 and 1.19 in the Iceberg
> 1.7 release. Hence, I would suggest only making the change for Flink 1.20.
>
>
>
> On Mon, Aug 12, 2024 at 4:38 AM Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> Thanks Steven for driving this!
>>
>> I'm very much for deprecating FlinkSource for IcebergSource.
>> My only concern is doing this only for Flink 1.20. If this is only a
>> single default value change, I'm fine with it. OTOH, having bigger
>> differences between the sources for the different Flink versions would
>> cause more maintenance headaches in the future for minimal gain.
>>
>> I understand that Flink "natively" doesn't guarantee state compatibility
>> between major/minor versions. If needed, I suggest that we mirror this with
>> the Iceberg connector, and use documentation to highlight the change for
>> the users between Iceberg 1.6 and Iceberg 1.7.
>>
>> Thanks,
>> Peter
>>
>> On Mon, Aug 12, 2024 at 10:12, Fokko Driesprong <fo...@apache.org> wrote:
>>
>>> Hey Steven,
>>>
>>> That sounds very exciting! I'm not a heavy Flink user, but I don't see
>>> any issues enabling it on Flink 1.20. We should make it explicit in the
>>> changelog, and if possible give some hints on how to drain the Flink jobs.
>>>
>>> Kind regards,
>>> Fokko
>>>
>>> On Mon, Aug 12, 2024 at 04:57, Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>>
>>>> *What*
>>>>
>>>> In the next Iceberg 1.7 release with Flink 1.20 support [1], I
>>>> am proposing to make the following changes for *Flink 1.20 only*.
>>>>
>>>> 1. Mark the old `FlinkSource` as deprecated and redirect users to the
>>>> FLIP-27 `IcebergSource` in the Javadoc.
>>>>
>>>> 2. Make the FLIP-27 source the default for Flink SQL. Users can still
>>>> switch back to the old source via config if needed. Because the source
>>>> implementation and its checkpoint state change, users won't be able to
>>>> restore from a checkpoint/savepoint when upgrading to Flink 1.20 and
>>>> Iceberg 1.7. As Flink doesn't guarantee state compatibility for upgrades
>>>> to a new major or minor Flink version, e.g. from 1.19 to 1.20 [12], this
>>>> should be acceptable to Flink SQL users. We should clearly call out the
>>>> change and the state incompatibility in the release notes.
>>>>
>>>> *Why*
>>>>
>>>> FLIP-27 is the new source interface introduced by Flink in early 2021.
>>>> The new FLIP-27 `IcebergSource` implementation [2] was added to Iceberg
>>>> around mid-2022. It was initially added as @Experimental and requires a
>>>> code change to switch to the new API. For Flink SQL jobs, the default is
>>>> still the old `FlinkSource` implementation, and a config change is
>>>> required to opt in to the FLIP-27 `IcebergSource`.
>>>>
>>>> It has been two years since the FLIP-27 source implementation was first
>>>> introduced in Iceberg. Now is probably a good time to switch the
>>>> default to the FLIP-27 source.
>>>>
>>>> 1. The community has continued to improve the FLIP-27 source, with a JSON
>>>> serializer for FileScanTask [3], split discovery throttling [4], watermark
>>>> alignment [5], split enumerator monitoring metrics [6], metadata table
>>>> reading [8], and speculative execution [9]. Those improvements are not
>>>> available in the old source implementation.
>>>> 2. We have recently closed the remaining gaps, such as limit pushdown [10]
>>>> and inferring source parallelism [11] for batch execution, to achieve
>>>> feature parity between the old source and the new FLIP-27 source.
>>>> 3. The FLIP-27 source has been used by many users in production
>>>> environments for almost two years now. It has been battle-tested.
>>>> 4. The old SourceFunction interface has been marked as deprecated since
>>>> Flink 1.18, in Aug 2023 [7].
>>>>
>>>>
>>>> *References*
>>>> [1] https://github.com/apache/iceberg/pull/10881
>>>> [2] https://github.com/apache/iceberg/projects/23
>>>> [3] https://github.com/apache/iceberg/issues/1698
>>>> [4] https://github.com/apache/iceberg/pull/6299
>>>> [5] https://github.com/apache/iceberg/pull/8553
>>>> [6] https://github.com/apache/iceberg/pull/9524
>>>> [7] https://issues.apache.org/jira/browse/FLINK-28046
>>>> [8] https://github.com/apache/iceberg/pull/6222
>>>> [9] https://github.com/apache/iceberg/pull/10548
>>>> [10] https://github.com/apache/iceberg/pull/10748
>>>> [11] https://github.com/apache/iceberg/pull/10832
>>>> [12] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/upgrading/#table-api--sql
>>>>
>>>>
