Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Rui Fan
Congratulations, Xuannan!

Best,
Rui

On Sun, Aug 18, 2024 at 2:20 PM Leonard Xu  wrote:

> Congratulations!  Xuannan
>
>
> Best,
> Leonard
>
>


[jira] [Created] (FLINK-36083) Fix kafka table api doc's Connector Options table broken

2024-08-18 Thread Zhongqiang Gong (Jira)
Zhongqiang Gong created FLINK-36083:
---

 Summary: Fix kafka table api doc's Connector Options table broken
 Key: FLINK-36083
 URL: https://issues.apache.org/jira/browse/FLINK-36083
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-3.2.0
Reporter: Zhongqiang Gong
Assignee: Zhongqiang Gong
 Attachments: image-2024-08-18-15-26-13-109.png

!image-2024-08-18-15-26-13-109.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Feng Jin
Congratulations, Xuannan!

Best,
Feng


On Sun, Aug 18, 2024 at 3:08 PM Rui Fan <1996fan...@gmail.com> wrote:

> Congratulations, Xuannan!
>
> Best,
> Rui
>
> On Sun, Aug 18, 2024 at 2:20 PM Leonard Xu  wrote:
>
> > Congratulations!  Xuannan
> >
> >
> > Best,
> > Leonard
> >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Junrui Lee
Congratulations, Xuannan!

Best,
Junrui

On Sun, Aug 18, 2024 at 16:34, Feng Jin wrote:

> Congratulations, Xuannan!
>
> Best,
> Feng
>
>
> On Sun, Aug 18, 2024 at 3:08 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Congratulations, Xuannan!
> >
> > Best,
> > Rui
> >
> > On Sun, Aug 18, 2024 at 2:20 PM Leonard Xu  wrote:
> >
> > > Congratulations!  Xuannan
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > >
> >
>


Re: [VOTE] Apache Flink CDC Release 3.2.0, release candidate #0

2024-08-18 Thread Yanquan Lv
Hi Qingsheng, while testing I hit a NotSerializableException [1] that causes
a failure when Kafka is used as the pipeline sink in version 3.2.0.
I think it may be a blocker, as it happens during the job submission phase.

[1] https://issues.apache.org/jira/browse/FLINK-36082

On Thu, Aug 15, 2024 at 15:13, Qingsheng Ren wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #0 for the version 3.2.0 of
> Apache Flink CDC,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> **Release Overview**
>
> As an overview, the release consists of the following:
> a) Flink CDC source release to be deployed to dist.apache.org
> b) Maven artifacts to be deployed to the Maven Central Repository
>
> **Staging Areas to Review**
>
> The staging areas containing the above mentioned artifacts are as follows,
> for your review:
> * All artifacts for a) can be found in the corresponding dev repository at
> dist.apache.org [1], which are signed with the key with fingerprint
> A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> * All artifacts for b) can be found at the Apache Nexus Repository [3]
>
> Other links for your review:
> * JIRA release notes [4]
> * Source code tag "release-3.2.0-rc0" with commit hash
> c03938e8de46b2d00a5984467d0e9bdca4a1 [5]
> * PR for release announcement blog post of Flink CDC 3.2.0 in flink-web [6]
>
> **Vote Duration**
>
> The voting time will run for at least 72 hours.
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Qingsheng
>
> [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.2.0-rc0/
> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> [3] https://repository.apache.org/content/repositories/orgapacheflink-1753
> [4]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354594
> [5] https://github.com/apache/flink-cdc/tree/release-3.2.0-rc0
> [6] https://github.com/apache/flink-web/pull/753
>


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Aleksandr Pilipenko
Congratulations, Xuannan!

Best,
Aleksandr

On Sun, 18 Aug 2024 at 09:49, clouding.vip 
wrote:

> Congratulations, Xuannan!
>
>
>
>
> On Aug 18, 2024, at 16:39, Junrui Lee wrote:
>
> > Congratulations, Xuannan!
> >
> > Best,
> > Junrui
> >
> > On Sun, Aug 18, 2024 at 16:34, Feng Jin wrote:
> >
> > > Congratulations, Xuannan!
> > >
> > > Best,
> > > Feng
> > >
> > > On Sun, Aug 18, 2024 at 3:08 PM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > Congratulations, Xuannan!
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Sun, Aug 18, 2024 at 2:20 PM Leonard Xu wrote:
> > > >
> > > > > Congratulations! Xuannan
> > > > >
> > > > > Best,
> > > > > Leonard


[jira] [Created] (FLINK-36084) Optimize parquet binary getBytes with getBytesUnsafe to avoid copy cost

2024-08-18 Thread xy (Jira)
xy created FLINK-36084:
--

 Summary: Optimize parquet binary getBytes with getBytesUnsafe to 
avoid copy cost
 Key: FLINK-36084
 URL: https://issues.apache.org/jira/browse/FLINK-36084
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 2.0.0
Reporter: xy
 Fix For: 2.0.0


Optimize Parquet Binary getBytes by using getBytesUnsafe to avoid the copy cost.
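
The trade-off behind this request can be sketched in plain Java. This is a
minimal illustration under stated assumptions, not Parquet's implementation:
BinaryLike is a hypothetical stand-in class, and only the accessor names
getBytes and getBytesUnsafe are borrowed from the issue. The copying accessor
allocates a new array on every call, while the unsafe accessor hands out the
backing array directly, which is cheaper but only safe if the caller never
mutates or retains it beyond the lifetime of the underlying buffer.

```java
import java.util.Arrays;

class GetBytesDemo {

    // Hypothetical stand-in for a binary value backed by a byte array.
    // It is NOT Parquet's Binary class; only the accessor names are borrowed.
    static final class BinaryLike {
        private final byte[] backing;

        BinaryLike(byte[] backing) {
            this.backing = backing;
        }

        // Safe accessor: allocates and returns a fresh copy on every call.
        byte[] getBytes() {
            return Arrays.copyOf(backing, backing.length);
        }

        // Unsafe accessor: returns the backing array itself, no allocation or copy.
        byte[] getBytesUnsafe() {
            return backing;
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        BinaryLike value = new BinaryLike(data);

        // The copy has equal content but is a different array instance.
        System.out.println(value.getBytes() == data);       // false
        // The unsafe view is the very same array instance.
        System.out.println(value.getBytesUnsafe() == data); // true
    }
}
```

The unsafe variant is only appropriate where the returned bytes are consumed
immediately (for example, copied into another structure or decoded), since
later writes to the backing array would otherwise become visible to the caller.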



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36085) Refactor operation with LIKE/ILIKE

2024-08-18 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-36085:
---

 Summary: Refactor operation with LIKE/ILIKE
 Key: FLINK-36085
 URL: https://issues.apache.org/jira/browse/FLINK-36085
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


The current implementation has several issues:
1. Every operation has to implement its own support for {{LIKE}} and {{ILIKE}},
although the common logic could be extracted and reused.
2. Some operations generate invalid SQL for asSummaryString, and the format
differs between show operations.

So the logic could be generalized for these {{SQL SHOW}} operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36086) SQL Server source connector skips changes if restarted mid transaction

2024-08-18 Thread Sergei Morozov (Jira)
Sergei Morozov created FLINK-36086:
--

 Summary: SQL Server source connector skips changes if restarted 
mid transaction
 Key: FLINK-36086
 URL: https://issues.apache.org/jira/browse/FLINK-36086
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: Sergei Morozov


If the SQL Server source connector is restarted while handling updates from a 
transaction with multiple updates, upon restart, it will skip the non-processed 
changes and proceed from the next transaction.

This is an analog of [DBZ-1128|https://issues.redhat.com/browse/DBZ-1128] but 
reproducible only in Flink CDC.

This is a regression introduced in 
[apache/flink-cdc#2176|https://github.com/apache/flink-cdc/pull/2176].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Luke Chen
Congratulations, Xuannan!

Luke

On Sun, Aug 18, 2024 at 6:53 PM Aleksandr Pilipenko 
wrote:

> Congratulations, Xuannan!
>
> Best,
> Aleksandr
>
> On Sun, 18 Aug 2024 at 09:49, clouding.vip 
> wrote:
>
> > Congratulations, Xuannan!
>


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Guowei Ma
Congratulations!

Best,
Guowei


On Mon, Aug 19, 2024 at 8:24 AM Luke Chen  wrote:

> Congratulations, Xuannan!
>
> Luke
>
> On Sun, Aug 18, 2024 at 6:53 PM Aleksandr Pilipenko 
> wrote:
>
> > Congratulations, Xuannan!
> >
> > Best,
> > Aleksandr
> >
> > On Sun, 18 Aug 2024 at 09:49, clouding.vip 
> > wrote:
> >
> > > Congratulations, Xuannan!
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Xuannan Su

2024-08-18 Thread Jiabao Sun
Congratulations, Xuannan!

Best,
Jiabao

On Mon, Aug 19, 2024 at 09:21, Guowei Ma wrote:

> Congratulations!
>
> Best,
> Guowei
>
>
> On Mon, Aug 19, 2024 at 8:24 AM Luke Chen  wrote:
>
> > Congratulations, Xuannan!
> >
> > Luke
> >
> > On Sun, Aug 18, 2024 at 6:53 PM Aleksandr Pilipenko 
> > wrote:
> >
> > > Congratulations, Xuannan!
> > >
> > > Best,
> > > Aleksandr
> > >
> > > > On Sun, 18 Aug 2024 at 09:49, clouding.vip
> > > > wrote:
> > >
> > > > Congratulations, Xuannan!
> > >
> >
>


[jira] [Created] (FLINK-36087) Fix

2024-08-18 Thread LvYanquan (Jira)
LvYanquan created FLINK-36087:
-

 Summary: Fix
 Key: FLINK-36087
 URL: https://issues.apache.org/jira/browse/FLINK-36087
 Project: Flink
  Issue Type: Bug
Reporter: LvYanquan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36088) Fix NullPointerException in PaimonDataSink.

2024-08-18 Thread LvYanquan (Jira)
LvYanquan created FLINK-36088:
-

 Summary: Fix NullPointerException in PaimonDataSink.
 Key: FLINK-36088
 URL: https://issues.apache.org/jira/browse/FLINK-36088
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: LvYanquan
 Fix For: cdc-3.2.0


Fix NullPointerException in BucketAssignOperator when trying to get Schema info.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re:Re: Re: Re:Re: [DISCUSS] FLIP-473: Introduce New SQL Operators Based on Asynchronous State APIs

2024-08-18 Thread ron
Hi, Xuyang

Thanks for your proposal. It looks good to me overall, +1.

Best,
Ron


> -----Original Message-----
> From: Xuyang
> Sent: 2024-08-14 15:20:59 (Wednesday)
> To: dev@flink.apache.org
> Subject: Re:Re: Re: Re:Re: [DISCUSS] FLIP-473: Introduce New SQL Operators Based
> on Asynchronous State APIs
> 
> Hi, Feng. Thank you for your feedback. 
> 
> I completely agree that these tests are crucial. I have updated the Roadmap
> section of the FLIP to include plans for harness tests and IT tests for data
> correctness, state compatibility tests, and performance regression tests.
> 
> Regarding the usage of the APIs `thenCombine` and `combineAll`, I have also
> added a concluding note that there is no significant performance difference
> between these two, so either one can be used interchangeably.
> 
> 
> 
> 
> --
> 
> Best!
> Xuyang
> 
> 
> 
> 
> 
> At 2024-08-13 17:10:32, "Feng Jin"  wrote:
> >Hi, Xuyang
> >
> >Thank you for initiating this FLIP. I believe this is a significant feature
> >for the future of Flink SQL, but I also share concerns about the
> >maintenance costs related to the correctness, state compatibility, and
> >performance of the two implementations.
> >I fully support ensuring state compatibility, functional correctness, and
> >performance regression detection through HarnessTests and IT Tests.
> >1. I think relevant testing is a critical part. Is there a more detailed
> >design plan for the HarnessTests and performance regression test?
> >2. Regarding the final performance comparison between thenCombine and
> >combineAll, is there a more specific conclusion? Which one should we use,
> >or are both options viable?
> >
> >Best,
> >Feng.
> >
> >On Mon, Aug 12, 2024 at 11:23 AM Xuyang  wrote:
> >
> >> Hi, David. Thank you for your review. Let me address your questions:
> >>
> >> >1. Is there a way to enforce this as an invariant during build time,
> >> >perhaps through a generic test framework that switches between the sync
> >> and
> >> >async versions of all operators and verifies checkpoint compatibility?
> >>
> >> Yes, we need to incorporate a corresponding testing framework to verify
> >> the state compatibility during transitions between sync and async state
> >> operators. In my proposal, it resembles the existing RestoreTestBase,
> >> with the testing logic structured as follows:
> >> a. Start with the sync state operator to consume data;
> >> b. Execute a checkpoint;
> >> c. Restart with the async state operator and recover data from the
> >> checkpoint for re-consumption;
> >> d. Validate the correctness of the results.
> >> Additionally, we could also consider scenarios where the async state
> >> operator starts consuming data initially, followed by a restart with the
> >> sync state operator. I have updated this part in the `TEST PLAN` section
> >> of the FLIP.
> >>
> >> >2. If the only difference between them is state handling, could they
> >> >potentially be implemented as the same operator with two different
> >> >interfaces? My main concern is code reuse—it’s crucial to avoid
> >> duplicating
> >> >code to ensure both implementations stay aligned. Additionally, could
> >> >feature parity be verified at the test suite level (similar to the first
> >> >question)? Perhaps we could create a single parameterized test suite that
> >> >runs against both versions?
> >>
> >> IIUC, your focus aligns with the roadmap’s mention of “Refactoring the
> >> sync and async state operators, leveraging shared logical calculations
> >> while abstracting the state access details.” Due to the intricate details
> >> of the code implementation, this was not elaborated in the FLIP. I share
> >> your vision of designing reusable business logic classes (such as
> >> JoinHelper) alongside different operator interfaces (SyncStateJoinOperator
> >> and AsyncStateJoinOperator), consolidating the reusable logic within the
> >> class JoinHelper.
> >>
> >> For synchronous operators, we already have harness tests to validate data
> >> correctness, and there will also be dedicated harness tests for
> >> asynchronous operators. Using a parameterized test suite for both
> >> harnesses is indeed feasible.
> >>
> >>
> >>
> >>
> >> --
> >>
> >> Best!
> >> Xuyang
> >>
> >>
> >>
> >>
> >>
> >> At 2024-08-09 20:49:49, "David Morávek" wrote:
> >> >Hi Xuyang,
> >> >
> >> >Thank you for looking into this—great work! The overall direction seems
> >> >solid. I have two minor questions:
> >> >
> >> >In theory, the implementation of AsyncStateOperator and SyncStateOperator
> >> >> differs only in their state handling. Their state schemas, business
> >> logic,
> >> >> and other aspects remain the same. Therefore, within the same Flink
> >> >> version, when the SQL and other Flink configurations remain unchanged,
> >> or
> >> >> when using the same compiled plan, users can freely switch between
> >> >> AsyncStateOperator and SyncStateOperator by toggli

Re: [VOTE] Apache Flink CDC Release 3.2.0, release candidate #0

2024-08-18 Thread Yanquan Lv
-1, as there are two reliably reproducible problems in the main connectors.
I hit a NotSerializableException that blocks users from using Kafka as a
pipeline sink [1].
I hit a NullPointerException that always leads to failure when a job is
restarted with Paimon as the pipeline sink [2].

[1] https://issues.apache.org/jira/browse/FLINK-36082
[2] https://issues.apache.org/jira/browse/FLINK-36088

On Sun, Aug 18, 2024 at 17:43, Yanquan Lv wrote:

> Hi Qingsheng, I've tested and met a NotSerializableException[1] that will
> lead to failure when using Kafka as pipeline sink in 3.2.0 version.
> I think it may be a blocker as this happened during the submission phase.
>
> [1] https://issues.apache.org/jira/browse/FLINK-36082
>
> On Thu, Aug 15, 2024 at 15:13, Qingsheng Ren wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #0 for the version 3.2.0
>> of
>> Apache Flink CDC,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>> **Release Overview**
>>
>> As an overview, the release consists of the following:
>> a) Flink CDC source release to be deployed to dist.apache.org
>> b) Maven artifacts to be deployed to the Maven Central Repository
>>
>> **Staging Areas to Review**
>>
>> The staging areas containing the above mentioned artifacts are as follows,
>> for your review:
>> * All artifacts for a) can be found in the corresponding dev repository at
>> dist.apache.org [1], which are signed with the key with fingerprint
>> A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
>> * All artifacts for b) can be found at the Apache Nexus Repository [3]
>>
>> Other links for your review:
>> * JIRA release notes [4]
>> * Source code tag "release-3.2.0-rc0" with commit hash
>> c03938e8de46b2d00a5984467d0e9bdca4a1 [5]
>> * PR for release announcement blog post of Flink CDC 3.2.0 in flink-web
>> [6]
>>
>> **Vote Duration**
>>
>> The voting time will run for at least 72 hours.
>> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> Qingsheng
>>
>> [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.2.0-rc0/
>> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [3]
>> https://repository.apache.org/content/repositories/orgapacheflink-1753
>> [4]
>>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354594
>> [5] https://github.com/apache/flink-cdc/tree/release-3.2.0-rc0
>> [6] https://github.com/apache/flink-web/pull/753
>>
>


Re: [VOTE] Apache Flink CDC Release 3.2.0, release candidate #0

2024-08-18 Thread Leonard Xu
Thanks, Yanquan, for the verification and for raising the issues. I think we
need to cancel this RC; we’ll prepare RC1 after the two issues are fixed.

CC:QingSheng

Best,
Leonard


> On Aug 19, 2024, at 10:28 AM, Yanquan Lv wrote:
> 
> -1 as there are two problems of stable reproduction for main connectors.
> Met a NotSerializableException that block user to use Kafka as pipeline
> sink[1].
> Met a NoPointException that will always lead to failure when job restarted
> using Paimon as pipeline sink[2].
> 
> [1] https://issues.apache.org/jira/browse/FLINK-36082
> [2] https://issues.apache.org/jira/browse/FLINK-36088
> 
> On Sun, Aug 18, 2024 at 17:43, Yanquan Lv wrote:
> 
>> Hi Qingsheng, I've tested and met a NotSerializableException[1] that will
>> lead to failure when using Kafka as pipeline sink in 3.2.0 version.
>> I think it may be a blocker as this happened during the submission phase.
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-36082
>> 
>> On Thu, Aug 15, 2024 at 15:13, Qingsheng Ren wrote:
>> 
>>> Hi everyone,
>>> 
>>> Please review and vote on the release candidate #0 for the version 3.2.0
>>> of
>>> Apache Flink CDC,
>>> as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>> 
>>> **Release Overview**
>>> 
>>> As an overview, the release consists of the following:
>>> a) Flink CDC source release to be deployed to dist.apache.org
>>> b) Maven artifacts to be deployed to the Maven Central Repository
>>> 
>>> **Staging Areas to Review**
>>> 
>>> The staging areas containing the above mentioned artifacts are as follows,
>>> for your review:
>>> * All artifacts for a) can be found in the corresponding dev repository at
>>> dist.apache.org [1], which are signed with the key with fingerprint
>>> A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
>>> * All artifacts for b) can be found at the Apache Nexus Repository [3]
>>> 
>>> Other links for your review:
>>> * JIRA release notes [4]
>>> * Source code tag "release-3.2.0-rc0" with commit hash
>>> c03938e8de46b2d00a5984467d0e9bdca4a1 [5]
>>> * PR for release announcement blog post of Flink CDC 3.2.0 in flink-web
>>> [6]
>>> 
>>> **Vote Duration**
>>> 
>>> The voting time will run for at least 72 hours.
>>> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>>> 
>>> Thanks,
>>> Qingsheng
>>> 
>>> [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.2.0-rc0/
>>> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> [3]
>>> https://repository.apache.org/content/repositories/orgapacheflink-1753
>>> [4]
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354594
>>> [5] https://github.com/apache/flink-cdc/tree/release-3.2.0-rc0
>>> [6] https://github.com/apache/flink-web/pull/753
>>> 
>> 



[jira] [Created] (FLINK-36089) clean codegen print function

2024-08-18 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-36089:
-

 Summary: clean codegen print function
 Key: FLINK-36089
 URL: https://issues.apache.org/jira/browse/FLINK-36089
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0


Flink has a new printf implementation (see
https://issues.apache.org/jira/browse/FLINK-35920)
that aligns with Spark/Hive, so the original one can be removed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Apache Flink CDC Release 3.2.0, release candidate #0

2024-08-18 Thread Qingsheng Ren
Thanks for testing the RC! I have marked these two issues as blockers.

This RC is now cancelled. A new release candidate will be built once these
two issues are resolved.

Best,
Qingsheng


On Mon, Aug 19, 2024 at 10:32 AM Leonard Xu  wrote:

> Thanks Yanquan for the verification and raise up the issues, I think we
> need cancel this RC, we’ll prepare the RC1 after fixed the two issues.
>
> CC:QingSheng
>
> Best,
> Leonard
>
>
> > On Aug 19, 2024, at 10:28 AM, Yanquan Lv wrote:
> >
> > -1 as there are two problems of stable reproduction for main connectors.
> > Met a NotSerializableException that block user to use Kafka as pipeline
> > sink[1].
> > Met a NoPointException that will always lead to failure when job
> restarted
> > using Paimon as pipeline sink[2].
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-36082
> > [2] https://issues.apache.org/jira/browse/FLINK-36088
> >
> > On Sun, Aug 18, 2024 at 17:43, Yanquan Lv wrote:
> >
> >> Hi Qingsheng, I've tested and met a NotSerializableException[1] that
> will
> >> lead to failure when using Kafka as pipeline sink in 3.2.0 version.
> >> I think it may be a blocker as this happened during the submission
> phase.
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-36082
> >>
> >> On Thu, Aug 15, 2024 at 15:13, Qingsheng Ren wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> Please review and vote on the release candidate #0 for the version
> 3.2.0
> >>> of
> >>> Apache Flink CDC,
> >>> as follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>> **Release Overview**
> >>>
> >>> As an overview, the release consists of the following:
> >>> a) Flink CDC source release to be deployed to dist.apache.org
> >>> b) Maven artifacts to be deployed to the Maven Central Repository
> >>>
> >>> **Staging Areas to Review**
> >>>
> >>> The staging areas containing the above mentioned artifacts are as
> follows,
> >>> for your review:
> >>> * All artifacts for a) can be found in the corresponding dev
> repository at
> >>> dist.apache.org [1], which are signed with the key with fingerprint
> >>> A1BD477F79D036D2C30CA7DBCA8AEEC2F6EB040B [2]
> >>> * All artifacts for b) can be found at the Apache Nexus Repository [3]
> >>>
> >>> Other links for your review:
> >>> * JIRA release notes [4]
> >>> * Source code tag "release-3.2.0-rc0" with commit hash
> >>> c03938e8de46b2d00a5984467d0e9bdca4a1 [5]
> >>> * PR for release announcement blog post of Flink CDC 3.2.0 in flink-web
> >>> [6]
> >>>
> >>> **Vote Duration**
> >>>
> >>> The voting time will run for at least 72 hours.
> >>> It is adopted by majority approval, with at least 3 PMC affirmative
> votes.
> >>>
> >>> Thanks,
> >>> Qingsheng
> >>>
> >>> [1] https://dist.apache.org/repos/dist/dev/flink/flink-cdc-3.2.0-rc0/
> >>> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> >>> [3]
> >>> https://repository.apache.org/content/repositories/orgapacheflink-1753
> >>> [4]
> >>>
> >>>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354594
> >>> [5] https://github.com/apache/flink-cdc/tree/release-3.2.0-rc0
> >>> [6] https://github.com/apache/flink-web/pull/753
> >>>
> >>
>
>


[jira] [Created] (FLINK-36090) Bug with IngestDB restore operation for priority queue state in backend

2024-08-18 Thread Maxim Vershinin (Jira)
Maxim Vershinin created FLINK-36090:
---

 Summary: Bug with IngestDB restore operation for priority queue 
state in backend
 Key: FLINK-36090
 URL: https://issues.apache.org/jira/browse/FLINK-36090
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 2.0.0
Reporter: Maxim Vershinin


*Summary:* Incorrect handling of priority queue states in IngestDB during 
restoring due to missing {{equals()}} and {{hashCode()}} methods in 
{{RegisteredPriorityQueueStateBackendMetaInfo}}.

*Problem Description:*

During the restoration of IngestDB in my Flink project, an issue was identified 
where the priority queue states are not managed correctly in the backend. The 
problem stems from the absence of {{equals()}} and {{hashCode()}} methods in 
the {{RegisteredPriorityQueueStateBackendMetaInfo}} class.

In particular, within the {{exportColumnFamiliesWithSstDataInKeyGroupsRange}} 
method of the {{RocksDBIncrementalRestoreOperation}} class, if the state is a 
priority queue, identical states from different subtasks are erroneously 
treated as distinct states within the {{exportedColumnFamiliesOut}} map. This 
leads to inconsistent behavior and errors during the restoration process.

*Proposed Solution:*

To address this issue, add {{equals()}} and {{hashCode()}} methods to the 
{{RegisteredPriorityQueueStateBackendMetaInfo}} class. Implementing these 
methods will ensure that priority queue states are accurately recognized and 
handled across different subtasks, thereby preventing errors during IngestDB 
restoration.
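
The effect described above can be reproduced with a small, self-contained
sketch. The two classes below are hypothetical stand-ins, not Flink's
RegisteredPriorityQueueStateBackendMetaInfo, and the "timers" state name is
made up for illustration. The point is only that a HashMap keyed by objects
without equals()/hashCode() falls back to identity comparison, so two
logically identical meta infos coming from different subtasks end up as two
separate map entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MetaInfoKeyDemo {

    // Hypothetical meta info WITHOUT equals()/hashCode(): identity semantics.
    static final class MetaInfoWithoutEquals {
        final String stateName;

        MetaInfoWithoutEquals(String stateName) {
            this.stateName = stateName;
        }
    }

    // Hypothetical meta info WITH value-based equals()/hashCode().
    static final class MetaInfoWithEquals {
        final String stateName;

        MetaInfoWithEquals(String stateName) {
            this.stateName = stateName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof MetaInfoWithEquals)) {
                return false;
            }
            return stateName.equals(((MetaInfoWithEquals) o).stateName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(stateName);
        }
    }

    // Two subtasks register the same logical state; keys compare by identity.
    static int distinctKeysWithoutEquals() {
        Map<MetaInfoWithoutEquals, Integer> exported = new HashMap<>();
        exported.merge(new MetaInfoWithoutEquals("timers"), 1, Integer::sum);
        exported.merge(new MetaInfoWithoutEquals("timers"), 1, Integer::sum);
        return exported.size();
    }

    // Same scenario, but keys compare by value and collapse into one entry.
    static int distinctKeysWithEquals() {
        Map<MetaInfoWithEquals, Integer> exported = new HashMap<>();
        exported.merge(new MetaInfoWithEquals("timers"), 1, Integer::sum);
        exported.merge(new MetaInfoWithEquals("timers"), 1, Integer::sum);
        return exported.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctKeysWithoutEquals()); // 2
        System.out.println(distinctKeysWithEquals());    // 1
    }
}
```

In the real class, the fields that take part in equals() and hashCode() would
have to be chosen by the maintainers; the single stateName field here is an
assumption made to keep the sketch short.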

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36091) Config passed in via the environment variable FLINK_PROPERTIES which has space not getting processed

2024-08-18 Thread lajith (Jira)
lajith created FLINK-36091:
--

 Summary: Config passed in via the environment variable 
FLINK_PROPERTIES which has space not getting processed
 Key: FLINK-36091
 URL: https://issues.apache.org/jira/browse/FLINK-36091
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib, flink-docker
Affects Versions: 1.19.1
Reporter: lajith


With Flink 1.19, a config value passed in via the environment variable 
FLINK_PROPERTIES that contains a space is not processed correctly, because the 
space gets trimmed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-474: Store operator name and UID in state metadata

2024-08-18 Thread Gabor Somogyi
Hi All,

Based on our agreement I've created a draft Flink state observability
umbrella [1].
Please share your comments. It contains some details to give some insights
but the focus would be on the direction.

[1]
https://docs.google.com/document/d/1Du1-TShoOjaNDCahs3sgLWIpYkXzJPdSkgHcWLpELyw/edit

BR,
G


On Sat, Aug 10, 2024 at 10:54 AM Zakelly Lan  wrote:

> Hi Gabor,
>
> I apologize for any confusion. Let me clarify my position.
>
> The concept of state observability is important for users, and the current
> FLIP seems to be a step in the right direction. However, before we proceed,
> I suggest we discuss the final presentation of the state observability to
> the user and consider the high-level vision for achieving this. It's
> essential to ensure that the current FLIP aligns with the overall
> objective. I'm not suggesting a comprehensive FLIP to address all the
> missing pieces, and one FLIP for each piece is fine for me. I just want to
> ensure that we are on the same page in terms of vision. The last thing I
> want is a fragmented approach resulting in refactoring or deprecation of
> code when we need a complete feature.
>
> Actually, I would hesitate about the current proposal of adding uid *in*
> the state metadata. It may cause state incompatibility issues across
> versions. In theory we can do this but it is better not if we are adding
> data not for fault tolerance but only for human readability. And it could
> be worse if we add one or two columns sporadically in future.
>
> In fact, I expect the state metadata store to exist next to the checkpoint
> metadata, rather than within it. This gives us enough flexibility to polish
> this function as users need it, and without breaking checkpoint
> compatibility too often. Or moreover we don't have to stick to the form of
> checkpoint and we could choose a more human readable format like json for
> the metadata store. This is where I think this FLIP is inconsistent with my
> expectation of the state observability approach. These considerations
> deserve a discussion before proceeding with other details. WDYT?
>
>
> Best,
> Zakelly
>
> On Fri, Aug 9, 2024 at 8:22 PM Gabor Somogyi 
> wrote:
>
> > Hi David,
> >
> > Thanks for sharing your thoughts!
> >
> > > It sounds like you might already have an end-to-end solution in mind.
> It
> > would be really helpful if you could put that into writing so we can all
> > align our thinking.
> >
> > It makes sense to create a high level vision.
> >
> > > I’m not a fan of the mindset of “this is how it was done in Spark, so
> > we’ll
> > just replicate it” without proper discussion. We’ve had similar
> > conversations before.
> >
> > I think we've had this conversation already in case of delegation token
> > framework
> > and I can say the same. No intention to take over things blindly but it's
> > not a shame
> > to be inspired by solutions which are welcome by users.
> > The intention is similar just like in scalable authentication area where
> > Flink is now ahead of Spark.
> >
> > > Would it be too much to ask for a FLIP that outlines the overall vision
> > (without delving too deeply into the details) to ensure everyone is
> aligned
> > and moving in the same direction?
> >
> > That's a fair point and a constructive way how we can proceed.
> > I'm going to come back with the details...
> >
> > BR,
> > G
> >
> >
> > On Fri, Aug 9, 2024 at 1:36 PM David Morávek  wrote:
> >
> > > Hi Gabor,
> > >
> > > Thanks for taking the initiative on this. It’s clear that significant
> > > improvements are needed in this area, and parsing state files can be
> > > incredibly challenging, even for those who are well-versed in it.
> > >
> > > > Just to make it crystal clear, I’m not shooting for an ad-hoc tiny
> fix
> > > but started a path where we fill each and every gap which will end up
> in
> > a
> > > functionality and UX bar just like the Spark solution.
> > >
> > > It sounds like you might already have an end-to-end solution in mind.
> It
> > > would be really helpful if you could put that into writing so we can
> all
> > > align our thinking.
> > >
> > > I’m not a fan of the mindset of “this is how it was done in Spark, so
> > we’ll
> > > just replicate it” without proper discussion. We’ve had similar
> > > conversations before.
> > >
> > > > But this doesn’t mean we create a single giga big FLIP after several
> > > months of discussion.
> > >
> > > I don’t think anyone is asking for a massive FLIP after lengthy
> > > discussions, but having a document that outlines the overall vision
> could
> > > be incredibly valuable, especially in a distributed setting. It also
> > opens
> > > the door for others to contribute to and shape this shared vision,
> which
> > is
> > > a core principle of community-driven open-source development.
> > >
> > > Would it be too much to ask for a FLIP that outlines the overall vision
> > > (without delving too deeply into the details) to ensure everyone is
> > aligned
> > > and moving in the