RE: [ANNOUNCE] New Apache Flink PMC Member - Sergey Nuyanzin

2025-04-07 Thread David Radley
Congratulations - well deserved, Sergey. I really appreciate your deep knowledge,
prolific contributions, support and community-minded attitude.

Kind regards David.

From: Arvid Heise 
Date: Monday, 7 April 2025 at 07:54
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink PMC Member - Sergey Nuyanzin
Congratz!

On Mon, Apr 7, 2025 at 6:44 AM Ferenc Csaky 
wrote:

> Congrats, Sergey!
>
> Best,
> Ferenc
>
>
> On Monday, April 7th, 2025 at 03:30, weijie guo 
> wrote:
>
> >
> >
> > Congratulations, Sergey~
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > > Lincoln Lee lincoln.8...@gmail.com wrote on Thu, 3 Apr 2025 at 18:14:
> >
> > > Congratulations, Sergey!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > > > Yuepeng Pan panyuep...@apache.org wrote on Thu, 3 Apr 2025 at 17:48:
> > >
> > > > Congratulations!
> > > >
> > > > Best regards,
> > > > Yuepeng Pan
> > > >
> > > > At 2025-04-03 17:42:21, "Shengkai Fang" fskm...@gmail.com wrote:
> > > >
> > > > > Congratulations, Sergey!
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > > > Xuyang xyzhong...@163.com wrote on Thu, 3 Apr 2025 at 16:19:
> > > > >
> > > > > > Cheers, Sergey!
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Best!
> > > > > > Xuyang
> > > > > >
> > > > > > On 2025-04-03 15:58:48, "Zakelly Lan" zakelly@gmail.com wrote:
> > > > > >
> > > > > > > Congratulations, Sergey!
> > > > > > >
> > > > > > > Best,
> > > > > > > Zakelly
> > > > > > >
> > > > > > > On Thu, Apr 3, 2025 at 3:10 PM Dawid Wysakowicz <
> > > > > > > wysakowicz.da...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > On behalf of the PMC, I'm very happy to announce a new
> Apache Flink
> > > > > > > > PMC
> > > > > > > > Member
> > > > > > > > - Sergey Nuyanzin.
> > > > > > > >
> > > > > > > > Sergey has been a dedicated and invaluable contributor to
> the Flink
> > > > > > > > community for several years. He became a committer in 2023
> and has
> > > > > > > > been
> > > > > > > > one
> > > > > > > > of the most active contributors, particularly in the areas
> of Flink
> > > > > > > > SQL.
> > > > > > > >
> > > > > > > > Sergey has developed several key features and improvements
> in Flink
> > > > > > > > SQL,
> > > > > > > > including upgrades of Apache Calcite as well as other fixes
> for the
> > > > > > > > interaction between Calcite and Flink. His work has
> significantly
> > > > > > > > improved
> > > > > > > > the usability and performance of Flink’s SQL engine. In
> addition to
> > > > > > > > code
> > > > > > > > contributions, he has also driven discussions on design
> > > > > > > > improvements
> > > > > > > > and
> > > > > > > > actively reviewed many pull requests.
> > > > > > > >
> > > > > > > > Sergey helps tremendously with keeping the project healthy.
> He
> > > > > > > > often
> > > > > > > > helps
> > > > > > > > with releases of connectors, flink-shaded and verifying
> Flink’s
> > > > > > > > releases. Beyond development, Sergey is committed to
> fostering the
> > > > > > > > Flink
> > > > > > > > community. He regularly shares his opinions on the future and
> > > > > > > > direction
> > > > > > > > in
> > > > > > > > which the project should go.
> > > > > > > >
> > > > > > > > Please join me in welcoming and congratulating Sergey!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Dawid Wysakowicz (on behalf of the Flink PMC)
>



[jira] [Created] (FLINK-37622) Exactly once Kafka sink does not produce any records in batch mode

2025-04-07 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-37622:
---

 Summary: Exactly once Kafka sink does not produce any records in 
batch mode
 Key: FLINK-37622
 URL: https://issues.apache.org/jira/browse/FLINK-37622
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-3.4.0
Reporter: Arvid Heise
Assignee: Arvid Heise
 Fix For: kafka-4.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Kafka connector 4.0.0 release

2025-04-07 Thread Leonard Xu
+1, thanks @Arvid for driving this release!

Best,
Leonard

> On 7 Apr 2025, at 18:02, Tom Cooper wrote:
> 
> +1 from me. Thanks for driving this. 
> 
> Is there anything I can do to help?
> 
> Tom Cooper
> @tomcooper.dev | https://tomcooper.dev
> 
> 
> On Monday, 7 April 2025 at 08:54, Arvid Heise  wrote:
> 
>> Hi folks,
>> 
>> I would like to volunteer to drive the Kafka connector 4.0.0 release to
>> make the connector finally available to Flink 2.0. If there are no
>> objections, I'd start tomorrow with the branch cut.
>> 
>> LMK if there are any issues that should be merged until then.
>> 
>> Best,
>> 
>> Arvid



[DISCUSS] FLIP-519: Introduce async lookup key ordered mode

2025-04-07 Thread shuai xu
Hi devs,

I'd like to start a discussion on FLIP-519: Introduce async lookup key
ordered mode[1].

The Flink system currently supports both record-level ordered and
unordered output modes for asynchronous lookup joins. However, it does
not guarantee the processing order of records sharing the same key.

As highlighted in [2], there are two key requirements for enhancing
async I/O operations:
1. Ensuring the processing order of records with the same key is a
common requirement in DataStream.
2. Sequential processing of records sharing the same upsertKey when
performing lookup join in Flink SQL is essential for maintaining
correctness.

This optimization aims to balance correctness and performance for
stateful streaming workloads. The FLIP therefore introduces a new
operator, KeyedAsyncWaitOperator, to support this optimization. In
addition, a new option is added to control the behaviour so that
existing jobs are not affected.
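
For illustration, the key-ordered behavior described above can be sketched in plain Java. This is a hand-rolled model, not Flink's actual KeyedAsyncWaitOperator: records sharing a key are queued, and only the head of each key's queue is in flight at any time, so same-key records complete in arrival order while records with different keys may be processed concurrently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch of key-ordered async processing (not Flink code).
public class KeyOrderedAsyncSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    // Per-key FIFO of pending records; only the head of each queue is in flight.
    private final Map<String, ArrayDeque<String>> pending = new HashMap<>();
    private final List<String> output = Collections.synchronizedList(new ArrayList<>());

    public synchronized void process(String key, String record) {
        ArrayDeque<String> q = pending.computeIfAbsent(key, k -> new ArrayDeque<>());
        q.add(record);
        if (q.size() == 1) {
            // Nothing in flight for this key yet, so start it immediately.
            fire(key, record);
        }
    }

    private void fire(String key, String record) {
        CompletableFuture
                .supplyAsync(() -> record + "-looked-up", pool) // simulated async lookup
                .thenAccept(result -> onComplete(key, result));
    }

    private synchronized void onComplete(String key, String result) {
        output.add(result);
        ArrayDeque<String> q = pending.get(key);
        q.poll(); // remove the completed head
        String next = q.peek();
        if (next != null) {
            fire(key, next); // start the next record for this key, preserving order
        }
    }

    public List<String> waitForResults(int expected) throws InterruptedException {
        while (output.size() < expected) {
            Thread.sleep(5);
        }
        pool.shutdown();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        KeyOrderedAsyncSketch s = new KeyOrderedAsyncSketch();
        s.process("k1", "r1");
        s.process("k1", "r2"); // must complete after r1
        s.process("k2", "r3"); // independent key, may interleave freely
        System.out.println(s.waitForResults(3));
    }
}
```

The interleaving of distinct keys is nondeterministic, but per-key order is guaranteed by construction, which is the property the FLIP targets.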

Please find more details in the FLIP wiki document [1]. Looking forward
to your feedback.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-519%3A++Introduce+async+lookup+key+ordered+mode
[2] https://lists.apache.org/thread/wczzjhw8g0jcbs8lw2jhtrkw858cmx5n

Best,
Xu Shuai


Re: [DISCUSS] FLIP-Draft - Amazon CloudWatch Metric Sink Connector

2025-04-07 Thread Ahmed Hamdy
Hi Daren, thanks for the FLIP.

Just a couple of questions and comments:

> Usable in both DataStream and Table API/SQL
What about the Python API? This is something we should consider ahead of
time, since the abstract element converter doesn't have a Flink type mapping
that can be used from Python; this is an issue we faced with DDB before.

> Therefore, the connector will provide a CloudWatchMetricInput model that
user can use to pass as input to the connector. For example, in DataStream
API, it could be a MapFunction called just before passing to the sink as
follows:
I am not quite sure I follow: are you suggesting we introduce a
specific new converter class, or leave that to users? Also, since you
mentioned FLIP-171, are you suggesting to implement this sink as an
extension of the Async Sink? In that case it is not clear to me how we are
going to use the map function with the Async Sink's ElementConverter.

> public class SampleToCloudWatchMetricInputMapper implements
> MapFunction<Sample, CloudWatchMetricInput>

Is CloudWatchMetricInput a newly introduced model class? I couldn't find it
in the SDK v2. If we are introducing it, it might be useful to add it to
the FLIP, since it is part of the API.
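
For illustration only, a converter along the lines quoted above might look like the sketch below. Neither CloudWatchMetricInput nor Sample here is a real class from the AWS SDK v2 or the current connector; the names follow the FLIP draft, and Flink's MapFunction interface is replaced by a plain static method so the sketch is self-contained and runnable.

```java
import java.util.Map;

// Hypothetical sketch of the proposed converter step (not connector code).
public class MetricMapperSketch {
    // Simplified stand-in for the connector-side model class proposed in the FLIP.
    static final class CloudWatchMetricInput {
        final String metricName;
        final Map<String, String> dimensions;
        final double value;

        CloudWatchMetricInput(String metricName, Map<String, String> dimensions, double value) {
            this.metricName = metricName;
            this.dimensions = dimensions;
            this.value = value;
        }
    }

    // Example user record feeding the sink.
    static final class Sample {
        final String host;
        final double cpuPercent;

        Sample(String host, double cpuPercent) {
            this.host = host;
            this.cpuPercent = cpuPercent;
        }
    }

    // What a MapFunction<Sample, CloudWatchMetricInput> placed before the sink
    // might do: pick the metric name and map record fields to dimensions/value.
    static CloudWatchMetricInput map(Sample sample) {
        return new CloudWatchMetricInput(
                "CpuUtilization",
                Map.of("Host", sample.host),
                sample.cpuPercent);
    }

    public static void main(String[] args) {
        CloudWatchMetricInput m = map(new Sample("host-1", 42.0));
        System.out.println(m.metricName + " " + m.dimensions + " " + m.value);
        // → CpuUtilization {Host=host-1} 42.0
    }
}
```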


> Supports both Bounded (Batch) and Unbounded (Streaming)

How do you propose to handle these differently? I can't find anything
specific in the FLIP.

Regarding table API

> 'metric.dimension.keys' = 'cw_dim',

I am not in favor of doing this, as it will complicate the schema
validation on table creation. Maybe we can use the whole schema as
dimensions, excluding the values and the count; let me know your thoughts
here.

> 'metric.name.key' = 'cw_metric_name',

So we are making the metric name part of the row data? Have we considered
not doing that, and instead having one table map to one metric rather than
one namespace? It might be more suitable to enforce some validations on the
dimensions schema this way. Of course, this will probably require
introducing some intermediate class in the model to hold the dimensions,
values and counts without the metric name and namespace, which we would
extract from the sink definition; let me know your thoughts here.


>`cw_value` BIGINT,
Are we going to allow all numeric types for values?

> protected void submitRequestEntries(
>   List<RequestEntryT> requestEntries,
>   Consumer<List<RequestEntryT>> requestResult)

nit: this method should be deprecated after 1.20. I hope the repo is
upgraded by the time we implement this.

> Error Handling
Aside from poison pills, what error handling are you suggesting? Are we
following in the footsteps of the other AWS connectors with error
classification? Is there any effort to abstract it on the AWS side?

And on the topic of poison pills: if I understand correctly, that is a topic
that has been discussed for a while. This of course breaks the at-least-once
semantics and might be confusing to users. Additionally, since the
CloudWatch API fails the full batch, how are you suggesting we identify the
poison pills? I am personally in favor of global failures in this case, but
would love to hear the feedback here.



Best Regards
Ahmed Hamdy


On Mon, 7 Apr 2025 at 11:29, Wong, Daren 
wrote:

> Hi Dev,
>
> I would like to start a discussion about FLIP: Amazon CloudWatch Metric
> Sink Connector
> https://docs.google.com/document/d/1G2sQogV8S6M51qeAaTmvpClOSvklejjEXbRFFCv_T-c/edit?usp=sharing
>
> This FLIP is proposing to add support for Amazon CloudWatch Metric sink in
> flink-connector-aws repo. Looking forward to your feedback, thank you
>
> Regards,
> Daren
>


Re: [DISCUSS] FLIP-511: Support transaction id pooling in Kafka connector

2025-04-07 Thread Hongshun Wang
Hi Heise,
I agree with you. Let's push forward. I will review
https://github.com/apache/flink-connector-kafka/pull/154 and discuss it
there. It may take me some time to reply, as there is quite a lot of code in
this PR.

Best
Hongshun


On Mon, Apr 7, 2025 at 3:12 PM Arvid Heise  wrote:

> Hi Hongshun,
>
> 1. In case of an unsupported API, the kafka-client library throws an error,
> which will cause a failover in Flink. We could make the error more
> user-friendly, but I think failing over is correct, instead of silently
> switching to an undesired strategy. David already asked if we could check that during
> validation of the plan, but imho it's too situational as you may not have
> access to the Kafka cluster when you build the plan. I'm worried about
> false positives here but I'm open to suggestions.
>
> 2. This FLIP is about the cleanup period and is not touching the write path
> at all.
> Transactional commits happen on notifyCheckpointComplete where they block
> the task thread in case where committer and writer are chained (common
> case). Have you seen an impact on performance?
> I'm also not exactly sure where your concurrency is coming from. Could you
> elaborate? I'm assuming you see more producers open because
> notifyCheckpointComplete may occur after having multiple snapshotState. But
> I don't see how to improve the situation except forgoing exactly-once and
> setting it to at-least-once.
> Checkpointing interval is an orthogonal configuration. I don't see how
> anything on the connector side can change that. That sounds like a radical
> new idea and deserves its own FLIP. If I get you correctly, you want to
> decline/delay new checkpoints until old ones are committed such that you
> have no more than X ongoing transactions. That sounds very useful but it's
> about automatically configuring the lowest possible checkpointing interval
> that satisfies certain conditions. This is way outside of the scope of
> connectors because it requires deep changes in the checkpoint coordinator
> (only).
>
> 3. Yes, we will keep the list of returned transactions as small as
> possible. The current call chain is
> - The Flink sink may be configured to write into a (set of) static topics
> or with a pattern target topic (target patterns are supported in OSS Flink
> only). In the latter case, run ListTopics to collect the list of topics.
> - Run DescribeProducer with the list of topics to get a list of producer
> ids.
> - Run ListTransaction API with the list of producer ids.
> - Remove all transactions that are closed (ideally filter those with
> request options already).
> - Remove all transactions that are not owned by the subtask s by extracting
> the original subtask id from the transactional id s’ and applying the
> predicate s == s’ % p where p is the current parallelism.
> - Remove all transactions that are part of the writer state (finalized
> transactions).
> - Abort all of these transactions by calling initTransaction.
> We are currently working with Kafka devs to replace the first three calls
> with a single call to ListTransaction with a new parameter that filters on
> prefix. But that is available earliest in Kafka 4.1, so we need the current
> approach anyhow.
>
> Note that the FLIP is already approved, so I can't change anything on it
> anymore. Both 1. and 3. can be discussed on the pull request however [1].
> 2. is clearly out of scope.
>
> Best,
>
> Arvid
>
> [1] https://github.com/apache/flink-connector-kafka/pull/154
>
>
> On Mon, Apr 7, 2025 at 4:05 AM Hongshun Wang 
> wrote:
>
> > Hi Heise,
> >
> > This sounds like a good idea. Just some tiny questions left:
> >
> > 1. Behavior when ListTransactions is unsupported:
> >    If the server does not support the ListTransactions API, what should
> >    the system’s behavior be? Should it throw an exception to signal
> >    incompatibility, or fall back to the legacy Kafka connector behavior
> >    (e.g., checkpointing without transactional checks)? Clear guidance
> >    here would help users and implementers understand pitfall recovery.
> > 2. Impact of frequent checkpoints on write performance:
> >    When the checkpoint interval is small (e.g., configured to occur very
> >    frequently), could this lead to write performance bottlenecks if
> >    multiple producers (e.g., 3 producers) block the write pipeline?
> >    Could this FLIP include a tunable configuration parameter to let
> >    users adjust the checkpoint interval or producers’ concurrency limits
> >    if needed?
> > 3. Optimizing transaction lookup with producerIdFilters:
> >    The proposal currently states: *"The writer uses ListTransactions to
> >    receive a list of all open transactions."* Would it be feasible to
> >    instead filter transactions using producerIdFilters (if supported)
> >    to retrieve only those transactions relevant to this writer? Doing so
> >    could reduce unnecessary overhead, especially in en
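
For illustration, the ownership predicate s == s' % p from the call chain quoted above can be sketched as follows. The assumption that a transactional id ends with the originating subtask id after the last '-' is hypothetical, made only so the sketch is self-contained; the connector's real id format may differ.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of the ownership filter (not the connector's actual code).
public class TxnOwnershipSketch {
    // Hypothetical id layout: the original subtask id is the last "-"-separated part.
    static int originalSubtask(String transactionalId) {
        String[] parts = transactionalId.split("-");
        return Integer.parseInt(parts[parts.length - 1]);
    }

    // Keep only transactions owned by this subtask under the current parallelism:
    // subtask s owns a transaction from original subtask s' iff s == s' % p.
    static List<String> ownedBy(List<String> openTransactions, int subtask, int parallelism) {
        return openTransactions.stream()
                .filter(id -> subtask == originalSubtask(id) % parallelism)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> open = List.of("tid-0", "tid-1", "tid-2", "tid-3", "tid-5");
        // Subtask 1 of 2 owns all transactions whose original subtask id is odd.
        System.out.println(ownedBy(open, 1, 2)); // → [tid-1, tid-3, tid-5]
    }
}
```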

[jira] [Created] (FLINK-37629) Use Checkpointed Offset while migrating clusters in DynamicKafkaSource

2025-04-07 Thread Chirag Dewan (Jira)
Chirag Dewan created FLINK-37629:


 Summary: Use Checkpointed Offset while migrating clusters in 
DynamicKafkaSource
 Key: FLINK-37629
 URL: https://issues.apache.org/jira/browse/FLINK-37629
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.20.1
Reporter: Chirag Dewan


In my use case, I have a 2-cluster Kafka deployment. One is the primary and
the other is replicated to using MM2. Producers can't directly write to the
replicated cluster; it's just used for consuming records.

I want my KafkaSource to fail over to the replicated cluster when the primary
cluster fails, and I want the KafkaSource to resume reading records from where
it left off on the primary. If the checkpointed offset is not yet replicated
on that cluster, KafkaSource can use the latest offset (meaning it will sit
idle, since no new data is produced on this cluster).

Falling back can also rely on the checkpointed offset, because I am sure the
replicated cluster will always trail the primary cluster.

I thought of using DynamicKafkaSource for this purpose. However, currently, 
DynamicKafkaSource relies on the consumer group offset in Kafka (startingOffset 
= -3) to start reading data from the secondary cluster after failover. 

I understand it would be problematic to use checkpointed offset while falling 
back to the primary cluster, generally. But it works well in my use case.

So the ask is - Can we make DynamicKafkaSource use the checkpointed offset? 
Maybe even as a configurable option? 





Re: Kafka connector 4.0.0 release

2025-04-07 Thread Tom Cooper
+1 from me. Thanks for driving this. 

Is there anything I can do to help?

Tom Cooper
@tomcooper.dev | https://tomcooper.dev


On Monday, 7 April 2025 at 08:54, Arvid Heise  wrote:

> Hi folks,
> 
> I would like to volunteer to drive the Kafka connector 4.0.0 release to
> make the connector finally available to Flink 2.0. If there are no
> objections, I'd start tomorrow with the branch cut.
> 
> LMK if there are any issues that should be merged until then.
> 
> Best,
> 
> Arvid


[VOTE] FLIP-514: Custom Evaluator plugin for Flink Autoscaler

2025-04-07 Thread Pradeepta Choudhury
Hi everyone,

I'd like to start a vote on the FLIP-514: Custom Evaluator plugin for Flink 
Autoscaler [1]. The discussion thread is here [2].

The vote will be open for at least 72 hours unless there is an objection or 
insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-514%3A+Custom+Evaluator+plugin+for+Flink+Autoscaler
 

[2] https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj 



Thanks
Pradeepta

[jira] [Created] (FLINK-37623) Async state support for `process()` in Datastream API

2025-04-07 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-37623:
---

 Summary: Async state support for `process()`  in Datastream API
 Key: FLINK-37623
 URL: https://issues.apache.org/jira/browse/FLINK-37623
 Project: Flink
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Zakelly Lan
Assignee: Zakelly Lan
 Fix For: 2.1.0, 2.0.1








[jira] [Created] (FLINK-37624) Support enableAsyncState and switch operator after datastream transformation

2025-04-07 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-37624:
---

 Summary: Support enableAsyncState and switch operator after 
datastream transformation
 Key: FLINK-37624
 URL: https://issues.apache.org/jira/browse/FLINK-37624
 Project: Flink
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Zakelly Lan
 Fix For: 2.1.0








Re: [VOTE] FLIP-514: Custom Evaluator plugin for Flink Autoscaler

2025-04-07 Thread Gyula Fóra
+1 (binding)

Gyula

On Mon, Apr 7, 2025 at 9:23 AM Pradeepta Choudhury
 wrote:

> Hi everyone,
>
> I'd like to start a vote on the FLIP-514: Custom Evaluator plugin for
> Flink Autoscaler [1]. The discussion thread is here [2].
>
> The vote will be open for at least 72 hours unless there is an objection
> or insufficient votes.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-514%3A+Custom+Evaluator+plugin+for+Flink+Autoscaler
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-514%3A+Custom+Evaluator+plugin+for+Flink+Autoscaler
> >
> [2] https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj <
> https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj>
>
>
> Thanks
> Pradeepta


Re: [VOTE] FLIP-514: Custom Evaluator plugin for Flink Autoscaler

2025-04-07 Thread Rui Fan
+1(binding)

Best,
Rui

On Mon, Apr 7, 2025 at 3:25 PM Gyula Fóra  wrote:

> +1 (binding)
>
> Gyula
>
> On Mon, Apr 7, 2025 at 9:23 AM Pradeepta Choudhury
>  wrote:
>
> > Hi everyone,
> >
> > I'd like to start a vote on the FLIP-514: Custom Evaluator plugin for
> > Flink Autoscaler [1]. The discussion thread is here [2].
> >
> > The vote will be open for at least 72 hours unless there is an objection
> > or insufficient votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-514%3A+Custom+Evaluator+plugin+for+Flink+Autoscaler
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-514%3A+Custom+Evaluator+plugin+for+Flink+Autoscaler
> > >
> > [2] https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj <
> > https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj>
> >
> >
> > Thanks
> > Pradeepta
>


Re: [VOTE] Release flink-connector-elasticsearch v4.0.0, release candidate #1

2025-04-07 Thread Zakelly Lan
+1 (binding)

I have verified:

 - Checksum and signature
 - There are no binaries in the source archive
 - Release tag and staging jars
 - Built from source
 - Release notes and web PR

Not a blocker: the NOTICE should be updated to year 2025.

On Thu, Apr 3, 2025 at 2:12 PM weijie guo  wrote:

> Hi everyone,
>
>
> Please review and vote on the release candidate #1 for v4.0.0, as follows:
>
> [ ] +1, Approve the release
>
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> This release is mainly for flink 2.0.
>
>
> The complete staging area is available for your review, which includes:
>
> * JIRA release notes [1],
>
> * the official Apache source release to be deployed to dist.apache.org
> [2],
>
> which are signed with the key with fingerprint
> 8D56AE6E7082699A4870750EA4E8C4C05EE6861F [3],
>
> * all artifacts to be deployed to the Maven Central Repository [4],
>
> * source code tag v4.0.0-rc1 [5],
>
> * website pull request listing the new release [6].
>
> * CI build of tag [7].
>
>
> The vote will be open for at least 72 hours. It is adopted by majority
>
> approval, with at least 3 PMC affirmative votes.
>
>
> Thanks,
>
> Weijie
>
> [1]
>
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12355810
>
> [2]
>
>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-elasticsearch-4.0.0-rc1/
>
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1796/
>
> [5]
>
>
> https://github.com/apache/flink-connector-elasticsearch/releases/tag/v4.0.0-rc1
>
> [6] https://github.com/apache/flink-web/pull/786
>
> [7]
>
>
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/14215074324
>


[jira] [Created] (FLINK-37626) Flaky test: ForStFlinkFileSystemTest.testSstFileInCache

2025-04-07 Thread Gabor Somogyi (Jira)
Gabor Somogyi created FLINK-37626:
-

 Summary: Flaky test: ForStFlinkFileSystemTest.testSstFileInCache
 Key: FLINK-37626
 URL: https://issues.apache.org/jira/browse/FLINK-37626
 Project: Flink
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Gabor Somogyi


https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=67041





Re: [VOTE] Release flink-connector-elasticsearch v3.1.0, release candidate #1

2025-04-07 Thread Zakelly Lan
+1 (binding)

I have verified:

 - Checksum and signature
 - There are no binaries in the source archive
 - Release tag and staging jars
 - Built from source
 - Release notes and web PR


Best,
Zakelly

On Thu, Apr 3, 2025 at 12:18 PM weijie guo 
wrote:

> Hi everyone,
>
>
> Please review and vote on the release candidate #1 for v3.1.0, as follows:
>
> [ ] +1, Approve the release
>
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> This release supports Flink 1.18, 1.19 and 1.20.
>
>
> The complete staging area is available for your review, which includes:
>
> * JIRA release notes [1],
>
> * the official Apache source release to be deployed to dist.apache.org
> [2],
>
> which are signed with the key with fingerprint
> 8D56AE6E7082699A4870750EA4E8C4C05EE6861F [3],
>
> * all artifacts to be deployed to the Maven Central Repository [4],
>
> * source code tag v3.1.0-rc1 [5],
>
> * website pull request listing the new release [6].
>
> * CI build of tag [7].
>
>
> The vote will be open for at least 72 hours. It is adopted by majority
>
> approval, with at least 3 PMC affirmative votes.
>
>
> Thanks,
>
> Weijie
>
>
> [1]
>
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352520
>
> [2]
>
>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-elasticsearch-3.1.0-rc1/
>
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1795/
>
> [5]
>
>
> https://github.com/apache/flink-connector-elasticsearch/releases/tag/v3.1.0-rc1
>
> [6] https://github.com/apache/flink-web/pull/785
>
> [7]
>
>
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/14210840013
>


[DISCUSS] FLIP-Draft - Amazon CloudWatch Metric Sink Connector

2025-04-07 Thread Wong, Daren
Hi Dev,

I would like to start a discussion about FLIP: Amazon CloudWatch Metric Sink 
Connector 
https://docs.google.com/document/d/1G2sQogV8S6M51qeAaTmvpClOSvklejjEXbRFFCv_T-c/edit?usp=sharing

This FLIP is proposing to add support for Amazon CloudWatch Metric sink in 
flink-connector-aws repo. Looking forward to your feedback, thank you

Regards,
Daren


Re: [VOTE] Release flink-connector-kudu v2.0.0, release candidate #1

2025-04-07 Thread Zakelly Lan
+1 (binding)

I have verified:
 - Signatures and checksum
 - There are no binaries in the source archive
 - Release tag and staging jars
 - Built from source
 - Web PR and release notes


Best,
Zakelly




On Sat, Apr 5, 2025 at 11:22 PM Gyula Fóra  wrote:

> +1 (binding)
>
> Verified:
>  - Checksums, signatures
>  - Checked notice files + no binaries in source release
>  - Built from source
>  - Verified release tag, release notes and website PR
>
> Cheers
> Gyula
>
> On Fri, Mar 28, 2025 at 2:34 PM Ferenc Csaky 
> wrote:
>
> > Hi everyone,
> >
> > Please review and vote on release candidate #1 for
> > flink-connector-kudu v2.0.0, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which
> > includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to
> >   dist.apache.org [2], which are signed with the key with
> >   fingerprint 16AE0DDBBB2F380B [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v2.0.0-rc1 [5],
> > * website pull request listing the new release [6],
> > * CI build of the tag [7].
> >
> > The vote will be open for at least 72 hours. It is adopted by
> > majority approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Ferenc
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354673
> > [2]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kudu-2.0.0-rc1
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1793/
> > [5]
> https://github.com/apache/flink-connector-kudu/releases/tag/v2.0.0-rc1
> > [6] https://github.com/apache/flink-web/pull/784
> > [7]
> > https://github.com/apache/flink-connector-kudu/actions/runs/14128565162
> >
>


[jira] [Created] (FLINK-37627) Restarting from a checkpoint/savepoint which coincides with shard split causes data loss

2025-04-07 Thread Keith Lee (Jira)
Keith Lee created FLINK-37627:
-

 Summary: Restarting from a checkpoint/savepoint which coincides 
with shard split causes data loss
 Key: FLINK-37627
 URL: https://issues.apache.org/jira/browse/FLINK-37627
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kinesis
Affects Versions: aws-connector-5.0.0
Reporter: Keith Lee


Similar to DDB stream connector's issue 
https://issues.apache.org/jira/browse/FLINK-37416

This is less likely to happen on the Kinesis connector due to the much lower
frequency of re-sharding / assigning new splits, but it is technically
possible, so we'd like to fix this to avoid data loss.

The scenario is as follows:

- A checkpoint started
- KinesisStreamsSourceEnumerator takes a checkpoint (shard was assigned here)
- KinesisStreamsSourceEnumerator sends checkpoint event to reader
- Before taking reader checkpoint, a SplitFinishedEvent came up in reader
- Reader takes checkpoint
- Now, just after checkpoint complete, job restarted

This can lead to a shard lineage getting lost, because the shard is in
ASSIGNED state in the enumerator while not being part of any task manager
state.

See the DDB connector issue's PR for a reference fix:
https://issues.apache.org/jira/browse/FLINK-37416





Re: Re: [ANNOUNCE] New Apache Flink PMC Member - Sergey Nuyanzin

2025-04-07 Thread Gustavo de Morais
Congrats, Sergey!

Kind regards,
Gustavo

On Thu, 3 Apr 2025 at 11:42, Shengkai Fang wrote:

> Congratulations, Sergey!
>
> Best,
> Shengkai
>
> Xuyang wrote on Thu, 3 Apr 2025 at 16:19:
>
> > Cheers, Sergey!
> >
> >
> > --
> >
> > Best!
> > Xuyang
> >
> >
> >
> >
> >
> > On 2025-04-03 15:58:48, "Zakelly Lan" wrote:
> > >Congratulations, Sergey!
> > >
> > >
> > >Best,
> > >Zakelly
> > >
> > >On Thu, Apr 3, 2025 at 3:10 PM Dawid Wysakowicz <
> > wysakowicz.da...@gmail.com>
> > >wrote:
> > >
> > >> On behalf of the PMC, I'm very happy to announce a new Apache Flink
> PMC
> > >> Member
> > >> - Sergey Nuyanzin.
> > >>
> > >> Sergey has been a dedicated and invaluable contributor to the Flink
> > >> community for several years. He became a committer in 2023 and has
> been
> > one
> > >> of the most active contributors, particularly in the areas of Flink
> SQL.
> > >>
> > >> Sergey has developed several key features and improvements in Flink
> SQL,
> > >> including upgrades of Apache Calcite as well as other fixes for the
> > >> interaction between Calcite and Flink. His work has significantly
> > improved
> > >> the usability and performance of Flink’s SQL engine. In addition to
> code
> > >> contributions, he has also driven discussions on design improvements
> and
> > >> actively reviewed many pull requests.
> > >>
> > >> Sergey helps tremendously with keeping the project healthy. He often
> > helps
> > >> with releases of connectors, flink-shaded and verifying Flink’s
> > >> releases. Beyond development, Sergey is committed to fostering the
> Flink
> > >> community. He regularly shares his opinions on the future and
> direction
> > in
> > >> which the project should go.
> > >>
> > >> Please join me in welcoming and congratulating Sergey!
> > >>
> > >> Best,
> > >> Dawid Wysakowicz (on behalf of the Flink PMC)
> > >>
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Sergey Nuyanzin

2025-04-07 Thread Hong Liang
Congratulations Sergey! Well deserved.


Hong

On Mon, Apr 7, 2025 at 8:03 AM David Radley  wrote:

> Congratulations - well deserved, Sergey. I really appreciate your deep
> knowledge, prolific contributions, support and community-minded attitude.
>
> Kind regards David.
>
> From: Arvid Heise 
> Date: Monday, 7 April 2025 at 07:54
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink PMC Member - Sergey
> Nuyanzin
> Congratz!
>
> On Mon, Apr 7, 2025 at 6:44 AM Ferenc Csaky 
> wrote:
>
> > Congrats, Sergey!
> >
> > Best,
> > Ferenc
> >
> >
> > On Monday, April 7th, 2025 at 03:30, weijie guo <
> guoweijieres...@gmail.com>
> > wrote:
> >
> > >
> > >
> > > Congratulations, Sergey~
> > >
> > > Best regards,
> > >
> > > Weijie
> > >
> > >
> > > Lincoln Lee lincoln.8...@gmail.com wrote on Thu, 3 Apr 2025 at 18:14:
> > >
> > > > Congratulations, Sergey!
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > > Yuepeng Pan panyuep...@apache.org wrote on Thu, 3 Apr 2025 at 17:48:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Best regards,
> > > > > Yuepeng Pan
> > > > >
> > > > > At 2025-04-03 17:42:21, "Shengkai Fang" fskm...@gmail.com wrote:
> > > > >
> > > > > > Congratulations, Sergey!
> > > > > >
> > > > > > Best,
> > > > > > Shengkai
> > > > > >
> > > > > > Xuyang xyzhong...@163.com wrote on Thu, 3 Apr 2025 at 16:19:
> > > > > >
> > > > > > > Cheers, Sergey!
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Best!
> > > > > > > Xuyang
> > > > > > >
> > > > > > > 在 2025-04-03 15:58:48,"Zakelly Lan" zakelly@gmail.com 写道:
> > > > > > >
> > > > > > > > Congratulations, Sergey!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Zakelly
> > > > > > > >
> > > > > > > > On Thu, Apr 3, 2025 at 3:10 PM Dawid Wysakowicz <
> > > > > > > > wysakowicz.da...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > On behalf of the PMC, I'm very happy to announce a new
> > Apache Flink
> > > > > > > > > PMC
> > > > > > > > > Member
> > > > > > > > > - Sergey Nuyanzin.
> > > > > > > > >
> > > > > > > > > Sergey has been a dedicated and invaluable contributor to
> > the Flink
> > > > > > > > > community for several years. He became a committer in 2023
> > and has
> > > > > > > > > been
> > > > > > > > > one
> > > > > > > > > of the most active contributors, particularly in the areas
> > of Flink
> > > > > > > > > SQL.
> > > > > > > > >
> > > > > > > > > Sergey has developed several key features and improvements
> > in Flink
> > > > > > > > > SQL,
> > > > > > > > > including upgrades of Apache Calcite as well as other fixes
> > for the
> > > > > > > > > interaction between Calcite and Flink. His work has
> > significantly
> > > > > > > > > improved
> > > > > > > > > the usability and performance of Flink’s SQL engine. In
> > addition to
> > > > > > > > > code
> > > > > > > > > contributions, he has also driven discussions on design
> > > > > > > > > improvements
> > > > > > > > > and
> > > > > > > > > actively reviewed many pull requests.
> > > > > > > > >
> > > > > > > > > Sergey helps tremendously with keeping the project healthy.
> > He
> > > > > > > > > often
> > > > > > > > > helps
> > > > > > > > > with releases of connectors, flink-shaded and verifying
> > Flink’s
> > > > > > > > > releases.Beyond development, Sergey is committed to
> > fostering the
> > > > > > > > > Flink
> > > > > > > > > community. He regularly shares his opinions on the future
> and
> > > > > > > > > direction
> > > > > > > > > in
> > > > > > > > > which the project should go.
> > > > > > > > >
> > > > > > > > > Please join me in welcoming and congratulating Sergey!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Dawid Wysakowicz (on behalf of the Flink PMC)
> >
>


Re: Flink Kafka Connector: Release and Kafka client versions

2025-04-07 Thread Tom Cooper
Hi Arvid,

Thanks for the information. I think your plan makes sense given that we want to 
get a Flink 2.0 compatible connector out ASAP.

I can keep updating my Kafka 4.0.0 PR [1] until we are ready to do the release 
containing it. In my opinion, version numbers are not a constrained resource, and 
given that moving to the 4.x Kafka clients will drop support for Kafka 2.0 and 
older, a 5.0 release of the connector might make sense. It would also have the 
advantage of moving the connector's version number out of the range of the 
Kafka versions, which would reduce confusion about whether the connector version 
equates to the supported Kafka version.

However, that conversation can wait until later. I am happy to push ahead with 
a 4.0 connector release with Flink 2.0.0 and Kafka 3.9.0.

Tom Cooper
@tomcooper.dev | https://tomcooper.dev

[1] https://github.com/apache/flink-connector-kafka/pull/161

On Tuesday, 1 April 2025 at 21:23, Arvid Heise  wrote:

> Hi Tom,
> 
> thanks for driving this.
> 
> Updating the Kafka client library has been done only occasionally and
> there was no formal process. It makes sense to give stronger
> guidelines. I guess a rigid process would only make sense if we
> enforce it for example through bots and that was highly debated a
> couple of years ago in Apache projects (I would be very much in favor
> of getting bot support for dependencies).
> 
> Here is my proposal:
> * Kafka-connector-4.0.0 uses kafka-client-3.9.0 because it's the
> latest minor release in the 3.x line. I went over the things that they
> (plan to) fix in 3.9.1 and nothing major pops out. 3.9.0 gives folks
> the option to downgrade relatively effortlessly to 3.8.1 in DataStreams
> (it's just a transitive dependency). For SQL, folks are bound to 3.9.0
> unless they rebuild the module from scratch, which is also a moderate
> effort. No code change is required.
> * Kafka-connector-3.5.0 with a kafka-client-3.9.0 for Flink 1.X also
> makes a lot of sense if Kafka-connector-4.0.0 cannot run on Flink 1.X
> (to confirm). Else, folks can just use Kafka-connector-4.X.
> * Release Kafka-connector-4.0.1 once kafka-client-3.9.1 is out + any
> bug fix that is needed in our code.
> * Release Kafka-connector-4.1.0 once kafka-client-4.0.1 or 4.1.0 is
> out. In my experience, the very first version of a major release is
> usually for early adopters only. Most of the new features in that
> release are only available once you also update brokers to 4.0.0,
> which will probably take a longer time for most setups. Of course, the
> effort can be expedited if there is actual demand for
> kafka-client-4.0.0.
> 
> As Yanquan mentioned, the biggest need is currently on
> Kafka-connector-4.0.0. We need to find a release manager asap and get
> that release out next week (so probably start this week with cut and
> voting). We need to bundle our resources on that.
> 
> Once that is done, we should look at Kafka-connector-3.5.0 release.
> Then, the others.
> 
> Note that I don't see the need for another major release to bump
> kafka-client to 4.0.0. From the user's perspective, this change is
> mostly transparent. I think that release basically just cuts off some
> very old broker versions <3.0. I'm assuming most folks are on 3.5+
> atm. In contrast, for kafka-connector-4.0.0, Yanquan deleted 22k LOC
> to drop all the legacy consumer/producer.
> 
> LMK what you think about that proposal.
> 
> Arvid
> 
> On Tue, Apr 1, 2025 at 6:10 PM Yanquan Lv decq12y...@gmail.com wrote:
> 
> > Hi, Cooper. Thank you for bringing up the discussion about Kafka 4.0.0.
> > I tend to release Kafka Connector 4.0 with Flink 2.0.0 and Kafka 3.9.0, or 
> > Kafka 3.8.1 if the community is concerned about 3.9.0.
> > 
> > Here are some of my opinions:
> > 
> > 1. The plan for Kafka Connector 4.0.0 was proposed[1] a long time ago, 
> > mainly to support the Flink 2.0 API. Although it has been delayed for some 
> > time, users' demand for this version is becoming increasingly strong (as we 
> > can see from emails or Slack), so it is more important to release a version 
> > that supports Flink 2.0 as soon as possible.
> > 
> > 2. As for bumping to Kafka 4.0.0, I personally believe that this is a decision 
> > that requires careful analysis, because rolling back to Kafka client 
> > version 3.x will require additional work, and the impact on 
> > performance and API compatibility may not be reflected solely through IT 
> > cases (of course, I believe it will have a positive impact).
> > I don't recommend making two major changes in one version, so perhaps 
> > 4.1/5.0 is more suitable for doing this.
> > 
> > 3. As for Kafka 3.9.0, I don't have a big opinion as we have been using 
> > Kafka client 3.4.0 for almost two years since FLINK-31599[2]. Of course, I 
> > also approve of using Kafka 3.8.1.
> > But for using 3.9.1, the benefits are not very significant and it will 
> > block the activities of other developers who rely on Flink connector Kafka 
> > release 4.0. As I know,

[jira] [Created] (FLINK-37625) PyFlink Table API skips validation for Rows created with positional arguments

2025-04-07 Thread Mika Naylor (Jira)
Mika Naylor created FLINK-37625:
---

 Summary: PyFlink Table API skips validation for Rows created with 
positional arguments
 Key: FLINK-37625
 URL: https://issues.apache.org/jira/browse/FLINK-37625
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Mika Naylor
Assignee: Mika Naylor


When creating a table using {{TableEnvironment.from_elements}}, the Table 
API skips type validation on any Row elements that were created using 
positional arguments, rather than keyword arguments.

For example, take a table with a single column whose type is an array of 
{{Row}}s. These rows have 2 columns, {{a VARCHAR}} and {{b BOOLEAN}}. 
If we create a table with elements where one of these rows has columns in the 
wrong order:
{code:java}
schema = DataTypes.ROW(
    [
        DataTypes.FIELD(
            "col",
            DataTypes.ARRAY(
                DataTypes.ROW(
                    [
                        DataTypes.FIELD("a", DataTypes.STRING()),
                        DataTypes.FIELD("b", DataTypes.BOOLEAN()),
                    ]
                )
            ),
        ),
    ]
)
elements = [(
    [("pyflink", True), ("pyflink", False), (True, "pyflink")],
)]
table = self.t_env.from_elements(elements, schema)
table_result = list(table.execute().collect()){code}
This results in a type validation error:
{code:java}
TypeError: field a in element in array field col: VARCHAR can not accept object 
True in type {code}
In an example where we use {{Row}} instead of tuples, but with keyword arguments:
{code:java}
elements = [(
    [Row(a="pyflink", b=True), Row(a="pyflink", b=False), Row(a=True, b="pyflink")],
)]{code}
We also get the same type validation error. However, when we use {{Row}} with 
positional arguments:
{code:java}
elements = [(
    [Row("pyflink", True), Row("pyflink", False), Row(True, "pyflink")],
)]{code}
the type validation is skipped, leading to an unpickling error when collecting:
{code:java}
>           data = pickle.loads(data)
E           EOFError: Ran out of input {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)