[jira] [Created] (FLINK-37811) Flink Job stuck in suspend state after losing leadership in Zookeeper HA

2025-05-16 Thread Arun Lakshman (Jira)
Arun Lakshman created FLINK-37811:
-

 Summary: Flink Job stuck in suspend state after losing leadership 
in Zookeeper HA
 Key: FLINK-37811
 URL: https://issues.apache.org/jira/browse/FLINK-37811
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.20.0, 1.15.0
Reporter: Arun Lakshman
 Attachments: notRecovered.csv

We have observed inconsistent behavior where the JobManager encounters 
ZooKeeper session timeout exceptions, leading to leadership loss across 
multiple components, including the Resource Manager, Job Master, and 
Dispatcher. When this occurs, the system exhibits an unexpected sequence: 
while components are in the process of shutting down, the ZooKeeper connection 
gets RECONNECTED, but jobs still enter a SUSPENDED state. Notably, the 
JobManager process continues to run without performing a system exit. The 
initial trigger appears as a session timeout exception with the message 
"Client session timed out, have not heard from server in 26678ms".
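For context, a minimal ZooKeeper HA setup of the kind involved here looks roughly like the following flink-conf.yaml fragment (option keys as documented for Flink 1.x; hosts, paths, and values are illustrative only). The reported "26678ms" sits below the default 60000 ms session timeout, which is what makes the RECONNECTED-then-SUSPENDED sequence surprising:

```yaml
# Illustrative ZooKeeper HA configuration (values are placeholders)
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
# Client session timeout in ms; leadership is lost when the session expires
high-availability.zookeeper.client.session-timeout: 60000
```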



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-529 Connections in Flink SQL and Table API

2025-05-16 Thread Yash Anand
Hi Mayank,

Thanks for the initiative! I think this functionality would be a really
great addition to Flink.

+1 for the proposal.

Thanks,
Yash

On Tue, May 6, 2025 at 9:36 AM Mayank Juneja 
wrote:

> Hi Ferenc,
>
> Thanks for your question! We agree that a CONNECTION could logically be a
> catalog-level resource, especially since it’s intended to be reused across
> multiple tables. However, we think there’s value in defining it at the
> database level to introduce a layer of isolation and scoping.
>
> This is particularly useful in shared or multi-tenant environments where
> different databases within the same catalog might need to connect to
> different external systems or use distinct credentials.
>
> That said, we’re open to revisiting this if there's strong consensus around
> catalog-level scoping. Appreciate the feedback!
>
> Regards,
> Mayank
>
> On Fri, May 2, 2025 at 12:12 AM Ferenc Csaky 
> wrote:
>
> > Hi Mayank,
> >
> > Thank you for starting the discussion! In general, I think such
> > functionality
> > would be a really great addition to Flink.
> >
> > Could you please elaborate a bit more on the reason for defining a
> > `connection` resource at the database level instead of the catalog level?
> > If I think about `JdbcCatalog`, or `HiveCatalog`, the catalog is in
> 1-to-1
> > mapping with an RDBMS, or a HiveMetastore, so my initial thinking is
> that a
> > `connection` seems more like a catalog level resource.
> >
> > WDYT?
> >
> > Thanks,
> > Ferenc
> >
> >
> >
> > On Tuesday, April 29th, 2025 at 17:08, Mayank Juneja <
> > mayankjunej...@gmail.com> wrote:
> >
> > >
> > >
> > > Hi all,
> > >
> > > I would like to open up for discussion a new FLIP-529 [1].
> > >
> > > Motivation:
> > > Currently, Flink SQL handles external connectivity by defining
> endpoints
> > > and credentials in table configuration. This approach prevents
> > reusability
> > > of these connections and makes table definition less secure by exposing
> > > sensitive information.
> > > We propose the introduction of a new "connection" resource in Flink.
> This
> > > will be a pluggable resource configured with a remote endpoint and
> > > associated access key. Once defined, connections can be reused across
> > table
> > > definitions, and eventually for model definition (as discussed in
> > FLIP-437)
> > > for inference, enabling seamless and secure integration with external
> > > systems.
> > > The connection resource will provide a new, optional way to manage
> > external
> > > connectivity in Flink. Existing methods for table definitions will
> remain
> > > unchanged.
> > >
> > > [1] https://cwiki.apache.org/confluence/x/cYroF
> > >
> > > Best Regards,
> > > Mayank Juneja
> >
>
>
> --
> *Mayank Juneja*
> Product Manager | Data Streaming and AI
>
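Purely for illustration, the CONNECTION resource described above might be used roughly as follows. This is hypothetical syntax: the names, properties, and actual DDL are defined in FLIP-529 and may differ.

```sql
-- Hypothetical DDL: define a reusable connection once, at the database level.
CREATE CONNECTION order_service_conn
WITH (
  'endpoint' = 'https://api.example.com/v1',
  'api-key'  = 'credential-goes-here'  -- stored once, not repeated per table
);

-- Hypothetical reuse: the table references the connection by name instead of
-- embedding the endpoint and credential in its own WITH clause.
CREATE TABLE orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2)
) WITH (
  'connector'  = 'some-connector',
  'connection' = 'order_service_conn'
);
```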


Re: [DISCUSS] FLIP-529 Connections in Flink SQL and Table API

2025-05-16 Thread Ryan van Huuksloot
Hi Mayank,

Overall I think the idea of having shared properties (secrets or otherwise)
at the database layer is really interesting/useful. Even outside of the ML
scope.

However, I have a concern:
When I describe a table, I would want all of its properties to appear in the
returned results of the describe; pulling some of them out adds a hidden
layer of properties.
CONNECTION is not standard SQL syntax (at least that I am aware of, please
correct me if I am wrong), and it adds a Flink-SQL-specific layer of
complexity.

I like the idea of Grants to hide secrets rather than nesting them in a
Connection object. The Connection object could still be described.
Something akin to Snowflake's Secrets (
https://docs.snowflake.com/en/sql-reference/sql/show-secrets) that can be
referenced within table properties and could be obfuscated during a
describe.

Then if we hide secrets, we can figure out a different name than Connection
for common properties and they can be shown as part of the table spec.
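The contrast above can be sketched as follows. The CREATE SECRET statement is Snowflake's actual syntax (per the Snowflake docs linked above); the table property referencing it is a hypothetical Flink adaptation, not existing syntax:

```sql
-- Real Snowflake syntax: the secret value is stored server-side and is
-- obfuscated in SHOW SECRETS / DESCRIBE output.
CREATE SECRET my_api_key
  TYPE = GENERIC_STRING
  SECRET_STRING = 's3cr3t-value';

-- Hypothetical Flink adaptation: the table property references the secret
-- by name, so DESCRIBE can show the reference while hiding the value.
CREATE TABLE events (
  id BIGINT
) WITH (
  'connector' = 'some-connector',
  'api-key'   = 'secret:my_api_key'
);
```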

Thoughts?

Ryan van Huuksloot
Sr. Production Engineer | Streaming Platform
[image: Shopify]





[jira] [Created] (FLINK-37812) Fix broken links to Apache Avro in doc and code

2025-05-16 Thread Mingliang Liu (Jira)
Mingliang Liu created FLINK-37812:
-

 Summary: Fix broken links to Apache Avro in doc and code
 Key: FLINK-37812
 URL: https://issues.apache.org/jira/browse/FLINK-37812
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.20.1, 2.0.0
Reporter: Mingliang Liu


Recently, Apache Avro changed its documentation website. Lovely as it is, some 
links from the Apache Flink documentation and from comments in the code are now 
broken. Let's simply fix them by updating them to the new working links.





FW: Upgrade Flink's log4j to 2.24.3

2025-05-16 Thread David Radley
Hi,
I raised a JIRA [1] and PR [2] for this. I can’t see any downsides; it only 
fixes critical vulnerabilities. Unless there are any concerns, could we merge 
this please?

Kind regards, David.

[1] https://issues.apache.org/jira/browse/FLINK-37810
[2] https://github.com/apache/flink/pull/26570

From: David Radley 
Date: Tuesday, 13 May 2025 at 16:55
To: dev 
Subject: [EXTERNAL] Upgrade Flink's log4j to 2.24.3
Hi,
I notice that Flink's log4j is at 2.24.1 [1]. There are critical bugs [2] that 
are fixed if we move to 2.24.3. I am happy to raise a Jira, make the update, and 
backport to v2 and v1.20 (which is currently at an even lower level), if 
someone is willing to merge.
  Kind regards, David.

[1] 
https://github.com/apache/flink/blob/9d97eef879f11aedbd83e75bba5050c00a76bf7a/pom.xml#L136C18-L136C25
[2]
https://logging.apache.org/log4j/2.x/release-notes.html#release-notes-2-24-3
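A sketch of the change, assuming the log4j version is centralized in a Maven property in the root pom (the pom line linked in [1] suggests a single version string there; the property name is an assumption, not confirmed from the source):

```xml
<!-- flink/pom.xml: bump the managed log4j version (property name assumed) -->
<properties>
  <log4j.version>2.24.3</log4j.version>
</properties>
```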

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: Building C, IBM Hursley Office, Hursley Park Road, 
Winchester, Hampshire SO21 2JN



[DISCUSS] FLIP-518: Introduce a community review process

2025-05-16 Thread David Radley
Hi,
I have amended the design slightly for Flip-518 [1] in 2 respects

  *   I intend to set the COMMUNITY-REVIEW label for all types of reviews 
(not just approvals), as negative and comment reviews should be tracked; they 
are often more valuable than approvals.
  *   I have changed the FLIP to indicate there will only ever be at most 
one community-review-related label. I do not want a PR swamped with these new 
labels.

I will proceed to implement this. Let me know if you have any concerns with 
these amendments.

Kind regards, David.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-518%3A+Introduce+a+community+review+process



[jira] [Created] (FLINK-37810) update log4j to 2.24.3 to fix critical vulnerabilities

2025-05-16 Thread david radley (Jira)
david radley created FLINK-37810:


 Summary: update log4j to 2.24.3 to fix critical vulnerabilities
 Key: FLINK-37810
 URL: https://issues.apache.org/jira/browse/FLINK-37810
 Project: Flink
  Issue Type: Technical Debt
Reporter: david radley


Flink master and v2 are at 2.24.1, but there are two critical issues at that 
level.

See the [2.24.3 release notes|https://logging.apache.org/log4j/2.x/release-notes.html#release-notes-2-24-3]
 for details. Update log4j to 2.24.3 to resolve those issues.





[ANNOUNCE] Apache Flink CDC 3.4.0 released

2025-05-16 Thread Yanquan Lv
The Apache Flink community is very happy to announce the release of
Apache Flink CDC 3.4.0.

Apache Flink CDC is a distributed data integration tool for real-time and
batch data. It brings the simplicity and elegance of data integration via
YAML to describe data movement and transformation in a data pipeline.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/2025/05/16/apache-flink-cdc-3.4.0-release-announcement/

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink CDC can be found at:
https://search.maven.org/search?q=g:org.apache.flink%20cdc

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12355589&styleName=&projectId=12315522

We would like to thank all contributors of the Apache Flink community
who made this release possible!

Regards,
Yanquan


2025/05/15 CHI Western time zone workgroup meeting minutes

2025-05-16 Thread David Radley
Hi all,
We held the 23rd Chi meeting.

  *   The number of open PRs is down to under 200 now – much more manageable. I 
am curious if anyone thinks this is too aggressive.
  *   We have identified some PRs [2]; they are approved and trivial, so in our 
opinion they should be easy merges. Could a committer merge these please?

Kind regards, David.

[1] Full minutes
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=358812211

[2] PRs to merge

From this week:

https://github.com/apache/flink/pull/26515

https://github.com/apache/flink/pull/26570 log4j version upgrade

https://github.com/apache/flink/pull/26544 trivial change

PRs to merge from previous weeks

https://github.com/apache/flink/pull/26481

https://github.com/apache/flink/pull/26449 - trivial change

https://github.com/apache/flink/pull/26462 - trivial change

https://github.com/apache/flink/pull/26450 - looks safe and fixes a security 
issue
