RE: Re: Flink technical debt.

2024-11-07 Thread Ammu P
Hi David,

Thanks for this initiative. It looks really helpful; please count me in.

Regards,
Ammu Parvathy

On 2024/11/07 10:35:43 Hong Liang wrote:
> Hi David,
> 
> Thanks for this proposal. I agree that this is something we can strive to
> do better in the Flink community, and I would be keen to help out here.
> 
> +1 to the suggestion for a recurring working group meeting to triage and
> assign PRs.
> 
> I think the suggestions we have on the thread are great, but can be
> explored independently!
> 1. Review open PRs. We could simply get started by sorting by updated date
> in descending order (LIFO ensures we are looking at the freshest ones), or
> choosing a particular component
> https://github.com/apache/flink/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-desc
> 2. Bot to follow up on stale PRs, and close them if not needed.
> 
> Side note: Also did some analysis of the PRs by labels, and attached the
> data below. I imagine we could close out the Documentation ones quite
> quickly!
> 
> Regards,
> Hong
> 
> Open PRs by label as of 2024-11-07:
> 
> review=description? : 275
> component=TableSQL/Planner : 138
> component=TableSQL/API : 109
> component=Documentation : 90
> component= : 70
> component=Runtime/StateBackends : 54
> component=Runtime/Coordination : 44
> component=chinese-translation : 41
> component=TableSQL/Runtime : 39
> component=Formats : 37
> component=Connectors/Hive : 36
> component=API/DataStream : 32
> component=Runtime/Checkpointing : 32
> component=Tests : 29
> component=Deployment/Kubernetes : 25
> component=Connectors/FileSystem : 24
> component=Connectors/Common : 21
> component=API/Core : 20
> component=Runtime/WebFrontend : 20
> component=TableSQL/Ecosystem : 19
> component=Runtime/Task : 17
> dependencies : 16
> component=API/Python : 15
> component=FileSystems : 15
> component=TableSQL/Client : 15
> component=Runtime/REST : 14
> component=Deployment/YARN : 13
> component=Runtime/Metrics : 13
> component=BuildSystem/CI : 12
> component=API/TypeSerializationSystem : 11
> component=Client/JobSubmission : 11
> component=Runtime/Configuration : 11
> component=BuildSystem : 9
> component=Library/CEP : 8
> java : 8
> javascript : 8
> component=CommandLineClient : 7
> component=Connectors/HBase : 6
> component=Runtime/Network : 6
> component=Connectors/Kafka : 5
> component=Connectors/Kinesis : 5
> component=TestInfrastructure : 5
> component=Connectors/HadoopCompatibility : 4
> component=Deployment/Scripts : 4
> component=API/DataSet : 3
> component=Connectors/Cassandra : 3
> component=TableSQL/LegacyPlanner : 3
> review=consensus? : 3
> component=Connectors/GoogleCloudPubSub : 2
> component=Connectors/Pulsar : 2
> component=Examples : 2
> component=API/Scala : 1
> component=BuildSystem/AzurePipelines : 1
> component=BuildSystem/Shaded : 1
> component=Documentation/Training : 1
> component=flink-docker : 1
> component=ProjectWebsite : 1
> component=Runtime/QueryableState : 1
> post-ui-rework : 1
> 
> 
> On Thu, Nov 7, 2024 at 5:06 AM Lyrics Cool wrote:
> 
> > Hello David,
> >
> > I believe this is a valuable initiative that will significantly enhance the
> > codebase as well. I would also like to join this group and contribute to
> > the community.
> >
> > Regards,
> > Anu K T
> >
> > On Mon, Nov 4, 2024 at 8:36 PM David Radley wrote:
> >
> > > Hello,
> > > I have been looking at the Flink Jira and git. I see a large number of
> > > Flink Jira issues that are open and critical or blockers:
> > > https://issues.apache.org/jira/browse/FLINK-36655?jql=project%20%3D%20FLINK%20AND%20priority%20in%20(Blocker%2C%20Critical)
> > > I realise some of these issues may not actually be critical, as they have
> > > been labelled by the submitter.
> > >
> > > I see 1239 open unmerged PRs
> > > https://github.com/apache/flink/pulls?q=is%3Apr+is%3Aopen. Some of these
> > > are not associated with assigned issues, so may never be merged. This
> > > number of unmerged PRs means that many people have put a lot of time and
> > > effort into creating code that has not made it into the codebase, so they
> > > do not get the credit for the contribution, which must be disheartening,
> > > and the codebase does not get the benefit of the contribution.
> > >
> > > This is a large amount of technical debt. I would like to help address
> > > this problem by setting up a workgroup, with others in the community who
> > > would like this addressed. The scope of the workgroup would be to improve
> > > these numbers by activities such as:
> > >
> > >   *   Triaging PRs so it is easier for committers to merge or close them.
> > >   *   Identifying PRs that could be closed out as no longer relevant.
> > >   *   Getting committer buy-in.
> > >
> > > Are there other ideas from the community around how this could be
> > > improved with or without a workgroup, or whether the existing processes
> > > should be sufficient or enhanced?
> > >
> > > Is there an appetite to address this in the community? I
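
As a rough illustration of two ideas raised in this thread (tallying open PRs by label, and flagging stale PRs for follow-up), here is a minimal Python sketch. It assumes PR records shaped like the JSON returned by the GitHub REST API for pull requests (a `labels` list of objects with a `name`, and a UTC `updated_at` timestamp). The function names, the 90-day threshold, and the sample records are hypothetical and are not taken from any existing Flink tooling.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_labels(prs):
    """Count how many PRs carry each label name."""
    counts = Counter()
    for pr in prs:
        for label in pr.get("labels", []):
            counts[label["name"]] += 1
    return counts


def parse_github_time(value):
    """Parse a GitHub-style UTC timestamp such as 2024-11-07T10:35:43Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stale_prs(prs, now, max_idle_days=90):
    """Return PRs not updated for more than max_idle_days, oldest first.

    The 90-day default is an arbitrary, illustrative threshold.
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = [pr for pr in prs if parse_github_time(pr["updated_at"]) < cutoff]
    return sorted(stale, key=lambda pr: parse_github_time(pr["updated_at"]))


# Hypothetical sample records, shaped like GitHub's pull request JSON.
SAMPLE_PRS = [
    {"number": 1, "updated_at": "2024-11-06T09:00:00Z",
     "labels": [{"name": "component=Documentation"}]},
    {"number": 2, "updated_at": "2023-01-15T12:30:00Z",
     "labels": [{"name": "component=Documentation"}, {"name": "component=Tests"}]},
    {"number": 3, "updated_at": "2022-06-01T08:00:00Z", "labels": []},
]

if __name__ == "__main__":
    now = datetime(2024, 11, 7, tzinfo=timezone.utc)
    for name, count in count_labels(SAMPLE_PRS).most_common():
        print(f"{name} : {count}")
    print([pr["number"] for pr in stale_prs(SAMPLE_PRS, now)])
```

A bot or a working-group script could feed real API responses into the same two functions; the choice of threshold and what to do with the stale list (comment, label, close) would be up to the community.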

RE: [jira] [Created] (FLINK-28897) Fail to use udf in added jar when enabling checkpoint

2024-11-28 Thread Ammu P
Hi Team,

I have raised a PR (https://github.com/apache/flink/pull/25656) against the
1.20 version with a probable fix for this issue. Could I get a review for it,
please? Thanks in advance.

Regards,
Ammu Parvathy
On 2022/08/10 03:39:00 "Liu (Jira)" wrote:
> Liu created FLINK-28897:
> ---
> 
>  Summary: Fail to use udf in added jar when enabling checkpoint
>  Key: FLINK-28897
>  URL: https://issues.apache.org/jira/browse/FLINK-28897
>  Project: Flink
>   Issue Type: Bug
> Affects Versions: 1.16.0
> Reporter: Liu
> 
> --
> This message was sent by Atlassian Jira
> (v8.20.10#820010)
> 

RE: Re: [DISCUSS] Flink 1.20.1 release

2024-12-10 Thread Ammu P
+1 for a 1.20.1 release, Thanks for driving!

There is one issue [1] for which I have raised a fix PR [2], and I would like
it to be included in this release. Please let me know if anything more needs
to be done for this.

[1] https://issues.apache.org/jira/browse/FLINK-28897
[2] https://github.com/apache/flink/pull/25656

Regards,
Ammu

On 2024/11/26 03:27:46 Yuepeng Pan wrote:
> Thanks for driving this!
> +1 for a 1.20.1 release.
> 
> There is also a pending issue [1][2]. Judging by the results of the call for
> comments in the current email thread [3], it seems it needs to be merged into
> the 1.x series.
> Would we consider merging it into these two upcoming releases?
> I'd be happy to work on it if needed.
> 
> [1] https://github.com/apache/flink/pull/25218
> [2] https://issues.apache.org/jira/browse/FLINK-33977
> [3] https://lists.apache.org/thread/24xtcnrhv8504ldf5lm58plqm498b89k
> 
> Best, 
> Yuepeng Pan
> 
> At 2024-11-26 01:50:05, "Tom Cooper" wrote:
> >+1 for a 1.20.1 release, thanks for driving this Alex.
> >
> >I am new to Flink and not totally clear on the release process. 
> >Is there going to be a code freeze date? There are a number of CVE fixes 
> >([1],[2]) I would really like to be included.
> >
> >Tom Cooper
> >
> >[1](https://github.com/apache/flink/pull/25573)
> >[2](https://github.com/apache/flink/pull/25606)
> >
> >On Friday, 22 November 2024 at 14:41, Alexander Fedulov wrote:
> >
> >> Hi everyone,
> >> 
> >> I would like to discuss creating the first patch release for 1.20. This
> >> version was released almost 4 months ago, and more than 70 commits have
> >> accumulated since then [1].
> >> 
> >> If there are no objections to the release, I would like to volunteer as the
> >> release manager.
> >> 
> >> Best regards,
> >> Alex
> >> 
> >> [1] https://github.com/apache/flink/compare/release-1.20.0...release-1.20
> 

Re: [DISCUSS] Flink 1.20.1 release

2024-12-17 Thread Ammu P
Hi Alex,

Thanks for looking into it. I was able to backport the fix to 1.19.2 as is, and
I have raised a PR [1] for it.

[1] https://github.com/apache/flink/pull/25809

Regards,
Ammu

> On 17 Dec 2024, at 12:05 AM, Alexander Fedulov wrote:
> 
> Hi Ammu,
> 
> Thanks for bringing this up. Could you please verify if this fix can also
> be backported to 1.19.2 as is?
> 
> Best,
> Alex
> 
> On Wed, 11 Dec 2024 at 03:01, ConradJam  wrote:
> 
>> +1 for a 1.20.1 release , best ~
>> 
>> On Wed, 11 Dec 2024 at 00:07, David Radley wrote:
>> 
>>> +1 for a 1.20.1 release, Thanks for driving Alex!
>>> 
>>> There is a PR [1] against master that fixes a lot of vulnerabilities in
>>> the Web UI; it requires a PR that brings in a later level of Node [2]. We
>>> would like to backport these and have them be part of the 1.20.1 release.
>>> 
>>> Kind regards, David.
>>> 
>>> [1] https://github.com/apache/flink/pull/25718
>>> [2] https://github.com/apache/flink/pull/25670
>> 



Re: [jira] [Created] (FLINK-28897) Fail to use udf in added jar when enabling checkpoint

2025-02-26 Thread Ammu P
Hi Team,

Here is a summary of the outcome of the PR review.

Externally added jars are resolved using the FlinkUserCodeClassLoader (a child
class loader) in Flink. The fix updated the class loader at the graph
execution level to the user code class loader, which is what the naming of the
variable at [1] in the source code suggests is expected.

However, the issue has already been fixed in a safe, tested way in Flink 2.0
as part of PR [2]. The issue has existed since version 1.16 (over 2 years) and
is only relevant to the Table API, so introducing large and risky changes in a
patch release to fix it is not worth the risk [3]. We can therefore conclude
that the ADD JAR capability with the Table API will remain a known limitation
up to version 1.20, and that it can be used with Flink 2.0.

Based on this, I believe we are good to close the PR [4] and set the fix
version of the related issue [5] to 2.0. However, I’d love to hear your
thoughts on this. Please let me know if there are any suggestions or concerns.

[1] https://github.com/apache/flink/blob/a4563caa7a4914dfd9fa5d488f5b2e541ecc582a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L2472
[2] https://github.com/apache/flink/pull/25472
[3] https://github.com/apache/flink/pull/25656#issuecomment-2582320914
[4] https://github.com/apache/flink/pull/25656
[5] https://issues.apache.org/jira/browse/FLINK-28897

Regards,
Ammu




Question on usage of SQL Gateway with a remote Flink cluster

2025-03-04 Thread Ammu P
Hi everyone, 

I am trying to run a SQL script through the SQL Client in gateway mode. The
gateway is running in a separate container that is not part of the Flink
cluster. Issuing a command from a local SQL Client to the gateway resulted in a
connection error like this:

WARN  org.apache.flink.client.program.rest.RestClusterClient   [] - Attempt to submit job '' to 'http://localhost:8081' has failed.

It seems the gateway is trying to reach the Flink cluster on localhost from
inside its own container instead of using the jobmanager.rpc.address configured
in the Docker Compose file. I would like to deploy a SQL job to the job manager
from outside the cluster, but I couldn’t find any configuration option other
than jobmanager.rpc.address to set up this communication.

FYI, this is the docker-compose sample I am using for the gateway:

services:
  sql-gateway:
    image: flink-1.20
    ports:
      - "8083:8083"
    command: sql-gateway.sh start-foreground -Dsql-gateway.endpoint.rest.address=localhost
    depends_on:
      - jobmanager
    environment:
      FLINK_PROPERTIES: |
        jobmanager.rpc.address: jobmanager

This is the script used for job deployment: 
flink/bin/sql-client.sh gateway --endpoint localhost:8083 --file TestDeployment

Am I missing something? Any help here is appreciated. Many thanks.

Regards,
Ammu
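
One way to narrow down a failure like this is to check, from inside the gateway container, which host and port are actually reachable. The following is a minimal Python sketch of such a probe. It assumes the JobManager REST endpoint listens on port 8081 (the port shown in the warning above) and that the Compose service name `jobmanager` resolves on the container network; the helper name `can_connect` is hypothetical.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `can_connect("localhost", 8081)` and `can_connect("jobmanager", 8081)` from inside the gateway container would show whether only the service name is reachable, which would be consistent with the localhost failure in the log.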

Re: Question on usage of SQL Gateway with a remote Flink cluster

2025-03-12 Thread Ammu P
Hi Shengkai,

Thanks for responding. It works after setting rest.address. The Docker Compose
file now looks like this:

services:
  sql-gateway:
    image: flink-1.20
    ports:
      - "8083:8083"
    command: sql-gateway.sh start-foreground -Dsql-gateway.endpoint.rest.address=localhost
    depends_on:
      - jobmanager
    environment:
      FLINK_PROPERTIES: |
        jobmanager.rpc.address: jobmanager
        rest.address: jobmanager

However, I noticed that with this gateway configuration I can submit a job
through a cURL command without setting any additional properties, but when I
try to submit a job through the SQL Client with the same configuration I get a
connection error. So I updated the SQL Client invocation to pass
`rest.address` explicitly:

flink/bin/sql-client.sh gateway --endpoint localhost:8083 -Drest.address=jobmanager --file TestDeployment

This submits the job successfully. I guess the SQL Client in gateway mode uses
the job manager address from the SQL Client configuration, irrespective of
the gateway configuration. Could you please confirm whether this is the
expected behaviour?

My understanding is that when a job is submitted through the SQL Gateway, the
job manager information should come from the gateway configuration rather than
the SQL Client configuration. WDYT? Please feel free to correct me if I am
wrong.

Regards,
Ammu
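
For reference, a submission through the gateway's REST endpoint (the cURL path mentioned above) boils down to two requests: open a session, then post a statement to it. Below is a minimal Python sketch that builds those requests, assuming the v1 REST paths described in the SQL Gateway documentation (`/v1/sessions` and `/v1/sessions/{sessionHandle}/statements`) and the gateway address `localhost:8083` from the Compose file. The helper names and the example statement are hypothetical.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8083"  # endpoint published by the Compose file


def open_session_request(base_url=GATEWAY_URL):
    """Build the POST request that opens a new gateway session."""
    return urllib.request.Request(
        f"{base_url}/v1/sessions",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute_statement_request(session_handle, statement, base_url=GATEWAY_URL):
    """Build the POST request that submits one SQL statement to a session."""
    body = json.dumps({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/sessions/{session_handle}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response (needs a running gateway)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a gateway running, `send(open_session_request())` should return a JSON object containing a `sessionHandle`, which can then be passed to `execute_statement_request`. In this path the client only talks to the gateway, so the JobManager address is taken from the gateway's own configuration.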



Re: Question on usage of SQL Gateway with a remote Flink cluster

2025-03-23 Thread Ammu P
Hi Shengkai,

Thanks for the detailed explanation. That clarified my setup-related questions.
However, it would be nice to have this captured in the gateway documentation
to make it more user-friendly. I have created a ticket [1] with my suggested
improvements, and I am happy to contribute to it.

[1] https://issues.apache.org/jira/browse/FLINK-37536

Regards,
Ammu