[jira] [Created] (FLINK-36108) Wait for state download on cancellation to enforce cleanup

2024-08-20 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-36108:
--

 Summary: Wait for state download on cancellation to enforce cleanup
 Key: FLINK-36108
 URL: https://issues.apache.org/jira/browse/FLINK-36108
 Project: Flink
  Issue Type: Sub-task
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36109) Improve FlinkStateSnapshot resource labels for efficient filtering

2024-08-20 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-36109:
--

 Summary: Improve FlinkStateSnapshot resource labels for efficient 
filtering
 Key: FLINK-36109
 URL: https://issues.apache.org/jira/browse/FLINK-36109
 Project: Flink
  Issue Type: Sub-task
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.10.0


The new FlinkStateSnapshot CRs currently have the following label only:

```
snapshot.type: XY
```

We need to investigate if we can efficiently filter based on snapshot resource 
type (savepoint/checkpoint), target job etc. I am not sure if we can do this 
based on the spec directly or we need labels for it.

Also we should probably rename the snapshot.type label to trigger.type to be 
more straightforward.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36110) Periodic save points trigger continuously

2024-08-20 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-36110:
--

 Summary: Periodic save points trigger continuously 
 Key: FLINK-36110
 URL: https://issues.apache.org/jira/browse/FLINK-36110
 Project: Flink
  Issue Type: Sub-task
  Components: Kubernetes Operator
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.10.0
 Attachments: image-2024-08-20-09-50-38-660.png

Periodic savepoints seem to trigger continuously with the given config:
```
kubernetes.operator.periodic.savepoint.interval: 30s
```

we get
!image-2024-08-20-09-50-38-660.png|width=693,height=81!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-XXX Amazon SQS Source Connector

2024-08-20 Thread Ahmed Hamdy
Hi Abhisagar and Saurabh
I have created the FLIP page and assigned it FLIP-477[1]. Feel free to
resume with the next steps.

1-
https://cwiki.apache.org/confluence/display/FLINK/FLIP-+477+Amazon+SQS+Source+Connector
Best Regards
Ahmed Hamdy


On Tue, 20 Aug 2024 at 06:05, Abhisagar Khatri 
wrote:

> Hi Flink Devs,
>
> Gentle Reminder for the request. We'd like to ask the PMC/Committers to
> transfer the content from the Amazon SQS Source Connector Google Doc [1]
> and assign a FLIP Number for us, which we can use further for voting.
> We are following the procedure outlined on the Flink Improvement Proposal
> Confluence page [2].
>
> [1]
> https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit?usp=sharing
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
>
> Regards,
> Abhi & Saurabh
>
>
> On Tue, Aug 13, 2024 at 12:50 PM Saurabh Singh 
> wrote:
>
>> Hi Flink Devs,
>>
>> Thanks for all the feedback flink devs.
>>
>> Following the procedure outlined on the Flink Improvement Proposal
>> Confluence page [1], we kindly ask the PMC/Committers to transfer the
>> content from the Amazon SQS Source Connector Google Doc [2] and assign a
>> FLIP Number for us, which we will use for voting.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
>> [2]
>> https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit?usp=sharing
>>
>> Regards
>> Saurabh & Abhi
>>
>>
>> On Thu, Aug 8, 2024 at 5:12 PM Saurabh Singh 
>> wrote:
>>
>>> Hi Ahmed,
>>>
>>> Yes, you're correct. Currently, we're utilizing the "record emitter" to
>>> send messages into the queue for deletion. However, for the actual deletion
>>> process, which is dependent on the checkpoints, we've been using the source
>>> reader class because it allows us to override the notifyCheckpointComplete
>>> method.
>>>
>>> Regards
>>> Saurabh & Abhi
>>>
>>> On Wed, Aug 7, 2024 at 2:18 PM Ahmed Hamdy  wrote:
>>>
 Hi Saurabh
 Thanks for addressing, I see the FLIP is in much better state.
 Could we specify where we queue messages for deletion, In my opinion
 the record emitter is a good place for that where we delete messages that
 are forwarded to the next operator.
 Other than that I don't have further comments.
 Thanks again for the effort.

 Best Regards
 Ahmed Hamdy


 On Wed, 31 Jul 2024 at 10:34, Saurabh Singh 
 wrote:

> Hi Ahmed,
>
> Thank you very much for the detailed, valuable review. Please find our
> responses below:
>
>
>- In the FLIP you mention the split is going to be 1 sqs Queue,
>does this mean we would support reading from multiple queues? This is 
> also
>not clear in the implementation of `addSplitsBack` whether we are 
> planning
>to support multiple sqs topics or not.
>
> *Our current implementation assumes that each source reads from a
> single SQS queue. If you need to read from multiple SQS queues, you can
> define multiple sources accordingly. We believe this approach is clearer
> and more organized compared to having a single source switch between
> multiple queues. This design choice is based on weighing the benefits, but
> we can support multiple queues per source if the need arises.*
>
>- Regarding Client creation, there has been some effort in the
>common `aws-util` module like createAwsSyncClient, we should reuse 
> that for
>`SqsClient` creation.
>
>*Thank you for bringing this to our attention. Yes, we will
>utilize the existing createClient methods available in the libraries. 
> Our
>goal is to avoid any code duplication on our end.*
>
>
>- On the same point for clients, Is there a reason the FLIP
>suggests async clients? sync clients have proven more stable and the 
> source
>threading model already guarantees no blocking by sync clients.
>
> *We were not aware of this, and we have been using async clients for
> our in-house use cases. However, since we already have sync clients in the
> aws-util that ensure no blocking, we are in a good position. We will use
> these sync clients during our development and testing efforts, and we will
> share the results and keep the community updated.*
>
>- On mentioning threading, the FLIP doesn’t mention the fetcher
>manager. Is it going to be `SingleThreadFetcherManager`? Would it be 
> better
>to make the source reader extend the SingleThreadedMultiplexReaderBase 
> or
>are we going to implement a more simple version?
>
> *Yes, we are considering implementing
> SingleThreadMultiplexSourceReaderBase for the Reader. We have included the
> implementation snippet in the FLIP for r

[jira] [Created] (FLINK-36111) fix: adjust MultiTableCommittableChannelComputer Topology name

2024-08-20 Thread MOBIN (Jira)
MOBIN created FLINK-36111:
-

 Summary: fix: adjust MultiTableCommittableChannelComputer Topology 
name 
 Key: FLINK-36111
 URL: https://issues.apache.org/jira/browse/FLINK-36111
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: MOBIN
 Attachments: image-2024-08-20-17-46-28-115.png, 
image-2024-08-20-17-46-53-781.png

before fix:

!image-2024-08-20-17-46-28-115.png|width=759,height=173!

after fix:

!image-2024-08-20-17-46-53-781.png|width=578,height=150!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36112) Add Support for CreateFlag.NO_LOCAL_WRITE in FLINK on YARN's File Creation to Manage Disk Space and Network Load in Labeled YARN Nodes

2024-08-20 Thread liang yu (Jira)
liang yu created FLINK-36112:


 Summary: Add Support for CreateFlag.NO_LOCAL_WRITE in FLINK on 
YARN's File Creation to Manage Disk Space and Network Load in Labeled YARN Nodes
 Key: FLINK-36112
 URL: https://issues.apache.org/jira/browse/FLINK-36112
 Project: Flink
  Issue Type: Improvement
Reporter: liang yu


{*}Description{*}: I am currently using Apache Flink to write files into 
Hadoop. The Flink application runs on a labeled YARN queue. During operation, 
it has been observed that the local disks on these labeled nodes get filled up 
quickly, and the network load is significantly high. This issue arises because 
Hadoop prioritizes writing files to the local node first, and the number of 
these labeled nodes is quite limited.

 

{*}Problem{*}: The current behavior leads to inefficient disk space utilization 
and high network traffic on these few labeled nodes, which could potentially 
affect the performance and reliability of the application. As shown in the 
picture, the host I circled have a average net_bytes_sent speed 1.2GB/s while 
the others are just 50MB/s, this imbalance in network and disk space nearly 
destroyed the whole cluster. 

 

{*}Implementation{*}: The implementation would involve adding a method of 
FileSystem.class to support the {{CreateFlag.NO_LOCAL_WRITE}}  when we try to 
create a new file through HadoopFileSystem.create() API. What's more, I modify 
the code of FileSink class so that we can choose to enable no_local_write or 
disable this feature. This will provide flexibility to  Flink running in 
labeled Yarn queues to opt for non-local writes when necessary.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36113) Condition 'transform instanceof PhysicalTransformation' is always 'false'

2024-08-20 Thread tiancx (Jira)
tiancx created FLINK-36113:
--

 Summary: Condition 'transform instanceof PhysicalTransformation' 
is always 'false' 
 Key: FLINK-36113
 URL: https://issues.apache.org/jira/browse/FLINK-36113
 Project: Flink
  Issue Type: Improvement
 Environment: master
Reporter: tiancx
 Attachments: Snipaste_2024-08-21_01-29-52.png

Judgment optimization: The legacyTransform method in the StreamGraphGenerator 
class has a judgment: the transform instance of PhysicalTransformation is 
always false
{code:java}
//代码占位符
private Collection legacyTransform(Transformation transform) {
Collection transformedIds;
if (transform instanceof FeedbackTransformation) {
transformedIds = transformFeedback((FeedbackTransformation) 
transform);
} else if (transform instanceof CoFeedbackTransformation) {
transformedIds = transformCoFeedback((CoFeedbackTransformation) 
transform);
} else if (transform instanceof SourceTransformationWrapper) {
transformedIds = transform(((SourceTransformationWrapper) 
transform).getInput());
} else {
throw new IllegalStateException("Unknown transformation: " + transform);
}

if (transform.getBufferTimeout() >= 0) {
streamGraph.setBufferTimeout(transform.getId(), 
transform.getBufferTimeout());
} else {
streamGraph.setBufferTimeout(transform.getId(), getBufferTimeout());
}

if (transform.getUid() != null) {
streamGraph.setTransformationUID(transform.getId(), transform.getUid());
}
if (transform.getUserProvidedNodeHash() != null) {
streamGraph.setTransformationUserHash(
transform.getId(), transform.getUserProvidedNodeHash());
}

if (!streamGraph.getExecutionConfig().hasAutoGeneratedUIDsEnabled()) {
if (transform instanceof PhysicalTransformation
&& transform.getUserProvidedNodeHash() == null
&& transform.getUid() == null) {
throw new IllegalStateException(
"Auto generated UIDs have been disabled "
+ "but no UID or hash has been assigned to operator 
"
+ transform.getName());
}

}

if (transform.getMinResources() != null && 
transform.getPreferredResources() != null) {
streamGraph.setResources(
transform.getId(),
transform.getMinResources(),
transform.getPreferredResources());
}

streamGraph.setManagedMemoryUseCaseWeights(
transform.getId(),
transform.getManagedMemoryOperatorScopeUseCaseWeights(),
transform.getManagedMemorySlotScopeUseCases());

return transformedIds;
} {code}
I'm willing to optimize this.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36114) Schema registry should block when handling existing requests

2024-08-20 Thread yux (Jira)
yux created FLINK-36114:
---

 Summary: Schema registry should block when handling existing 
requests
 Key: FLINK-36114
 URL: https://issues.apache.org/jira/browse/FLINK-36114
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: yux


Currently, SchemaRegistry asynchronously receives schema change requests from 
SchemaOperator, and results of multiple requests might got mixed up together, 
causing incorrect logic flow in multiple parallelism cases.

Changing SchemaRegistry's behavior to accept requests in serial should resolve 
this problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36115) Allow to scan newly table DDL during incremental reading stage.

2024-08-20 Thread LvYanquan (Jira)
LvYanquan created FLINK-36115:
-

 Summary: Allow to scan newly table DDL during incremental reading 
stage.
 Key: FLINK-36115
 URL: https://issues.apache.org/jira/browse/FLINK-36115
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: LvYanquan
 Fix For: cdc-3.2.0
 Attachments: image-2024-08-21-10-06-17-758.png

Currently, MySQL pipeline source will determine all captured tables before 
building 
MySqlDataSource. However, this will lead to Ignore of the create table 
statement for the new table.

!image-2024-08-21-10-06-17-758.png!
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] FLIP-473: Introduce New SQL Operators Based on Asynchronous State APIs

2024-08-20 Thread Xuyang
Hi, everyone.

I would like to start a vote on FLIP-473: Introduce New SQL Operators Based

on Asynchronous State APIs [1]. The discussion thread can be found here [2].

The vote will be open for at least 72 hours unless there are any objections

or insufficient votes.




[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-473+Introduce+New+SQL+Operators+Based+on+Asynchronous+State+APIs

[2] https://lists.apache.org/thread/k6x03x7vjtn3gl1vrknkx8zvyn319bk9




--

Best!
Xuyang

Re: [VOTE] FLIP-473: Introduce New SQL Operators Based on Asynchronous State APIs

2024-08-20 Thread Zakelly Lan
+1 (binding)

Best,
Zakelly

On Wed, Aug 21, 2024 at 10:33 AM Xuyang  wrote:

> Hi, everyone.
>
> I would like to start a vote on FLIP-473: Introduce New SQL Operators Based
>
> on Asynchronous State APIs [1]. The discussion thread can be found here
> [2].
>
> The vote will be open for at least 72 hours unless there are any objections
>
> or insufficient votes.
>
>
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-473+Introduce+New+SQL+Operators+Based+on+Asynchronous+State+APIs
>
> [2] https://lists.apache.org/thread/k6x03x7vjtn3gl1vrknkx8zvyn319bk9
>
>
>
>
> --
>
> Best!
> Xuyang


Re: Re: [VOTE] FLIP-455: Declare async state processing and checkpoint the in-flight requests

2024-08-20 Thread Zakelly Lan
+1 (binding)

Best,
Zakelly

On Thu, Aug 15, 2024 at 9:02 PM Hangxiang Yu  wrote:

> +1 (binding)
>
> Thanks for driving this.
>
> On Thu, Aug 15, 2024 at 8:42 PM Ahmed Hamdy  wrote:
>
> > +1 (non-binding)
> > Thanks for driving.
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Wed, 14 Aug 2024 at 07:33, Xuyang  wrote:
> >
> > > +1 (non-binding)
> > >
> > >
> > > --
> > >
> > > Best!
> > > Xuyang
> > >
> > >
> > >
> > >
> > >
> > > 在 2024-08-14 11:04:18,"Yuepeng Pan"  写道:
> > > >
> > > >
> > > >
> > > >+1 (non-binding)
> > > >Thanks for driving it !
> > > >
> > > >
> > > >Best regards,
> > > >
> > > >Yuepeng Pan
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >在 2024-08-14 10:32:03,"Yanfei Lei"  写道:
> > > >>Hi Zakelly,
> > > >>
> > > >>Thanks for driving this, +1.
> > > >>
> > > >>Zakelly Lan  于2024年8月13日周二 13:42写道:
> > > >>>
> > > >>> Hi devs,
> > > >>>
> > > >>> I'd like to start a vote on the FLIP-455: Declare async state
> > > processing
> > > >>> and checkpoint the in-flight requests[1]. The discussion thread is
> > > here [2].
> > > >>>
> > > >>> The vote will be open for at least 72 hours unless there is an
> > > objection or
> > > >>> insufficient votes.
> > > >>>
> > > >>> [1] https://cwiki.apache.org/confluence/x/C4owEg
> > > >>> [2]
> https://lists.apache.org/thread/9ccxkbf7ww8dscybfzh9p8dwrdy3olpp
> > > >>>
> > > >>>
> > > >>> Best,
> > > >>> Zakelly
> > > >>
> > > >>
> > > >>
> > > >>--
> > > >>Best,
> > > >>Yanfei
> > >
> >
>
>
> --
> Best,
> Hangxiang.
>