[jira] [Created] (FLINK-28103) Job cancelling api returns 404 when job is actually running
David created FLINK-28103:
------------------------------

             Summary: Job cancelling api returns 404 when job is actually running
                 Key: FLINK-28103
                 URL: https://issues.apache.org/jira/browse/FLINK-28103
             Project: Flink
          Issue Type: Bug
          Components: Runtime / REST
    Affects Versions: 1.13.5
            Reporter: David
         Attachments: image-2022-06-17-14-04-44-307.png, image-2022-06-17-14-05-36-264.png

The job is still running:
!image-2022-06-17-14-04-44-307.png!

but the cancelling api returns 404:
!image-2022-06-17-14-05-36-264.png!

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
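For context, Flink's REST API cancels a job via `PATCH /jobs/<jobid>?mode=cancel`; a 404 from this endpoint usually means the JobManager answering the request does not recognize that job ID (for example, the request reached a different cluster or a freshly restarted JobManager). A minimal sketch of issuing the call with Java 11's `HttpClient` (the endpoint and job ID below are placeholders, not values from this ticket):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CancelJob {
    // Builds the cancel URI for a given REST endpoint and job ID.
    static URI cancelUri(String restEndpoint, String jobId) {
        return URI.create(restEndpoint + "/jobs/" + jobId + "?mode=cancel");
    }

    public static void main(String[] args) {
        // Placeholder endpoint and job ID -- substitute your own.
        URI uri = cancelUri("http://localhost:8081", "1b8f9a9c0f0e4a7c9d2e3f4a5b6c7d8e");
        HttpRequest request = HttpRequest.newBuilder(uri)
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // 202 Accepted: cancellation triggered; 404: job ID unknown to this JobManager.
            System.out.println(response.statusCode() + " " + response.body());
        } catch (java.io.IOException | InterruptedException e) {
            System.out.println("Request failed (no cluster reachable): " + e.getMessage());
        }
    }
}
```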
[jira] [Commented] (FLINK-28103) Job cancelling api returns 404 when job is actually running
[ https://issues.apache.org/jira/browse/FLINK-28103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555842#comment-17555842 ]

David commented on FLINK-28103:
-------------------------------

[~martijnvisser] I'll try that next week
[jira] [Commented] (FLINK-28103) Job cancelling api returns 404 when job is actually running
[ https://issues.apache.org/jira/browse/FLINK-28103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555843#comment-17555843 ]

David commented on FLINK-28103:
-------------------------------

[~chesnay] PATCH is supported neither by Java's HttpURLConnection nor by HTTP tools such as Insomnia. The X-HTTP-Method-Override header is a workaround, but it does not seem to work with the Flink REST service.
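To make the workaround concrete: `java.net.http.HttpClient` (Java 11+) accepts arbitrary method names, so a real PATCH can be issued without HttpURLConnection at all. A sketch contrasting the two approaches; whether a server honors `X-HTTP-Method-Override` is server-specific, and per this ticket it appears not to work against the Flink REST service (the URI below is a placeholder):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class PatchWorkarounds {
    // HttpURLConnection rejects PATCH, but java.net.http.HttpClient (Java 11+)
    // lets you set any method name, so a genuine PATCH request can be built:
    static HttpRequest realPatch(URI uri) {
        return HttpRequest.newBuilder(uri)
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // The override-header workaround: a POST asking the server to treat it as
    // PATCH. The server must opt in to this convention; per this ticket it
    // does NOT appear to be honored by the Flink REST service.
    static HttpRequest overridePost(URI uri) {
        return HttpRequest.newBuilder(uri)
                .header("X-HTTP-Method-Override", "PATCH")
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://localhost:8081/jobs/placeholder-job-id?mode=cancel");
        System.out.println(realPatch(uri).method());     // the actual HTTP method sent
        System.out.println(overridePost(uri).method());  // still POST on the wire
    }
}
```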
[jira] [Updated] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David updated FLINK-37747:
--------------------------
    Priority: Blocker  (was: Major)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
David created FLINK-37747:
------------------------------

             Summary: GlobalCommitterOperator cannot commit after scaling writer/committer
                 Key: FLINK-37747
                 URL: https://issues.apache.org/jira/browse/FLINK-37747
             Project: Flink
          Issue Type: Bug
            Reporter: David
[jira] [Updated] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David updated FLINK-37747:
--------------------------
    Description: 
Hey,

Our Flink job frequently stopped writing into a Delta table with the Flink Delta connector. After investigating, we found that the [commit|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/streaming/runtime/operators/sink/GlobalCommitterOperator.java#L207] function in GlobalCommitterOperator returns early when it checks whether a checkpoint has finished (this [code|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/streaming/runtime/operators/sink/GlobalCommitterOperator.java#L211]).

The issue happens when:
 * the auto-scaler scales up the chained writer/committer (the direct upstream operator of GlobalCommitterOperator)
 * the job first ran with limited TaskManagers and a lower parallelism for the writer/committer, and the writer/committer was later scaled up to a higher parallelism

After debugging with more logs, we found the cause of the issue. An example:
 * For checkpoint 3, the Flink job completed successfully with 3 parallel writer/committer tasks.
 ** All committable objects in the writer/committer were saved into checkpoint state in checkpoint 3.
 * The writer/committer was scaled up to 5 parallel tasks.
 * The writer/committer tasks restore state from checkpoint 3 and emit the committable objects from checkpoint 3 (code is [here|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/streaming/runtime/operators/sink/CommitterOperator.java#L139]).
 ** The latest parallelism of the writer/committer, which is 5, is used in the CommittableSummary (code is [here|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/streaming/runtime/operators/sink/CommitterOperator.java#L186]).
 * When GlobalCommitterOperator receives a committable summary from the writer/committer, it learns:
 ** there are 5 parallel writer/committer tasks upstream
 ** so it waits for committable summaries from all 5 upstream writer/committer subtasks.
 * Only 3 writer/committer tasks emit a CommittableSummary to the global committer operator, because only 3 restored state from checkpoint 3.
 * The global committer operator is stuck forever, waiting for committable summaries from 5 upstream subtasks.

We have a quick fix for this case and raised a PR. We are using Flink 1.20, but the issue still exists in the master branch.
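The stall described above can be reproduced with a toy model (this is illustrative code, not Flink internals; all names are made up): the global committer trusts the subtask count announced in the incoming summaries, but after scale-up only the subtasks that restored state for the old checkpoint ever emit a summary for it.

```java
import java.util.HashSet;
import java.util.Set;

public class StuckGlobalCommitter {
    // Toy stand-in for the per-checkpoint bookkeeping: the global committer
    // only commits once it has seen a summary from every announced subtask.
    static boolean readyToCommit(int announcedSubtasks, Set<Integer> seenSubtasks) {
        return seenSubtasks.size() >= announcedSubtasks;
    }

    public static void main(String[] args) {
        // Checkpoint 3 state was written by 3 writer/committer subtasks, but
        // after scale-up the CommittableSummary announces a parallelism of 5.
        int announcedParallelism = 5;
        Set<Integer> seen = new HashSet<>();
        for (int subtask = 0; subtask < 3; subtask++) {
            seen.add(subtask); // only the 3 restored subtasks re-emit for cp 3
        }
        // The other 2 subtasks restored no state for checkpoint 3 and never
        // emit a summary for it, so the commit condition can never hold.
        System.out.println("ready = " + readyToCommit(announcedParallelism, seen));
    }
}
```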
[jira] [Commented] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17949356#comment-17949356 ]

David commented on FLINK-37747:
-------------------------------

Hey [~arvid], thanks for reviewing the PR. I updated it accordingly. Do you think the PR is OK to merge? This issue also exists in the 1.20.1 release. Do you have a guideline for applying the fix to the 1.20.1 release? Thanks!
[jira] [Commented] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948068#comment-17948068 ]

David commented on FLINK-37747:
-------------------------------

Thanks [~arvid]. We are using Flink 1.20.1. Could you give me an example of an IT case? I can try to write one and update the PR.
[jira] [Commented] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950839#comment-17950839 ]

David commented on FLINK-37747:
-------------------------------

[~arvid] Yep, I am making PRs for 1.20 and 1.19.
[jira] [Commented] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950842#comment-17950842 ]

David commented on FLINK-37747:
-------------------------------

Hey [~arvid], I made a PR to patch release-2.0: [https://github.com/apache/flink/pull/26546]

For release-1.20 and release-1.19, some extra commits are needed and I am not sure which commits should be included. Can you help with this? Thank you!
[jira] [Commented] (FLINK-37747) GlobalCommitterOperator cannot commit after scaling writer/committer
[ https://issues.apache.org/jira/browse/FLINK-37747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950147#comment-17950147 ]

David commented on FLINK-37747:
-------------------------------

Thanks [~arvid] for the review! Updated the PR :)