Selman Kayrancioglu created FLINK-37639:
-------------------------------------------
Summary: KeyedStateBootstrapFunction hangs if an exception is
thrown within
Key: FLINK-37639
URL: https://issues.apache.org/jira/browse/FLINK-37639
Project: Flink
Issue Type: Bug
Components: API / State Processor
Affects Versions: 1.19.2, 1.19.1, 1.20.0
Reporter: Selman Kayrancioglu
When an exception is thrown inside
`KeyedStateBootstrapFunction#processElement`, the job fails to terminate
properly. Instead of failing with an error, the operator appears to hang
indefinitely when viewed from the UI or monitoring tools.
I've created a minimal reproducer demonstrating this issue at:
https://github.com/seruman/flink-bootstrap-function-hangs-on-exception-reproducer
This behavior is consistent across multiple environments. I've confirmed it
occurs when:
- Running locally with `start-cluster.sh`
- Deploying to Kubernetes with flink-kubernetes-operator
- Executing unit tests with `MiniCluster` (as shown in the repository)
I would expect the job to fail with the appropriate exception rather than
becoming unresponsive.
Notably, there were no error logs generated in either the JobManager or
TaskManager.
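For anyone reproducing this in a unit test, a timeout guard keeps the test
run from blocking forever and makes the difference between "failed" and
"hung" explicit. The following is only a minimal sketch in plain Java; it is
not taken from the reproducer repository and uses no Flink API. `HangGuard`,
`runWithTimeout`, and the two stand-in tasks are hypothetical names, and the
sketch assumes the job submission can be wrapped in a blocking `Callable`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Flink or of the reproducer repository.
final class HangGuard {

    /**
     * Runs a blocking task on a daemon thread and waits at most timeoutMillis.
     * Returns "OK" on success, "FAILED: <message>" if the task threw,
     * and "HUNG" if it did not finish in time.
     */
    static String runWithTimeout(Callable<?> task, long timeoutMillis)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "hang-guard");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "OK";
        } catch (ExecutionException e) {
            // The task threw: this is the outcome expected for a failing job.
            return "FAILED: " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck task
            return "HUNG";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a job that fails as expected (assumption, not Flink code).
        Callable<Void> failing = () -> { throw new IllegalStateException("boom"); };
        // Stand-in for a job that never returns, as described in this report.
        Callable<Void> hanging = () -> { Thread.sleep(60_000); return null; };

        System.out.println(runWithTimeout(failing, 1_000)); // prints "FAILED: boom"
        System.out.println(runWithTimeout(hanging, 200));   // prints "HUNG"
    }
}
```

In a real test, the stand-in task would be replaced by whatever call submits
the job and waits for it. The guard only bounds the wait; it does not address
the underlying hang.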
So far I've tried versions 1.19.1, 1.19.2, 1.20.0.
Sample config:
```
pipeline.max-parallelism=10
parallelism.default=2
execution.runtime-mode=BATCH
execution.batch-shuffle-mode=ALL_EXCHANGES_PIPELINED
jobmanager.scheduler=Default
```
Please let me know if you need any additional information.