Hi. My Flink Deployment is configured to use savepoints for upgrades and to 
take a savepoint before stopping.
When rescaling happens, for some reason it scales the JobManager to zero 
(“Scaling JobManager Deployment to zero with 300 seconds timeout”) and the job 
goes into the FINISHED state. It doesn’t seem to be able to continue. Any idea 
why it is deleting itself?
The savepoints are stored on S3.
I can restart the job from a savepoint manually, but at the next rescaling 
operation it deletes itself again.

Thanks!

Log:
2024-04-22 23:39:17,016 o.a.f.k.o.o.d.ApplicationObserver [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Observing JobManager 
deployment. Previous status: DEPLOYING
2024-04-22 23:39:17,024 o.a.f.k.o.o.d.ApplicationObserver [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] JobManager is being deployed
2024-04-22 23:39:17,045 o.a.f.a.JobAutoScalerImpl      [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Cleaning up autoscaling meta 
data
2024-04-22 23:39:17,046 o.a.f.k.o.s.AbstractFlinkService [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Deleting cluster with 
Foreground propagation
2024-04-22 23:39:17,046 o.a.f.k.o.s.AbstractFlinkService [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Scaling JobManager Deployment 
to zero with 300 seconds timeout...
2024-04-22 23:39:18,941 o.a.f.r.u.c.m.ProcessMemoryUtils [INFO 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] The derived from fraction jvm 
overhead memory (102.400mb (107374184 bytes)) is less than its min value 
192.000mb (201326592 bytes), min value will be used instead
2024-04-22 23:39:18,982 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
e22861b1943d40ca6f5a40ae6332d42b could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,984 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
e503bd1c0fb6799d36f0c4b786e69fd9 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,985 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
de478b666343f04920f1cd3e6e65548c could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,987 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
0956ec3e5721a3bc416c208d690e220a could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,988 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
2a0e1e1304dc61d997d3e6f7025df9b3 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,989 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
09152f8c760b2503dd4174abf81781b6 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,991 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
cbe38acc7ac41cc91794215391eedc28 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,992 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
cc2707c5c5b5cdd319d28980a6ad99d0 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,994 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
d2fd710697779b81072031f8924b5967 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,995 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
9892bf1d00288ef951498dd18f18dd24 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,997 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
acf3064b984e12010bfeccbe4e28d9a5 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,998 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
8737aeacfef5f24708290203c9de47e1 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:18,999 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
45aa9886e32677c3f13caa2aa54ec3ad could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:19,001 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
1d4f6379179c05db8001430cd4b772f7 could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:19,003 o.a.f.a.ScalingMetricCollector [WARN 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] pendingRecords metric for 
2de822e03c5f4f6b0ca9ea8210622acb could not be found. Either a legacy source or 
an idle source. Assuming no pending records.
2024-04-22 23:39:19,151 o.a.f.a.ScalingExecutor        [INFO 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] All job vertices are currently 
running at their target parallelism.
2024-04-22 23:39:19,218 o.a.f.k.o.r.d.AbstractFlinkResourceReconciler [INFO 
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] Resource fully reconciled, 
nothing to do...
2024-04-22 23:39:20,535 o.a.f.k.o.s.AbstractFlinkService [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Completed Scaling JobManager 
Deployment to zero
2024-04-22 23:39:20,536 o.a.f.k.o.s.AbstractFlinkService [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Deleting JobManager Deployment 
with 296 seconds timeout...
2024-04-22 23:39:20,631 o.a.f.k.o.s.AbstractFlinkService [INFO 
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Completed Deleting JobManager 
Deployment
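
A note on reading the log above: the entries carry two different resource 
names. The deletion and scale-to-zero messages are tagged with 
flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6, while the ScalingMetricCollector 
and ScalingExecutor messages are tagged with 
flink/f-cdda7a8d-8259-5137-8397-8125b212e556. The Python sketch below is one 
possible way to rejoin the mail-wrapped lines and group the messages per 
resource. The helper names (rejoin, group_by_resource) and the regular 
expressions are my own assumptions based on the layout of this excerpt; they 
are not part of Flink or the operator.

```python
import re
from collections import defaultdict

# Assumption: every record starts with a timestamp like "2024-04-22 23:39:17,016".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# Assumption: the resource follows the log level as "[LEVEL ][namespace/name]".
RESOURCE = re.compile(
    r"\[(?:INFO|WARN|ERROR|DEBUG)\s*\]\[([^/\]]+)/([^\]]+)\]\s*(.*)"
)


def rejoin(wrapped_text):
    """Merge mail-wrapped continuation lines back into one string per record."""
    records = []
    for line in wrapped_text.splitlines():
        if RECORD_START.match(line):
            records.append(line.rstrip())
        elif records and line.strip():
            records[-1] += " " + line.strip()
    return records


def group_by_resource(records):
    """Map "namespace/name" to the list of messages logged for that resource."""
    groups = defaultdict(list)
    for record in records:
        match = RESOURCE.search(record)
        if match:
            namespace, name, message = match.groups()
            groups[f"{namespace}/{name}"].append(message)
    return dict(groups)


# Two records copied from the log above, still wrapped as in the mail.
SAMPLE = """\
2024-04-22 23:39:17,046 o.a.f.k.o.s.AbstractFlinkService [INFO
][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Scaling JobManager Deployment
to zero with 300 seconds timeout...
2024-04-22 23:39:19,151 o.a.f.a.ScalingExecutor        [INFO
][flink/f-cdda7a8d-8259-5137-8397-8125b212e556] All job vertices are currently
running at their target parallelism.
"""

for resource, messages in group_by_resource(rejoin(SAMPLE)).items():
    print(resource)
    for message in messages:
        print("   ", message)
```

Run against the full log, this separates the messages of the resource that is 
being deleted from those of the resource the autoscaler reports as running at 
its target parallelism.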

