Hi Zhao.

Yes, I am deploying our solution as a Session Cluster on Kubernetes, via the
Flink Kubernetes Operator. So, the JM is the session cluster leader.

So, what is the correct approach? Should I explicitly set these checkpointing 
properties in my jobs?

Funny thing is, it used to work on this same Flink version, 1.20, and then it
stopped after a redeploy. I do not recall changing anything in the cluster
config, so it felt like a “gremlin”.

Nikola.

From: Zhanghao Chen <zhanghao.c...@outlook.com>
Date: Friday, January 24, 2025 at 2:50 AM
To: Nikola Milutinovic <n.milutino...@levi9.com>, user@flink.apache.org 
<user@flink.apache.org>
Subject: Re: table.exec.source.idle-timeout support

Hi,

Are you deploying the job in Session mode? Underneath, Flink distinguishes
cluster-level and job-level configs. In Application mode, the two are unified.
When a job is submitted to a session cluster, though, the values of
cluster-level config options, such as the memory for the JM and TM, are defined
by the JM/TM configuration, while the values of job-level config options, such
as parallelism, checkpointing, and the idle timeout in this case, are defined by
the job configuration. Additionally, some config options, such as restart
strategies, fall back to the values in the JM/TM configuration if they are not
set in the job configuration. Unfortunately, the distinction is not explicitly
exposed to the user. FLIP-478 [1] is an effort to improve on this.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-478+Introduce+Config+Option+Scope
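
As a rough illustration of the lookup order described above, here is a toy
model in plain Java. It is not Flink code: the `Scope` enum, the `SCOPES` table
and the `resolve` method are hypothetical names, and the scope assigned to each
option is an assumption taken from this thread rather than from Flink's own
option metadata.

```java
import java.util.Map;
import java.util.Optional;

class ConfigScopeSketch {

    /** Hypothetical scopes modelled on the description above; not a Flink type. */
    enum Scope { CLUSTER, JOB, JOB_WITH_CLUSTER_FALLBACK }

    // Illustrative classification based on this thread, not on Flink metadata.
    static final Map<String, Scope> SCOPES = Map.of(
            "jobmanager.memory.process.size", Scope.CLUSTER,
            "execution.checkpointing.interval", Scope.JOB,
            "table.exec.source.idle-timeout", Scope.JOB,
            "restart-strategy.type", Scope.JOB_WITH_CLUSTER_FALLBACK);

    /** Returns the value a job would see for the key, if any, in this toy model. */
    static Optional<String> resolve(
            String key, Map<String, String> clusterConf, Map<String, String> jobConf) {
        // Unknown keys are treated as job-level here (an assumption of the sketch).
        Scope scope = SCOPES.getOrDefault(key, Scope.JOB);
        switch (scope) {
            case CLUSTER:
                // Cluster-level options come only from the JM/TM configuration.
                return Optional.ofNullable(clusterConf.get(key));
            case JOB_WITH_CLUSTER_FALLBACK:
                // Job value wins; otherwise fall back to the JM/TM configuration.
                String fromJob = jobConf.get(key);
                return Optional.ofNullable(fromJob != null ? fromJob : clusterConf.get(key));
            default:
                // Job-level options come only from the job configuration.
                return Optional.ofNullable(jobConf.get(key));
        }
    }

    public static void main(String[] args) {
        Map<String, String> cluster = Map.of(
                "execution.checkpointing.interval", "300000",
                "restart-strategy.type", "fixed-delay");
        Map<String, String> job = Map.of(); // the job sets nothing itself

        // Prints Optional.empty
        System.out.println(resolve("execution.checkpointing.interval", cluster, job));
        // Prints Optional[fixed-delay]
        System.out.println(resolve("restart-strategy.type", cluster, job));
    }
}
```

In this model, a checkpoint interval that is set only in the cluster
configuration never reaches the job, while the restart strategy is still picked
up, which would be consistent with the behaviour reported in this thread.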

Best,
Zhanghao Chen
________________________________
From: Nikola Milutinovic <n.milutino...@levi9.com>
Sent: Friday, January 10, 2025 23:48
To: user@flink.apache.org <user@flink.apache.org>
Subject: Re: table.exec.source.idle-timeout support


Hi Nic.

I do not have a solution (sorry), but I have seen something similar and have
already complained about it. Look up my mail in the archives.

I am also deploying our session cluster using the Flink Kubernetes Operator. I
was setting “execution.checkpointing.interval: 300000”. This should set a global
interval for periodic checkpoints. They do work in Docker Compose.

I can see the value is read and accepted by the JobManager, and I expect the
same in the TaskManager, though I haven’t checked its logs. However, all running
jobs report “Periodic Checkpointing Disabled”. If I alter the job to set
periodic checkpointing on its execution environment, then it works.

So, it seems like JM or TM is not passing certain options to our jobs. Mind
you, all other checkpointing options were passed into the execution environment
of the job.

Is this a bug or a feature?

Nix,

From: Nic Townsend <nictowns...@uk.ibm.com>
Date: Friday, January 10, 2025 at 4:22 PM
To: user@flink.apache.org <user@flink.apache.org>
Subject: table.exec.source.idle-timeout support

Hi, I’m deploying Flink 1.19 via the k8s operator.

I’m setting `table.exec.source.idle-timeout: 30 s` in the
`spec.flinkConfiguration` section of the FlinkDeployment CR.

The Flink UI shows that the JobManager has been configured with the value, and
the JM and TM logs both show `INFO [] - Loading configuration property:
table.exec.source.idle-timeout, 30 s`.

However, when I run a simple SQL job (select * from <kafkatable>), where the
Kafka topic has 3 partitions, the watermark does not appear to update when a
partition is idle (confirmed using the job graph UI).

If I instead set the option `'scan.watermark.idle-timeout' = '30 s'` in the
CREATE TABLE statement, then the watermark does increase when the partition goes
idle.

I can’t find an error in the JobManager to suggest the config option is invalid
or malformed, and I’m not seeing a Jira issue that reflects this.

So I feel like I must be doing something wrong, but would appreciate any
advice, thank you!

--
Nic Townsend
IBM Event Processing
Senior Engineer / Technical Lead
Slack: @nictownsend
