Re: HA in k8s operator

2023-09-16 Thread Chen Zhanghao
Hi Krzysztof, TM HA is handled by the Flink cluster itself and is beyond the K8s operator's responsibility. Flink will try to recover a failed task as long as the restart limit is not reached; otherwise the job will transition into the terminal FAILED status. You may check the job restart strategy [1]
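
The restart-limit behaviour described in this reply can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class `FixedDelayRestartPolicy`, the function `simulate`, and the status strings are hypothetical names made up for illustration, not Flink's API, and the model ignores the delay between restarts.

```python
from dataclasses import dataclass


@dataclass
class FixedDelayRestartPolicy:
    """Toy model of a restart limit (hypothetical, not Flink code)."""

    max_restarts: int
    restarts_used: int = 0

    def on_task_failure(self) -> str:
        """Return the job status after one task failure."""
        if self.restarts_used < self.max_restarts:
            # Restart budget remains: consume one restart and recover.
            self.restarts_used += 1
            return "RESTARTING"
        # Restart limit reached: the job ends in a terminal state.
        return "FAILED"


def simulate(max_restarts: int, failures: int) -> str:
    """Apply `failures` consecutive task failures and return the final status."""
    policy = FixedDelayRestartPolicy(max_restarts)
    status = "RUNNING"
    for _ in range(failures):
        status = policy.on_task_failure()
        if status == "FAILED":
            break
    return status


if __name__ == "__main__":
    for n in range(5):
        print(n, "failure(s) ->", simulate(max_restarts=3, failures=n))
```

In this model a killed TM pod counts as one task failure, so the job keeps recovering until the assumed restart budget is exhausted.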

HA in k8s operator

2023-09-16 Thread Krzysztof Chmielewski
Hi community, I would like to test the Flink k8s operator's HA capabilities for TM and JM failover. The simple test I did for TM failover was as follows: - run a Flink session cluster in native mode - submit a FlinkSessionJob resource with SAVEPOINT upgrade mode - kill the task manager pod It turns out tha

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-16 Thread Wei Chen
Hi Karthick, We’ve experienced a similar issue before. What we did at that time was to define multiple topics, each with a different number of partitions, which means the topics with more partitions get higher parallelism for processing. And you can further divide the topic

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-16 Thread Giannis Polyzos
Can you provide some more context on what your Flink job will be doing? There might be some things you can do to fix the data skew on the Flink side, but first, you want to start with Kafka. For starters, you need to better size and estimate the number of partitions you will need on the Kaf

Re: Urgent: Mitigating Slow Consumer Impact and Seeking Open-Source Solutions in Apache Kafka Consumers

2023-09-16 Thread Karthick
Hi Gowtham, I agree with you. I'm eager to resolve the issue or gain a better understanding. Your assistance would be greatly appreciated. If any additional details or context are needed to address my query effectively, please let me know, and I'll be happy to provide them. Thank you in adv