Re: Paimon missing CatalogFactory

2025-02-04 Thread Yanquan Lv
Hi, Dominik. It seems you used only Paimon's version number, but you also need to include the Flink version, like org.apache.paimon:paimon-flink-1.19:1.0.0. You can see the main difference in the content of the two links below: [1] https://mvnrepository.com/artifact/org.apache
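The coordinate cited in the reply can be sketched as a Maven dependency; note that the artifact id embeds the Flink minor version (here 1.19), which is the part Dominik's original coordinate was missing:

```xml
<!-- Coordinate from the reply: org.apache.paimon:paimon-flink-1.19:1.0.0 -->
<dependency>
  <groupId>org.apache.paimon</groupId>
  <artifactId>paimon-flink-1.19</artifactId>
  <version>1.0.0</version>
</dependency>
```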

Re: Unsubscribe

2025-02-04 Thread Zhanghao Chen
Please send an email to user-unsubscr...@flink.apache.org if you want to unsubscribe from user@flink.apache.org. Best, Zhanghao Chen From: Mujahid Niaz Sent: Wednesday, February 5, 2025 9:29 Cc: user@flink.apache.org Subject: Unsubscribe Unsubscribe

Unsubscribe

2025-02-04 Thread Mujahid Niaz
Unsubscribe

Re: Dead Letter Queue for FlinkSQL

2025-02-04 Thread Zhanghao Chen
You'll need to implement a custom sink for that. Best, Zhanghao Chen From: Ilya Karpov Sent: Monday, February 3, 2025 18:30 To: user Subject: Dead Letter Queue for FlinkSQL Hi there, Because sink connectors can throw exceptions in real time (for example, due
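The dead-letter pattern the reply points at can be sketched in plain Java, independent of Flink: wrap the primary write in a try/catch and divert poison records to a dead-letter consumer instead of letting the exception fail the job. This is a minimal illustration of the idea only; `DeadLetterRouter` is a hypothetical name, and in a real FlinkSQL setup the equivalent logic would live inside a custom sink implementation, as Zhanghao suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: route records that throw to a dead-letter
// consumer rather than propagating the exception (which would
// otherwise fail and restart the streaming job).
public class DeadLetterRouter<T> {
    private final Consumer<T> primary;
    private final Consumer<T> deadLetter;

    public DeadLetterRouter(Consumer<T> primary, Consumer<T> deadLetter) {
        this.primary = primary;
        this.deadLetter = deadLetter;
    }

    public void write(T record) {
        try {
            primary.accept(record);
        } catch (RuntimeException e) {
            // Divert the poison record instead of failing the pipeline.
            deadLetter.accept(record);
        }
    }

    public static void main(String[] args) {
        List<String> ok = new ArrayList<>();
        List<String> dlq = new ArrayList<>();
        DeadLetterRouter<String> router = new DeadLetterRouter<>(
            r -> { if (r.contains("bad")) throw new RuntimeException("parse error"); ok.add(r); },
            dlq::add);
        router.write("good-1");
        router.write("bad-1");
        router.write("good-2");
        System.out.println(ok + " " + dlq);  // [good-1, good-2] [bad-1]
    }
}
```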

Re: Flink High Availability Data Cleanup

2025-02-04 Thread Zhanghao Chen
Hi Yang, When the job fails temporarily, e.g. due to a single-machine failure, Flink will retain the HA metadata and try to recover. However, when the job has already reached the terminal FAILED status (controlled by the restart strategy [1]), Flink will delete all metadata and exit. In your cas
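The restart-strategy setting the reply refers to can be sketched as a flink-conf.yaml excerpt; the values below are illustrative, not a recommendation. With a bounded strategy like this, once the configured attempts are exhausted the job reaches the terminal FAILED state and the HA metadata (including the ConfigMaps) is cleaned up, which matches the behavior Chen Yang observed:

```yaml
# Illustrative flink-conf.yaml excerpt (values are examples only)
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```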

Flink High Availability Data Cleanup

2025-02-04 Thread Chen Yang via user
Hi Flink Community, I'm running Flink jobs (standalone mode) with high availability in Kubernetes (Flink version 1.17.2). The job is deployed with two job managers. I noticed that the leader job manager would delete the ConfigMap when the job failed and restarted. Thus the standby job manager

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
Just to give an update. I've applied the mentioned patch and the execution time drastically decreased (the gain is 98.9%): 2025-02-04 16:52:54,448 INFO o.a.f.e.s.r.FlinkTestStateReader [] - Execution time: PT14.690426S I need to double check what that would mean to correctness and all

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
Please report back on how the patch behaves, including any side effects. Now I'm testing state reading with the Processor API vs the mentioned job where we control the keys. The difference is extreme, especially because the numbers are coming from reading a ~40Mb state file😅 2025-02-04 13:21:53,5

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Jean-Marc Paulin
That's a good idea. Sadly I have no control over the keys. I was going to patch Flink with the suggestion in FLINK-37109 first to see how that goes. If that brings RocksDB performance into an acceptable range for us we might go that way. I really

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
What I could imagine is to create a normal Flink job, set execution.state-recovery.path=/path/to/savepoint, and set the operator UID on a custom-written operator, which opens the state info for you. The only drawback is that you must know the keyBy range... this can be problematic, but if you can do it
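The setup Gabor describes can be sketched as a configuration fragment; the path is a placeholder, and the operator UID set in the job code (via `uid(...)` on the custom operator) would have to match the UID the state was written under:

```yaml
# Illustrative sketch of the suggested recovery setup (placeholder path):
# start a normal job from the savepoint, then read state from inside a
# custom operator that carries the matching UID.
execution.state-recovery.path: s3://bucket/path/to/savepoint
```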

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Jean-Marc Paulin
Hi Gabor, I thought so. I was hoping for a way to read the savepoint in pages, instead of as a single blob up front, which I think is what the hashmap does... we just want to be called for each entry and extract the bit we want in that scenario. Never mind. Thank you for the insight. Saves me a lo

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
Hi Jean-Marc, We've already realized that the RocksDB approach is not reaching the performance criteria it should. There is an open issue for it [1]. The hashmap-based approach was and is always expecting more memory. So if the memory footprint is a hard requirement then RocksDB is the on

How to read a savepoint fast without exploding the memory

2025-02-04 Thread Jean-Marc Paulin
What would be the best approach to read a savepoint and minimise the memory consumption? We just need to transform it into something else for investigation. Our Flink 1.20 streaming job is using the HashMap backend, and is spread over 6 task slots in 6 pods (under k8s). Savepoints are saved on S3. A s
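The kind of savepoint reading discussed in this thread is typically done with Flink's State Processor API. The fragment below is a non-runnable sketch only (it needs the flink-state-processor-api dependency and a real savepoint); the UID, state name, and types are hypothetical placeholders, and the exact reader classes should be checked against the Flink 1.20 documentation:

```
// Sketch, not runnable as-is: read keyed state from a savepoint with the
// State Processor API. "my-operator-uid" and "my-state" are placeholders.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
SavepointReader savepoint =
    SavepointReader.read(env, "s3://bucket/path/to/savepoint", new HashMapStateBackend());
savepoint
    .readKeyedState(OperatorIdentifier.forUid("my-operator-uid"), new MyKeyedStateReaderFunction())
    .print();  // or transform each entry into the investigation format
env.execute("read-savepoint");
```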

Unsubscribe

2025-02-04 Thread Lin Hou