Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-17 Thread Gabor Somogyi
Hi Mate, Thanks for the deep dive! I've had a quick look at the code and it makes sense why you and William are not seeing slowness with compressed state. Tomorrow I'll do some tests and come back with the results... @William Wallace I think the restore should work without the memory-threshold s

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-17 Thread Mate Czagany
Hi William, I think your findings are correct: I could easily reproduce the issue with snapshot-compression set to false, but not with snapshot-compression set to true. When using compressed state, the available() call will return the number of bytes in the Snappy internal buffer that

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-17 Thread William Wallace
Hi G, We did a test today using
```
execution.checkpointing.snapshot-compression: true
state.storage.fs.memory-threshold: 500kb
```
across 6 jobs with different parallelism and volume load. I will use one as an example - 70 slots - I had 70 files of 670kb corresponding to the subtask state contain

[ANNOUNCE] Apache flink-connector-kafka 3.3.0 released

2024-10-17 Thread Arvid Heise
The Apache Flink community is very happy to announce the release of Apache flink-connector-kafka 3.3.0. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for downlo