Hi Gyula,

I looked into this recently as well and ran some experiments (on my local machine). The only parameter that made a significant difference in this setup was reducing the total size of the write buffers (the number or the size of the memtables). I was not able to find any online resources on the performance of checkpoint creation in RocksDB, so I am looking forward to your findings.
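To make the write-buffer point concrete, here is a rough sketch of the arithmetic, not a measurement. It assumes that each Flink state maps to one RocksDB column family and that no database-wide write-buffer limit is configured, so the memtable data that may have to be flushed is bounded by column families × memtable size × memtables per column family. The class and method names are made up for illustration. The 64 MiB and 2 values are RocksDB's defaults for `write_buffer_size` and `max_write_buffer_number`; the 4 states and the 16 MiB alternative are hypothetical.

```java
final class WriteBufferEstimate {

    /**
     * Upper bound on the bytes held in memtables, assuming one column family
     * per state and no database-wide write-buffer limit (both are assumptions).
     */
    static long maxMemtableBytes(int columnFamilies,
                                 long writeBufferSizeBytes,
                                 int maxWriteBufferNumber) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping
        return Math.multiplyExact(
                Math.multiplyExact((long) columnFamilies, writeBufferSizeBytes),
                (long) maxWriteBufferNumber);
    }

    public static void main(String[] args) {
        long mib = 1L << 20;
        int states = 4; // hypothetical number of states (column families)

        // RocksDB defaults: 64 MiB per memtable, at most 2 memtables per column family
        long defaults = maxMemtableBytes(states, 64 * mib, 2);
        // Hypothetical reduced setting: 16 MiB per memtable
        long reduced = maxMemtableBytes(states, 16 * mib, 2);

        System.out.println("default: " + defaults / mib + " MiB");
        System.out.println("reduced: " + reduced / mib + " MiB");
    }
}
```

This only shows how the upper bound scales with the two settings. How much of it translates into checkpoint time depends on the disks and on how full the memtables are when the checkpoint is triggered.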
Cheers,
Konstantin

On Fri, May 3, 2019 at 12:10 PM Gyula Fóra <[email protected]> wrote:

> Thanks Piotr for the tips, we will play around with some settings.
>
> @Stefan
> It is a few columns but a lot of rows
>
> Gyula
>
> On Fri, May 3, 2019 at 11:43 AM Stefan Richter <[email protected]>
> wrote:
>
>> Hi,
>>
>> out of curiosity, does it happen with jobs that have a large number of
>> states (column groups) or also for jobs with few column groups and just
>> “big state”?
>>
>> Best,
>> Stefan
>>
>> On 3. May 2019, at 11:04, Piotr Nowojski <[email protected]> wrote:
>>
>> Hi Gyula,
>>
>> Have you read our tuning guide?
>>
>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#tuning-rocksdb
>>
>> Synchronous part is mostly about flushing data to disks, so you could try
>> to optimise your setup having that in mind. Limiting the size of a page
>> cache, speeding up the writes (using more/faster disks…), etc… Maybe you
>> can also look at online resources how to speed up calls to
>> `org.rocksdb.Checkpoint#create`.
>>
>> Piotrek
>>
>> On 3 May 2019, at 10:30, Gyula Fóra <[email protected]> wrote:
>>
>> Hi!
>>
>> Does anyone know what parameters might affect the RocksDB native
>> checkpoint time? (basically the sync part of the rocksdb incremental
>> snapshots)
>>
>> It seems to take 60-70 secs in some cases for larger state sizes, and I
>> wonder if there is anything we could tune to reduce this. Maybe it's only a
>> matter of size, I don't know.
>>
>> Any ideas would be appreciated :)
>> Gyula

--
Konstantin Knauf | Solutions Architect
