Hi Biao, Yun and Ning. Thanks for your responses and pointers. Those are very helpful!
So far we have tried some of those parameters (WriteBufferManager, write_buffer_size, write_buffer_count, ...), but we are still continuously running into memory issues.

Here are our cluster configurations:
- 1 Job Controller (32 GB RAM, 8 cores)
- 10 Task Managers (32 GB RAM, 8 CPU cores, and a 300 GB SSD configured for RocksDB; we set a 10 GB heap for each)
- Running under Kubernetes

We have a pipeline that reads/transfers 500 million records (around 1 KB each) and writes them to our sink. Our total data is around 1.2 terabytes. Our pipeline configuration is as follows:
- 13 operators, of which around 6 are stateful
- Parallelism: 60
- Task slots: 6

We have run several tests and observed that memory just keeps growing while our TMs' CPU usage stays around 10-15%. We are now focusing on limiting memory usage from Flink and RocksDB so Kubernetes won't kill the pods.

Any recommendations or advice are greatly appreciated!

Thanks,

On Thu, Aug 8, 2019 at 6:57 AM Yun Tang <myas...@live.com> wrote:

> Hi Cam
>
> I think FLINK-7289 [1] might offer you some insights to control RocksDB
> memory, especially the idea of using the write buffer manager [2] to
> control the total write buffer memory. If you do not have too many SST
> files, write buffer memory usage will consume much more space than index
> and filter usage. Since Flink uses one column family per state, the write
> buffer count increases as more column families are created.
>
> [1] https://issues.apache.org/jira/browse/FLINK-7289
> [2] https://github.com/dataArtisans/frocksdb/pull/4
>
> Best
> Yun Tang
>
> ------------------------------
> *From:* Cam Mach <cammac...@gmail.com>
> *Sent:* Thursday, August 8, 2019 21:39
> *To:* Biao Liu <mmyy1...@gmail.com>
> *Cc:* miki haiat <miko5...@gmail.com>; user <user@flink.apache.org>
> *Subject:* Re: Capping RocksDb memory usage
>
> Thanks for your response, Biao.
>
> On Wed, Aug 7, 2019 at 11:41 PM Biao Liu <mmyy1...@gmail.com> wrote:
>
> > Hi Cam,
> >
> > AFAIK, that's not an easy thing. Actually it's more of a RocksDB issue.
> > There is a document explaining the memory usage of RocksDB [1]. It might
> > be helpful.
> >
> > You could define your own options to tune RocksDB through
> > "state.backend.rocksdb.options-factory" [2]. However, I would suggest
> > not doing this unless you are fully experienced with RocksDB. IMO it's
> > quite complicated.
> >
> > Meanwhile, I can share a bit of experience with this. We have tried
> > putting the index and filter blocks into the block cache before. It's
> > useful for controlling memory usage, but performance might be affected
> > at the same time. Anyway, you could try it and tune it. Good luck!
> >
> > 1. https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB
> > 2. https://ci.apache.org/projects/flink/flink-docs-master/ops/state/large_state_tuning.html#tuning-rocksdb
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
> > On Thu, Aug 8, 2019 at 11:44 AM Cam Mach <cammac...@gmail.com> wrote:
> >
> > > Yes, that is correct.
> > >
> > > Cam Mach
> > > Software Engineer
> > > E-mail: cammac...@gmail.com
> > > Tel: 206 972 2768
> > >
> > > On Wed, Aug 7, 2019 at 8:33 PM Biao Liu <mmyy1...@gmail.com> wrote:
> > >
> > > > Hi Cam,
> > > >
> > > > Do you mean you want to limit the memory usage of the RocksDB state
> > > > backend?
> > > >
> > > > Thanks,
> > > > Biao /'bɪ.aʊ/
> > > >
> > > > On Thu, Aug 8, 2019 at 2:23 AM miki haiat <miko5...@gmail.com> wrote:
> > > >
> > > > > I think using the metrics exporter is the easiest way.
> > > > >
> > > > > [1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rocksdb
> > > > >
> > > > > On Wed, Aug 7, 2019, 20:28 Cam Mach <cammac...@gmail.com> wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > What is the easiest and most efficient way to cap RocksDB's
> > > > > > memory usage?
> > > > > >
> > > > > > Thanks,
> > > > > > Cam
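P.S. In case it helps anyone following along, here is a minimal sketch of the kind of options factory Biao's pointer [2] describes, assuming the Flink 1.8-era `OptionsFactory` interface from `flink-statebackend-rocksdb`. The class name and all size limits below are illustrative placeholders, not values we have validated on our cluster:

```java
import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Hypothetical example class: bounds RocksDB memory per column family by
// capping memtable size/count and making index/filter blocks compete for
// the block cache instead of living in unbounded heap-external memory.
public class BoundedMemoryOptionsFactory implements OptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        // No DB-level changes in this sketch.
        return currentOptions;
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        BlockBasedTableConfig tableConfig = new BlockBasedTableConfig()
                // 256 MB block cache per column family (placeholder value)
                .setBlockCacheSize(256 * 1024 * 1024)
                // Charge index/filter blocks against the block cache so they
                // are bounded too (the trade-off Biao mentioned: possible
                // read-performance impact).
                .setCacheIndexAndFilterBlocks(true)
                .setPinL0FilterAndIndexBlocksInCache(true);

        return currentOptions
                .setWriteBufferSize(64 * 1024 * 1024) // 64 MB per memtable (placeholder)
                .setMaxWriteBufferNumber(2)           // at most 2 memtables per column family
                .setTableFormatConfig(tableConfig);
    }
}
```

Keep in mind these are per-column-family limits, and Flink creates one column family per state, so the effective cap scales with the number of states per operator times the task slots per TM. The factory can be registered via `state.backend.rocksdb.options-factory` in flink-conf.yaml or programmatically with `RocksDBStateBackend#setOptions`.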