Hi Xiaogang,

Thank you for your input.
Yes, I have already tried setting MaxBackgroundFlushes and MaxBackgroundCompactions to higher values (2, 4 and 8), but I am still not getting the expected results.

System.getProperty("java.io.tmpdir") points to /tmp, but I could not find the RocksDB logs there. Can you please let me know where I can find them?

Regards,
Vinay Patil

On Mon, Feb 20, 2017 at 7:32 AM, xiaogang.sxg [via Apache Flink User Mailing List archive.] <ml-node+s2336050n11731...@n4.nabble.com> wrote:

> Hi Vinay
>
> Can you provide the LOG file in RocksDB? It helps a lot in figuring out the
> problem because it records the options and the events that happened during
> execution. Unless configured otherwise, it should be located at the path set
> in System.getProperty("java.io.tmpdir").
>
> Typically, a large amount of memory is consumed by RocksDB to store the
> necessary indices. To avoid unlimited growth in memory consumption, you can
> put these indices into the block cache (set CacheIndexAndFilterBlocks to
> true) and set the block cache size appropriately.
>
> You can also increase the number of background threads to improve the
> performance of flushes and compactions (via MaxBackgroundFlushes and
> MaxBackgroundCompactions).
>
> In YARN clusters, task managers will be killed if their memory utilization
> exceeds the allocation size. Currently Flink does not count the memory used
> by RocksDB in the allocation. We are working on fine-grained resource
> allocation (see FLINK-5131). It may help to avoid such problems.
>
> I hope this information helps.
>
> Regards,
> Xiaogang
>
> ------------------------------------------------------------------
> From: Vinay Patil <[hidden email]>
> Sent: Friday, February 17, 2017, 21:19
> To: user <[hidden email]>
> Subject: Re: Checkpointing with RocksDB as statebackend
>
> Hi Guys,
>
> There seems to be some issue with RocksDB memory utilization.
>
> Within a few minutes of the job starting, physical memory usage increases
> by 4-5 GB, and it keeps on increasing.
> I have tried different values for Max Buffer Size (30MB, 64MB, 128MB,
> 512MB) with Min Buffer to Merge set to 2, but physical memory keeps on
> increasing.
>
> According to the RocksDB documentation, these are the main options on which
> flushing to storage is based.
>
> Can you please point out what I am doing wrong? I have tried different
> configuration options, but each time the Task Manager gets killed after
> some time :)
>
> Regards,
> Vinay Patil
>
> On Thu, Feb 16, 2017 at 6:02 PM, Vinay Patil <[hidden email]> wrote:
> I think it is more related to RocksDB. I am not familiar with RocksDB
> either, but I am reading the tuning guide to understand the important
> values that can be set.
>
> Regards,
> Vinay Patil
>
> On Thu, Feb 16, 2017 at 5:48 PM, Stefan Richter [via Apache Flink User
> Mailing List archive.] <[hidden email]> wrote:
> What kind of problem are we talking about? S3 related or RocksDB related?
> I am not aware of problems with RocksDB per se. I think seeing logs for
> this would be very helpful.
>
> On 16.02.2017 at 11:56, Aljoscha Krettek <[hidden email]> wrote:
>
> [hidden email] and [hidden email], could this be the same problem that
> you recently saw when working with other people?
>
> On Wed, 15 Feb 2017 at 17:23 Vinay Patil <[hidden email]> wrote:
> Hi Guys,
>
> Can anyone please help me with this issue?
>
> Regards,
> Vinay Patil
>
> On Wed, Feb 15, 2017 at 6:17 PM, Vinay Patil <[hidden email]> wrote:
> Hi Ted,
>
> I have 3 boxes in my pipeline: the 1st and 2nd boxes each contain a source
> and an S3 sink, and the 3rd box is a window operator followed by chained
> operators and an S3 sink.
>
> In the details section I can see that the S3 sink is taking time to
> acknowledge, and it does not even get to the window operator chain.
>
> But as shown in the snapshot, checkpoint id 19 did not get any
> acknowledgement. I am not sure what is causing the issue.
>
> Regards,
> Vinay Patil
>
> On Wed, Feb 15, 2017 at 5:51 PM, Ted Yu [via Apache Flink User Mailing
> List archive.] <[hidden email]> wrote:
> What did the More Details link say?
>
> Thanks
>
> > On Feb 15, 2017, at 3:11 AM, vinay patil <[hidden email]> wrote:
> >
> > Hi,
> >
> > I have set the checkpointing interval to 6 secs and the minimum pause
> > between checkpoints to 5 secs. While testing the pipeline I observed
> > that some checkpoints take a long time. As you can see in the attached
> > snapshot, checkpoint id 19 took the longest before it failed, although
> > it had not received any acknowledgements. During these 10 minutes the
> > entire pipeline did not make any progress and no data was processed.
> > (For example: in 13 minutes 20M records were processed, and when the
> > checkpoint took time there was no progress for the next 10 minutes.)
> >
> > I have even tried setting the max checkpoint timeout to 3 min, but in
> > that case multiple checkpoints failed as well.
> >
> > I have set the RocksDB FLASH_SSD_OPTIMIZED predefined options.
> > What could be the issue?
> >
> > P.S. I am writing to 3 S3 sinks.
> >
> > checkpointing_issue.PNG
> > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11640/checkpointing_issue.PNG>
> >
> > --
> > View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-with-RocksDB-as-statebackend-tp11640.html
> > Sent from the Apache Flink User Mailing List archive at Nabble.com.

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Checkpointing-with-RocksDB-as-statebackend-tp11752.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
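
On the open question in this thread of where the RocksDB LOG file ends up: a minimal sketch, not a definitive answer, of a small script that walks one or more directories and lists files named LOG (or rotated LOG.old.* files), which is how RocksDB names its info log inside a database directory. The function name find_rocksdb_logs and the choice of directories to scan are assumptions made for illustration. On YARN the TaskManager's temporary directories may live under the NodeManager's local directories rather than /tmp, so those paths may be worth passing in as well.

```python
import os
import sys
import tempfile
from pathlib import Path


def find_rocksdb_logs(roots):
    """Return sorted paths of files named LOG or LOG.old.* under the given roots.

    Hypothetical helper: it only matches on file names, so it can also
    report unrelated files that happen to be called LOG.
    """
    found = []
    for root in roots:
        # os.walk silently yields nothing for missing or unreadable roots.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name == "LOG" or name.startswith("LOG.old"):
                    found.append(Path(dirpath) / name)
    return sorted(found)


if __name__ == "__main__":
    # Default to the system temp directory (an assumption); pass the
    # TaskManager's actual temp directories as arguments if they differ.
    roots = sys.argv[1:] or [tempfile.gettempdir()]
    for path in find_rocksdb_logs(roots):
        print(path)
```

Run it on the machine that hosts the TaskManager while the job is still running, since the working directories may be cleaned up once the container exits.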