Re: Checkpointing with RocksDB as statebackend

Ted Yu Tue, 21 Feb 2017 11:50:11 -0800

Stephan:
The links were in the other email from vinay.


> On Feb 21, 2017, at 10:46 AM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> I cannot find the screenshots you attached.
> The Apache mailing lists sometimes don't support attachments. Can you link to 
> the screenshots some other way?
> 
> Stephan
> 
> 
>> On Mon, Feb 20, 2017 at 8:36 PM, vinay patil <vinay18.pa...@gmail.com> wrote:
>> Hi Stephan,
>> 
>> Just saw your mail while I was explaining the answer to your earlier 
>> questions. I have attached some more screenshots which are taken from the 
>> latest run today.
>> Yes, I will try setting it to a higher value and check whether performance improves.
>> 
>> Let me know your thoughts
>> 
>> Regards,
>> Vinay Patil
>> 
>>> On Tue, Feb 21, 2017 at 12:51 AM, Stephan Ewen [via Apache Flink User 
>>> Mailing List archive.] <[hidden email]> wrote:
>>> @Vinay!
>>> 
>>> Just saw the screenshot you attached to the first mail. The checkpoint that 
>>> failed came after one that had an incredibly heavy alignment phase (14 GB).
>>> I think that working that off threw off the next checkpoint, because the 
>>> workers were still working off the alignment backlog.
>>> 
>>> I think you can for now fix this by setting the minimum pause between 
>>> checkpoints a bit higher (it is probably set a bit too small for the state 
>>> of your application).
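
A simplified model of how the minimum pause shapes the checkpoint schedule may help here. The sketch below is not Flink code. It assumes the next checkpoint starts at the later of (previous start + interval) and (previous end + minimum pause). The class name, method name, and checkpoint durations are hypothetical illustration values; the 6 s interval and 5 s pause are the settings from the original mail.

```java
import java.util.Arrays;

// Simplified model, NOT Flink's scheduler: the next checkpoint is assumed to start
// at the later of (previous start + interval) and (previous end + minimum pause).
final class CheckpointScheduleSketch {

    // Returns the assumed start time (ms) of each checkpoint, given its duration (ms).
    static long[] startTimes(long intervalMs, long minPauseMs, long[] durationsMs) {
        long[] starts = new long[durationsMs.length];
        long prevStart = 0L;
        long prevEnd = 0L;
        for (int i = 0; i < durationsMs.length; i++) {
            long periodic = prevStart + intervalMs;  // next regular tick
            long afterPause = prevEnd + minPauseMs;  // earliest start allowed by the pause
            // The first checkpoint has no predecessor, so only the interval applies.
            starts[i] = (i == 0) ? intervalMs : Math.max(periodic, afterPause);
            prevStart = starts[i];
            prevEnd = starts[i] + durationsMs[i];
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hypothetical durations: a fast checkpoint, a slow one (heavy alignment), a fast one.
        long[] durations = {500L, 20_000L, 500L};
        System.out.println(Arrays.toString(startTimes(6_000L, 5_000L, durations)));
        System.out.println(Arrays.toString(startTimes(6_000L, 60_000L, durations)));
    }
}
```

In this model, a 5 s pause gives the workers only 5 s after the slow checkpoint ends before the next one begins, while a larger pause leaves proportionally more time to work off the backlog.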
>>> 
>>> Also, can you describe what your sources are (Kafka / Kinesis or file 
>>> system)?
>>> 
>>> BTW: We are currently working on
>>>   - incremental RocksDB checkpoints
>>>   - the network stack to allow in the future for a new way of doing the 
>>> alignment
>>> 
>>> Both of these should help make the program more resilient to these 
>>> situations.
>>> 
>>> Best,
>>> Stephan
>>> 
>>> 
>>> 
>>>> On Mon, Feb 20, 2017 at 7:51 PM, Stephan Ewen <[hidden email]> wrote:
>>>> Hi Vinay!
>>>> 
>>>> Can you start by giving us a bit of an environment spec?
>>>> 
>>>>   - What Flink version are you using?
>>>>   - What is your rough topology (what operations does the program use)?
>>>>   - Where is the state (windows, keyBy)?
>>>>   - What is the rough size of your checkpoints and where does the time go? 
>>>> Can you attach a screenshot from 
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/checkpoint_monitoring.html
>>>>   - What is the size of the JVM?
>>>> 
>>>> Those things would be helpful to know...
>>>> 
>>>> Best,
>>>> Stephan
>>>> 
>>>> 
>>>>> On Mon, Feb 20, 2017 at 7:04 PM, vinay patil <[hidden email]> wrote:
>>>>> Hi Xiaogang,
>>>>> 
>>>>> Thank you for your inputs.
>>>>> 
>>>>> Yes, I have already tried setting MaxBackgroundFlushes and 
>>>>> MaxBackgroundCompactions to higher values (tried 2, 4, and 8), but I am 
>>>>> still not getting the expected results.
>>>>> 
>>>>> System.getProperty("java.io.tmpdir") points to /tmp, but I could not find 
>>>>> the RocksDB logs there. Can you please let me know where I can find them?
>>>>> 
>>>>> Regards,
>>>>> Vinay Patil
>>>>> 
>>>>>> On Mon, Feb 20, 2017 at 7:32 AM, xiaogang.sxg [via Apache Flink User 
>>>>>> Mailing List archive.] <[hidden email]> wrote:
>>>>>> Hi Vinay
>>>>>> 
>>>>>> Can you provide the LOG file from RocksDB? It helps a lot in figuring out 
>>>>>> the problem because it records the options and the events that happened 
>>>>>> during the execution. Unless configured otherwise, it should be located 
>>>>>> at the path set in System.getProperty("java.io.tmpdir"). 
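
One way to look for that file is to search the temp directory tree for RocksDB's info log, which is written as a file named LOG (rotated copies start with LOG.old). This is a minimal sketch using only the JDK; the class and method names are hypothetical, and it assumes the database directories really do live somewhere under java.io.tmpdir, which may not hold if the local working directories are configured elsewhere.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

final class RocksDbLogFinder {

    // Walks the tree under root and collects regular files named "LOG"
    // or starting with "LOG.old". Unreadable entries are skipped.
    static List<Path> findLogs(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                if (attrs.isRegularFile()
                        && (name.equals("LOG") || name.startsWith("LOG.old"))) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: permission errors are not interesting here, so keep going.
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"));
        findLogs(root).forEach(System.out::println);
    }
}
```

Running this on the machine that hosts the task manager prints every candidate log path. An empty result suggests the database directory is somewhere other than the temp directory.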
>>>>>> 
>>>>>> Typically, RocksDB consumes a large amount of memory to store the 
>>>>>> necessary indices. To avoid unbounded growth in memory consumption, 
>>>>>> you can put these indices into the block cache (set 
>>>>>> CacheIndexAndFilterBlocks to true) and set the block cache size appropriately.
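
To see what a bounded configuration would add up to, here is a back-of-the-envelope estimate. It is only a sketch under stated assumptions: every column family is assumed to hold up to a fixed number of write buffers (RocksDB's max_write_buffer_number) of the configured write buffer size, and to have its own block cache. The class name, method name, and all numbers are hypothetical. Index and filter blocks kept outside the block cache are not counted, which is exactly the part that can grow without limit.

```java
// Rough upper bound on the RocksDB memory that the options above can limit.
// Assumption: each column family has its own memtables and its own block cache.
final class RocksDbMemoryEstimate {

    static long estimateBytes(int columnFamilies, long writeBufferBytes,
                              int maxWriteBuffers, long blockCacheBytes) {
        long memtables = writeBufferBytes * maxWriteBuffers; // per column family
        return columnFamilies * (memtables + blockCacheBytes);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Hypothetical setup: 8 column families, 64 MB write buffers,
        // at most 2 write buffers each, 256 MB block cache each.
        long total = estimateBytes(8, 64 * mb, 2, 256 * mb);
        System.out.println(total / mb + " MB");
    }
}
```

Comparing such an estimate, plus the JVM heap, against the container size gives a first indication of whether the task manager is likely to exceed its allocation.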
>>>>>> 
>>>>>> You can also increase the number of background threads to improve the 
>>>>>> performance of flushes and compactions (via MaxBackgroundFlushes and 
>>>>>> MaxBackgroundCompactions).
>>>>>> 
>>>>>> In YARN clusters, task managers will be killed if their memory 
>>>>>> utilization exceeds the allocation size. Currently Flink does not count 
>>>>>> the memory used by RocksDB in the allocation. We are working on 
>>>>>> fine-grained resource allocation (see FLINK-5131). It may help to avoid 
>>>>>> such problems.
>>>>>> 
>>>>>> Hope this information helps.
>>>>>> 
>>>>>> Regards,
>>>>>> Xiaogang
>>>>>> 
>>>>>> 
>>>>>> ------------------------------------------------------------------
>>>>>> From: Vinay Patil <[hidden email]>
>>>>>> Sent: Friday, February 17, 2017, 21:19
>>>>>> To: user <[hidden email]>
>>>>>> Subject: Re: Checkpointing with RocksDB as statebackend
>>>>>> 
>>>>>> Hi Guys,
>>>>>> 
>>>>>> There seems to be some issue with RocksDB memory utilization.
>>>>>> 
>>>>>> Within a few minutes of the job running, the physical memory usage 
>>>>>> increases by 4-5 GB, and it keeps on increasing.
>>>>>> I have tried different options for Max Buffer Size (30MB, 64MB, 128MB, 
>>>>>> 512MB) and Min Buffer to Merge as 2, but the physical memory keeps on 
>>>>>> increasing.
>>>>>> 
>>>>>> According to RocksDB documentation, these are the main options on which 
>>>>>> flushing to storage is based.
>>>>>> 
>>>>>> Can you please point out what I am doing wrong? I have tried different 
>>>>>> configuration options, but each time the Task Manager gets killed 
>>>>>> after some time :)
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay Patil
>>>>>> 
>>>>>> On Thu, Feb 16, 2017 at 6:02 PM, Vinay Patil <[hidden email]> wrote:
>>>>>> I think it's more related to RocksDB. I am also not familiar with 
>>>>>> RocksDB, but I am reading the tuning guide to understand the important 
>>>>>> values that can be set.
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay Patil
>>>>>> 
>>>>>> On Thu, Feb 16, 2017 at 5:48 PM, Stefan Richter [via Apache Flink User 
>>>>>> Mailing List archive.] <[hidden email]> wrote:
>>>>>> What kind of problem are we talking about? S3 related or RocksDB 
>>>>>> related? I am not aware of problems with RocksDB per se. I think seeing 
>>>>>> logs for this would be very helpful.
>>>>>> 
>>>>>> On 16.02.2017 at 11:56, Aljoscha Krettek <[hidden email]> wrote:
>>>>>> 
>>>>>> [hidden email] and [hidden email] could this be the same problem that 
>>>>>> you recently saw when working with other people?
>>>>>> 
>>>>>> On Wed, 15 Feb 2017 at 17:23 Vinay Patil <[hidden email]> wrote:
>>>>>> Hi Guys,
>>>>>> 
>>>>>> Can anyone please help me with this issue
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay Patil
>>>>>> 
>>>>>> On Wed, Feb 15, 2017 at 6:17 PM, Vinay Patil <[hidden email]> wrote:
>>>>>> Hi Ted,
>>>>>> 
>>>>>> I have 3 boxes in my pipeline: the 1st and 2nd boxes contain a source and 
>>>>>> an S3 sink, and the 3rd box is a window operator followed by chained 
>>>>>> operators and an S3 sink.
>>>>>> 
>>>>>> In the details link section I can see that the S3 sink is taking 
>>>>>> time for the acknowledgement, and it is not even reaching the window 
>>>>>> operator chain.
>>>>>> 
>>>>>> But as shown in the snapshot, checkpoint id 19 did not get any 
>>>>>> acknowledgement. Not sure what is causing the issue.
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay Patil
>>>>>> 
>>>>>> On Wed, Feb 15, 2017 at 5:51 PM, Ted Yu [via Apache Flink User Mailing 
>>>>>> List archive.] <[hidden email]> wrote:
>>>>>> What did the More Details link say ? 
>>>>>> 
>>>>>> Thanks 
>>>>>> 
>>>>>> > On Feb 15, 2017, at 3:11 AM, vinay patil <[hidden email]> wrote: 
>>>>>> > 
>>>>>> > Hi, 
>>>>>> > 
>>>>>> > I have set the checkpointing interval to 6 secs and the minimum pause 
>>>>>> > between checkpoints to 5 secs. While testing the pipeline I have 
>>>>>> > observed that some checkpoints take a long time. As you can see in the 
>>>>>> > attached snapshot, checkpoint id 19 took the maximum time before it 
>>>>>> > failed, although it had not received any acknowledgements. During these 
>>>>>> > 10 minutes the entire pipeline did not make any progress and no data 
>>>>>> > was processed. (For example: in 13 minutes 20M records were processed, 
>>>>>> > and when the checkpoint took time there was no progress for the next 
>>>>>> > 10 minutes.) 
>>>>>> > 
>>>>>> > I have even tried setting the max checkpoint timeout to 3 min, but in 
>>>>>> > that case as well multiple checkpoints failed. 
>>>>>> > 
>>>>>> > I have set RocksDB FLASH_SSD_OPTION. 
>>>>>> > What could be the issue? 
>>>>>> > 
>>>>>> > P.S. I am writing to 3 S3 sinks 
>>>>>> > 
>>>>>> > checkpointing_issue.PNG
>>>>>> > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11640/checkpointing_issue.PNG>
>>>>>> >    
>>>>>> > 
>>>>>> > 
>>>>>> > 
>
