Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread prashantnayak
2336050.n4.nabble.com/S3-recovery-and-checkpoint-directories-exhibit-explosive-growth-tp14270p14479.html Sent from the Apache Flink User Mailing List archive at Nabble.com.

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stefan Richter

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread prashantnayak
welcome as well. Flink job state is critical to us since we have very long running jobs (months) processing hundreds of millions of records. Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stephan Ewen
> > (unless there is a bucket lifecycle policy)... I think you should recommend that Flink users that rely on S3 turn off bucket versioning since it seems to not really be a factor for Flink... Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stefan Richter
> o not really be a factor for Flink... Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread prashantnayak
s to not really be a factor for Flink... Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread prashantnayak
ption ignored) {} } ```

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread Bowen Li
Hi Stephan, Making Flink's S3 integration independent of Hadoop is great. We've been running into a lot of Hadoop configuration trouble when trying to enable Flink checkpointing with S3 on AWS EMR. Is there a concrete plan, or have any tickets been created yet for tracking? Thanks, Bowen On Mon, J

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-24 Thread Stephan Ewen
Hi Prashant! Flink's S3 integration currently goes through Hadoop's S3 file system (as you probably noticed). It seems that Hadoop's S3 file system is not really well suited for what we want to do, and we are looking to drop it and replace it with something direct (independent of Hadoop) in the

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-24 Thread Stephan Ewen
> Appreciate any other insights you might have around this problem. Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-23 Thread prashantnayak
etup shows improvement. Appreciate any other insights you might have around this problem. Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-20 Thread prashantnayak

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-20 Thread prashantnayak
only have the last hour's worth, or perhaps just the completedCheckpoint files? Happy to provide any additional detail you need. Just let me know... Thanks Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-16 Thread SHI Xiaogang
Hi Prashantnayak, Thanks a lot for reporting this problem. Can you provide more details so we can address it? I am guessing the master has to delete too many files when a checkpoint is subsumed, which is very common in our cases. The number of files in the recovery directory will increase if the master canno

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-14 Thread Stephan Ewen
> 're the only ones to see this... or we must be configuring something wrong while testing Flink 1.3.1 Thanks for your help in advance Prashant

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-13 Thread prashantnayak

S3 recovery and checkpoint directories exhibit explosive growth

2017-07-13 Thread Prashant Nayak
We’re using Flink 1.3.1 on Mesos, with HA/recovery stored in S3 using RocksDB with incremental checkpointing. We have enabled external checkpoints (every 30s), retaining the two latest external checkpoints. We are trying to track down something we see happening where the recovery, checkpoint and