Hi,

We recently removed some cleanup code because it had to query store metadata to decide when a directory could be deleted. For certain stores (like S3), requesting this metadata on every file delete was so expensive that it could bring down the job, because state removal could not be processed fast enough. We have a temporary fix in place now, so that jobs at large scale can still run reliably on stores like S3. Currently, this comes at the cost of not cleaning up directories, but we definitely plan to introduce a different mechanism for directory cleanup in the future, one that is less fine-grained than a metadata query per file delete. In the meantime, unfortunately, the best option is to clean up empty directories with some external tool.
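As a rough illustration of such an external tool, here is a minimal Python sketch. It is not part of Flink, and it assumes the checkpoint directory is reachable as an ordinary filesystem path (for example a mounted volume). The function name `remove_empty_dirs` and the `min_age_seconds` parameter are made up for this example.

```python
import os
import time


def remove_empty_dirs(root, min_age_seconds=3600.0):
    """Delete empty directories below root (root itself is always kept).

    Hypothetical helper, not a Flink API. Directories are visited
    bottom-up. A directory modified less than min_age_seconds ago is
    skipped, so a checkpoint that is still being written is left alone.
    Returns the list of removed paths.
    """
    root_abs = os.path.abspath(root)
    removed = []
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if os.path.abspath(dirpath) == root_abs:
            continue  # never delete the checkpoint root itself
        try:
            if os.listdir(dirpath):
                continue  # still contains files or non-empty subdirectories
            age = time.time() - os.stat(dirpath).st_mtime
            if min_age_seconds > 0 and age < min_age_seconds:
                continue  # too recent, may belong to a running checkpoint
            os.rmdir(dirpath)
            removed.append(dirpath)
        except OSError:
            # The directory changed or vanished concurrently; skip it.
            continue
    return removed
```

Removing a child directory updates the modification time of its parent, so with a non-zero `min_age_seconds` a chain of nested empty directories is cleaned up over several runs rather than in one pass. If S3 is accessed directly rather than through a filesystem path, an equivalent tool would have to use the store's own API, which this sketch does not cover.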
Best,
Stefan

> On 20.09.2017 at 01:23, Hao Sun <ha...@zendesk.com> wrote:
>
> Thanks Elias! Seems like there is no better answer than "do not care about
> them now", or delete with a background job.
>
> On Tue, Sep 19, 2017 at 4:11 PM Elias Levy <fearsome.lucid...@gmail.com> wrote:
> There are a couple of related JIRAs:
>
> https://issues.apache.org/jira/browse/FLINK-7587
> https://issues.apache.org/jira/browse/FLINK-7266
>
> On Tue, Sep 19, 2017 at 12:20 PM, Hao Sun <ha...@zendesk.com> wrote:
> Hi, I am using RocksDB and S3 as storage backend for my checkpoints.
> Can flink delete these empty directories automatically? Or I need a
> background job to do the deletion?
>
> I know this has been discussed before, but I could not get a concrete answer
> for it yet. Thanks