Hi,

Maybe you can just list the files in your basePath and filter out those that 
have the in-progress or pending suffixes?
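
A rough sketch of that filtering, to illustrate the idea only. It uses plain java.nio on a local directory as a stand-in; on HDFS you would list through the Hadoop FileSystem API instead. The class name and the two suffix values are my assumptions (I believe they match the BucketingSink defaults, but please check them against your sink configuration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FinishedFileLister {
    // Assumed suffixes; adjust to whatever your sink is configured with.
    static final String IN_PROGRESS_SUFFIX = ".in-progress";
    static final String PENDING_SUFFIX = ".pending";

    /** Returns the regular files under basePath whose names carry neither suffix. */
    public static List<Path> listFinished(Path basePath) throws IOException {
        // Files.walk descends into the bucket directories below basePath.
        try (Stream<Path> files = Files.walk(basePath)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString();
                    return !name.endsWith(IN_PROGRESS_SUFFIX)
                        && !name.endsWith(PENDING_SUFFIX);
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

Keep in mind that such a listing is only a snapshot: a file that is pending now may show up as finished on the next scan.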

I think you could also wrap or implement your own Bucketer and track all the 
paths that it returns. Note that these are bucket directories rather than 
individual files, and some of the files in them might still be pending or in 
progress and will only be committed in the future (or, in case of a crash, 
some of them might be left over and should be discarded).

Another possibility is to copy the code of BucketingSink and track the 
fs.rename calls that move a file to its final path (in:
notifyCheckpointComplete
handlePendingFilesForPreviousCheckpoints
handlePendingInProgressFile)
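
To make the last idea a bit more concrete, here is a minimal sketch of the pattern, not actual BucketingSink code. In your copy of the sink, each of those rename calls would go through a helper that renames first and notifies a listener afterwards. Everything named here (CommitHook, commit, writeMeta, the ".meta" suffix) is hypothetical, and java.nio's Files.move stands in for the Hadoop fs.rename call:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

public class CommitHook {
    /**
     * Stand-in for the rename step: moves the pending file to its final
     * path and only then notifies the listener. Returns the final path.
     */
    public static Path commit(Path pending, Path finalPath,
                              Consumer<Path> onCommitted) throws IOException {
        Files.move(pending, finalPath);   // in the sink this would be fs.rename
        onCommitted.accept(finalPath);    // the listener sees only final paths
        return finalPath;
    }

    /** Hypothetical listener: writes "<name>.meta" next to the final file. */
    public static void writeMeta(Path finalPath) {
        Path meta = finalPath.resolveSibling(finalPath.getFileName() + ".meta");
        try {
            Files.write(meta, finalPath.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Usage would look like CommitHook.commit(pendingPath, finalPath, CommitHook::writeMeta). Since the listener runs only after the rename has succeeded, the meta file never points at an in-progress name.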

Piotrek

> On 20 Oct 2017, at 11:08, Rinat <r.shari...@cleverdata.ru> wrote:
> 
> Hi All !
> 
> I’m trying to create a meta-info file that contains a link to a file created 
> by Flink BucketingSink.
> At first I tried to implement my own 
> org.apache.flink.streaming.connectors.fs.Writer that creates a meta-file on 
> the close method call. 
> But I realized that this is not quite right, because when the writer is 
> closed, the file into which the data was written is still in the in-progress 
> state, and its name will change when it reaches the final state. 
> So creating any meta-info on writer close that links to the in-progress 
> file would lead my system to an inconsistent state.
> 
> I looked through the sources of BucketingSink and have not found an elegant 
> way to subscribe to the event of a data file being moved into its final 
> state.
> Maybe someone has already had the same issue and found an elegant way to 
> solve it?
> 
> Also, maybe someone knows how this could be solved using other Flink 
> tools/components, because I have not been using Flink for long and may not 
> know some of its features.
> 
> Thx.
