Hi Peter,

The troublesome part is knowing when a bucket is "finished" in a streaming
job. In 1.11, we are working on a watermark-based bucket-ending mechanism [1]
in Table/SQL.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table
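
To make that concrete, here is a rough, untested sketch of how the mechanism
in [1] is expected to look from the SQL side. The table name, schema, path,
and format below are made up for illustration, and the option names follow
the current FLIP draft, so they may still change before 1.11 is released:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SuccessFileSketch {
    public static void main(String[] args) {
        // Streaming Table environment (Flink 1.11+).
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Filesystem sink whose partitions are considered "ended" based on
        // the watermark; once a partition is committed, the success-file
        // policy writes an empty _SUCCESS file into it.
        tEnv.executeSql(
                "CREATE TABLE fs_sink (" +
                "  id BIGINT," +
                "  data STRING," +
                "  dt STRING," +
                "  hr STRING" +
                ") PARTITIONED BY (dt, hr) WITH (" +
                "  'connector' = 'filesystem'," +
                "  'path' = 'hdfs:///path/to/output'," +
                "  'format' = 'parquet'," +
                "  'partition.time-extractor.timestamp-pattern' = '$dt $hr:00:00'," +
                "  'sink.partition-commit.trigger' = 'partition-time'," +
                "  'sink.partition-commit.delay' = '1 h'," +
                "  'sink.partition-commit.policy.kind' = 'success-file'" +
                ")");
    }
}

The idea is that once the watermark passes the end of a partition (plus the
configured delay), the partition is committed and the success-file policy
writes an empty _SUCCESS file into it, similar to what Hadoop's
FileOutputCommitter does for your old M/R job.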

Best,
Jingsong Lee

On Tue, May 5, 2020 at 7:40 AM Peter Groesbeck <peter.groesb...@gmail.com>
wrote:

> I am replacing an M/R job with a Streaming job using the StreamingFileSink,
> and there is a requirement to generate an empty _SUCCESS file like the old
> Hadoop job did. I also have to implement a similar Batch job that reads from
> backup files in case of outages or downtime.
>
> The Batch job question was answered here and appears to still be relevant,
> although it would be great if someone could confirm that for me.
> https://stackoverflow.com/a/39413810
>
> The Streaming job question came up back in 2018 here:
>
> http://mail-archives.apache.org/mod_mbox/flink-user/201802.mbox/%3cff74eed5-602f-4eaa-9bc1-6cdf56611...@gmail.com%3E
>
> But the suggested solution of using or extending the BucketingSink class
> seems out of date now that BucketingSink has been deprecated.
>
> Is there a way to implement a similar solution for StreamingFileSink?
>
> I'm currently on 1.8.1, although I hope to upgrade to 1.10 in the near
> future.
>
> Thank you,
> Peter
>


-- 
Best, Jingsong Lee
