Ah, it looks like I need Flink 1.12 for this.  I'm still on 1.11.

On Fri, Feb 5, 2021, 08:37 Dan Hill <quietgol...@gmail.com> wrote:

> Thanks Aljoscha!
>
> On Fri, Feb 5, 2021 at 1:48 AM Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Hi Dan,
>>
>> I'm afraid this is not easily possible using the DataStream API in
>> STREAMING execution mode today. However, there is one possible solution
>> and we're introducing changes that will also make this work in STREAMING
>> mode.
>>
>> The possible solution is to use the `FileSink` instead of the
>> `StreamingFileSink`. This is an updated version of the sink that works
>> in both BATCH and STREAMING mode (see [1]). If you use BATCH execution
>> mode all your files should be "completed" at the end.
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_execution_mode.html
>>
>> The thing we're currently working on is FLIP-147 [2], which will allow
>> sinks (and other operators) to always do one final checkpoint before
>> shutting down. This will allow them to move the last outstanding
>> inprogress files over to finished as well.
>>
>> [2] https://cwiki.apache.org/confluence/x/mw-ZCQ
>>
>> I hope that helps!
>>
>> Best,
>> Aljoscha
>>
>> On 2021/02/04 21:37, Dan Hill wrote:
>> >Hi Flink user group,
>> >
>> >*Background*
>> >I'm changing a Flink SQL job to use the DataStream API.  I'm updating an
>> >existing MiniCluster test in my code.  It has a similar structure to other
>> >tests in flink-tests.  I call StreamExecutionEnvironment.execute.  My tests
>> >sink using StreamingFileSink Bulk Formats to tmp local disk.
>> >
>> >*Issue*
>> >When I try to check the files on local disk, I see
>> >".part-0-0.inprogress.1234abcd-5678-uuid...".
>> >
>> >*Question*
>> >What's the best way to get the test to complete the outputs?  I tried
>> >checkpointing very frequently, sleeping, etc., but these didn't work.
>> >
>> >Thanks!
>> >- Dan
>>
>
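
For anyone hitting the same symptom in a test: the sketch below is not from the thread, just one possible way to make a test fail loudly when the sink leaves in-progress files behind. It uses only the Java standard library. The class and method names (`InProgressFileCheck`, `isInProgress`, `findInProgressFiles`) are hypothetical, and the matching rule is an assumption based solely on the ".part-0-0.inprogress.<uuid>" file name quoted in the original question; other naming configurations may need a different check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical test helper, not part of Flink.
public class InProgressFileCheck {

    // Assumption: an in-progress part file has ".inprogress." in its name,
    // as in the ".part-0-0.inprogress.<uuid>" example from the question.
    static boolean isInProgress(String fileName) {
        return fileName.contains(".inprogress.");
    }

    // Walks the output directory recursively (so bucket subdirectories are
    // included) and returns every regular file that still looks in-progress.
    static List<Path> findInProgressFiles(Path outputDir) throws IOException {
        try (Stream<Path> paths = Files.walk(outputDir)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> isInProgress(p.getFileName().toString()))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a sink output directory with one finished file and one
        // in-progress file, then report what the check finds.
        Path dir = Files.createTempDirectory("sink-output");
        Files.createFile(dir.resolve("part-0-0"));
        Files.createFile(dir.resolve(".part-0-1.inprogress.1234abcd"));
        System.out.println("in-progress files: " + findInProgressFiles(dir));
    }
}
```

In a test, asserting that `findInProgressFiles(outputDir)` is empty after the job finishes would surface the problem directly instead of a later "file not found" when reading the expected output.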
