I changed the test to use ExecutionMode.BATCH in v1.11 and it still doesn't work. How did devs previously write MiniCluster tests for similar code? Or did they not?
On Sat, Feb 6, 2021 at 5:38 PM Dan Hill <quietgol...@gmail.com> wrote:

> Ah looks like I need to use 1.12 for this. I'm still on 1.11.
>
> On Fri, Feb 5, 2021, 08:37 Dan Hill <quietgol...@gmail.com> wrote:
>
>> Thanks Aljoscha!
>>
>> On Fri, Feb 5, 2021 at 1:48 AM Aljoscha Krettek <aljos...@apache.org>
>> wrote:
>>
>>> Hi Dan,
>>>
>>> I'm afraid this is not easily possible using the DataStream API in
>>> STREAMING execution mode today. However, there is one possible solution
>>> and we're introducing changes that will also make this work on STREAMING
>>> mode.
>>>
>>> The possible solution is to use the `FileSink` instead of the
>>> `StreamingFileSink`. This is an updated version of the sink that works
>>> in both BATCH and STREAMING mode (see [1]). If you use BATCH execution
>>> mode all your files should be "completed" at the end.
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_execution_mode.html
>>>
>>> The thing we're currently working on is FLIP-147 [2], which will allow
>>> sinks (and other operators) to always do one final checkpoint before
>>> shutting down. This will allow them to move the last outstanding
>>> inprogress files over to finished as well.
>>>
>>> [2] https://cwiki.apache.org/confluence/x/mw-ZCQ
>>>
>>> I hope that helps!
>>>
>>> Best,
>>> Aljoscha
>>>
>>> On 2021/02/04 21:37, Dan Hill wrote:
>>> >Hi Flink user group,
>>> >
>>> >*Background*
>>> >I'm changing a Flink SQL job to use Datastream. I'm updating an existing
>>> >Minicluster test in my code. It has a similar structure to other tests in
>>> >flink-tests. I call StreamExecutionEnvironment.execute. My tests sink
>>> >using StreamingFileSink Bulk Formats to tmp local disk.
>>> >
>>> >*Issue*
>>> >When I try to check the files on local disk, I see
>>> >".part-0-0.inprogress.1234abcd-5678-uuid...".
>>> >
>>> >*Question*
>>> >What's the best way to get the test to complete the outputs? I tried
>>> >checkpointing very frequently, sleeping, etc but these didn't work.
>>> >
>>> >Thanks!
>>> >- Dan
>>>
>>
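
For anyone reproducing the *Issue* above, the on-disk check can be sketched as a small helper that walks the sink's output directory and reports any part files that were never finalized. This is only an illustrative sketch using the Java standard library, not anything from Flink itself: the class name `PartFileCheck` and the method `findInProgressFiles` are hypothetical, and the `.inprogress.` name pattern is an assumption taken from the file name quoted in the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical test helper; not part of Flink.
final class PartFileCheck {
    private PartFileCheck() {}

    /**
     * Recursively walks outputDir (buckets may be subdirectories) and returns
     * every regular file whose name contains ".inprogress.".
     * Assumption: unfinished part files follow the naming seen in the thread,
     * e.g. ".part-0-0.inprogress.<uuid>".
     */
    static List<Path> findInProgressFiles(Path outputDir) throws IOException {
        try (Stream<Path> files = Files.walk(outputDir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().contains(".inprogress."))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

In a MiniCluster test this would presumably be called after StreamExecutionEnvironment.execute returns, failing the test if the returned list is non-empty. With StreamingFileSink in STREAMING mode the list is expected to be non-empty, per Aljoscha's explanation above.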