You have to use StreamExecutionEnvironment#createFileInput for
implementing CheckpointableInputFormat to have any effect. This
internally results in it being used by the MonitoringFileSource.
If you use StreamExecutionEnvironment#createInput nothing will be
checkpointed for the source; and yes this usually means having to
restart the entire job if an error occurs.
Checkpoints/savepoints cannot be taken if any task is no longer running,
see FLINK-2491.
On 03/10/2019 06:38, Lu Niu wrote:
Hi, Fabian
Thanks for replying!
I implemented a Custom RichInputFormat
implementing CheckpointableInputFormat. And I found it is executed
through InputFormatSourceFunction, which doesn't
use CheckpointableInputFormat during execution. If so, how does
checkpoint work here?
I also notice when one task finished, I cannot trigger savepoint
anymore. It throws exception "Not all tasks are running". Does that
imply no savepoint/checkpoint can be taken once any task finish?
Best
Lu
On Fri, Sep 6, 2019 at 6:33 AM Fabian Hueske <fhue...@gmail.com
<mailto:fhue...@gmail.com>> wrote:
Hi,
CheckpointableInputFormat is only relevant if you plan to use the
InputFormat in a MonitoringFileSource, i.e., in a streaming
application.
If you plan to use it in a DataSet (batch) program, InputFormat is
fine.
Btw. the latest release Flink 1.9.0 has major improvements for the
recovery of batch jobs.
Best, Fabian
Am Do., 5. Sept. 2019 um 19:01 Uhr schrieb Lu Niu
<qqib...@gmail.com <mailto:qqib...@gmail.com>>:
Hi, Team
I am implementing a custom InputFormat. Shall I
implement CheckpointableInputFormat interface? If I don't,
does that mean the whole job has to restart given only one
task fails? I ask because I found all InputFormat
implements CheckpointableInputFormat, which makes me confused.
Thank you!
Best
Lu