Hi,

At the moment, if the processing of any input split fails, Flink restarts the batch job completely from scratch.
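To make the consequence concrete: nothing records which splits already finished, so a failover re-reads every split from the beginning. A minimal, self-contained sketch of that behaviour (plain Java, not the actual Flink or elasticsearch-hadoop API; the class and method names here are made up for illustration):

```java
// Sketch of "restart from scratch": on failure, ALL splits are reprocessed,
// because no record of finished splits survives the restart.
import java.util.ArrayList;
import java.util.List;

public class RestartSketch {
    static List<Integer> processed = new ArrayList<>();

    // Process splits 0..numSplits-1; simulate a failure while handling split failAt.
    static void runJob(int numSplits, int failAt) {
        for (int split = 0; split < numSplits; split++) {
            processed.add(split);                  // work done for this split
            if (split == failAt) {
                throw new RuntimeException("split " + split + " failed");
            }
        }
    }

    public static void main(String[] args) {
        int numSplits = 20;                        // e.g. 20 ES shards -> 20 splits
        try {
            runJob(numSplits, 9);                  // fails partway through the job
        } catch (RuntimeException e) {
            processed.clear();                     // restart discards all earlier progress
            runJob(numSplits, -1);                 // second attempt re-reads every split
        }
        System.out.println(processed.size());      // prints 20: all splits read again
    }
}
```

So even if half of the 20 splits had finished before the failure, all 20 are read again on the next attempt.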
There is an ongoing effort to improve fine-grained recovery in FLINK-4256.

Best,
Andrey

> On 2 Oct 2018, at 13:52, aviad <rotem.av...@gmail.com> wrote:
>
> Hi,
>
> I want to write a batch job which reads data from *elasticsearch* using
> *elasticsearch-hadoop* (https://github.com/elastic/elasticsearch-hadoop/)
> and *HadoopInputFormat*.
>
> Example code (from
> https://github.com/genged/flink-playground/blob/master/src/main/java/com/mic/flink/FlinkMain.java):
>
> elasticsearch-hadoop creates one Hadoop InputSplit (task) per Elasticsearch
> shard, so if my index has 20 shards, it will be split into 20 InputSplits.
>
> /My question is:/
> What will happen if my job restarts (failover) after finishing half of the
> InputSplits? Does HadoopInputFormat remember which InputSplits are finished
> and know how to continue from where it stopped (maybe by reading from the
> beginning of an unfinished InputSplit?), or does it start over from the
> beginning?
>
> Thanks
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/