flink checkpoint state data corruption

2019-11-13 Thread Jeffrey Martin
Hi all, I'm using protobufs as keys of a Flink stream using code copied from this pull request , but deserialization is failing after checkpoint restore due to missing data. I'm using HDFS and the RocksDB backend. I tried providing the path to a previous

Re: non-deserializable root cause in DeclineCheckpoint

2019-09-22 Thread Jeffrey Martin
d to seeing your pr soon :) > > Best, > Terry Wang > > > > > On Sep 23, 2019, at 11:48 AM, Jeffrey Martin wrote: > > > > Hi Terry, > > > > KafkaException comes in through the job's dependencies (it's defined in > the > > kafka-clients jar packed up in

Re: non-deserializable root cause in DeclineCheckpoint

2019-09-22 Thread Jeffrey Martin
> it’s impossible to check > the possibility of serialization in advance. > Do I understand right? > > > > Best, > Terry Wang > > > > > On Sep 23, 2019, at 5:17 AM, Jeffrey Martin wrote: > > > > Thanks for suggestion, Terry. I've investigated a bit fur

Re: non-deserializable root cause in DeclineCheckpoint

2019-09-22 Thread Jeffrey Martin
er changed > in flink 1.9 since you mentioned that your program run normally in flink > 1.8. > Just a thought from me. > Hope to be useful~ > > Best, > Terry Wang > > > > > On Sep 21, 2019, at 2:58 AM, Jeffrey Martin wrote: > > > > JIRA ticket: https://issues.ap

non-deserializable root cause in DeclineCheckpoint

2019-09-21 Thread Jeffrey Martin
JIRA ticket: https://issues.apache.org/jira/browse/FLINK-14076 I'm on Flink v1.9 with the Kafka connector and a standalone JM. If FlinkKafkaProducer fails while checkpointing, it throws a KafkaException which gets wrapped in a CheckpointException which is sent to the JM as a DeclineCheckpoint. Ka
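
The failure mode named in the thread title (a root cause that the receiver cannot deserialize) can be illustrated with plain Java serialization, independent of Flink. The sketch below is not taken from the thread and does not claim to be either of the options discussed there. JobOnlyException is a hypothetical stand-in for an exception class that exists only in the job's jar, and FlattenedException is a hypothetical wrapper that copies the cause's class name, message, and stack trace into JDK-only types, so a receiver without the job's classes could still read the payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class FlattenedCauseSketch {

    /** Hypothetical stand-in for an exception defined only in the job's dependencies. */
    public static class JobOnlyException extends RuntimeException {
        public JobOnlyException(String message) {
            super(message);
        }
    }

    /** Hypothetical wrapper: keeps class name, message and stack trace, but not the original object. */
    public static class FlattenedException extends Exception {
        private final String originalClassName;

        public FlattenedException(Throwable original) {
            super(original.getClass().getName() + ": " + original.getMessage());
            this.originalClassName = original.getClass().getName();
            setStackTrace(original.getStackTrace());
            if (original.getCause() != null) {
                // Flatten the whole cause chain so no job-only class is serialized.
                initCause(new FlattenedException(original.getCause()));
            }
        }

        public String getOriginalClassName() {
            return originalClassName;
        }
    }

    /** Serializes and deserializes a throwable, as a sender/receiver pair would. */
    public static Throwable roundTrip(Throwable t) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(t);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Throwable) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable failure = new JobOnlyException("send failed");
        // The outer exception plays the role of the wrapper that is sent to the receiver.
        Exception decline = new Exception("checkpoint declined", new FlattenedException(failure));
        Throwable received = roundTrip(decline);
        System.out.println(received.getMessage());
        System.out.println(received.getCause().getMessage());
    }
}
```

In this sketch both sides run in one JVM, so the round trip would succeed even without flattening. The point is only that the serialized payload refers to the original class by name as a string rather than as a class the receiver must load.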

Re: [FLINK-14076] non-deserializable root cause in DeclineCheckpoint

2019-09-20 Thread Jeffrey Martin
To be clear -- I'm happy to make a PR for either option below. (Either is <10 lines diff.) It's just the contributor guidelines said to get consensus first and then only make a PR if I'm assigned to do the work. On Fri, Sep 20, 2019 at 12:23 PM Jeffrey Martin wrote: > (po

[FLINK-14076] non-deserializable root cause in DeclineCheckpoint

2019-09-20 Thread Jeffrey Martin
(possible dupe; I wasn't subscribed before and the previous message didn't seem to go through) I'm on Flink v1.9 with the Kafka connector and a standalone JM. If FlinkKafkaProducer fails while checkpointing, it throws a KafkaException which gets wrapped in a CheckpointException which is sent to th

[jira] [Created] (FLINK-14076) 'ClassNotFoundException: KafkaException' on Flink v1.9 w/ checkpointing

2019-09-14 Thread Jeffrey Martin (Jira)
Jeffrey Martin created FLINK-14076:

Summary: 'ClassNotFoundException: KafkaException' on Flink v1.9 w/ checkpointing
Key: FLINK-14076
URL: https://issues.apache.org/jira/browse/FLINK-14076