Hi Dawid,
After rewriting dashcode with Objects.hash for all the fields, I still get the
same error. One thing special is checkpoints always fail at 428, after trying
many times. Does it mean anything?
> On Aug 10, 2017, at 9:14 AM, Dawid Wysakowicz <wysakowicz.da...@gmail.com>
> wrote:
>
> Yes, with the information I have, the conclusion would be the same, that I
> think the reason is problem with hashcode. Without some data to reproduce it
> unfortunately I won’t be able to help you further. I could just advise you to
> debug the method SharedBuffer#serialize and pay attention to the entryID map.
>
>> On 10 Aug 2017, at 14:54, Daiqing Li <lidaiqing1...@gmail.com> wrote:
>>
>> Oh sorry, the data in {} is not empty because I hide private information
>> about my model. Do you have that same conclusion?
>>> On Aug 10, 2017, at 8:52 AM, Dawid Wysakowicz <wysakowicz.da...@gmail.com>
>>> wrote:
>>>
>>> You are right, I won’t be able to reproduce this problem without data. One
>>> thing I can tell though that I think the problem is indeed with the
>>> hashcode. Unforunately I don’t know Gson, but one strange thing I noticed
>>> is the exception message: SharedBufferEntry(ValueTimeWrapper({},
>>> 1502298303586, 0), [SharedBufferEdge(null, 1)], 1). The first {} is your
>>> event.toString, which seems odd as if your event was empty.
>>>
>>> Generally speaking as I understand this Exception is thrown because the
>>> hashcode of your event changes during serialization, and access to some
>>> internal temporary cache is broken.
>>>
>>>> On 10 Aug 2017, at 14:29, Daiqing Li <lidaiqing1...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Here is the code. But I am not sure if you can reproduce the problem
>>>> without data source.
>>>>
>>>> Best,
>>>> Daiqing
>>>>
>>>> On Thu, Aug 10, 2017 at 8:15 AM, Dawid Wysakowicz
>>>> <wysakowicz.da...@gmail.com> wrote:
>>>> As @Kostas asked in your previous thread would be possible for you to
>>>> share your code for that job or at least a minimal example to reproduce
>>>> this behaviour. I fear we won’t be able to help you without any further
>>>> info.
>>>>
>>>> Regards,
>>>> Dawid
>>>>
>>>>> On 10 Aug 2017, at 14:10, Daiqing Li <lidaiqing1...@gmail.com> wrote:
>>>>>
>>>>> Hi Flink user,
>>>>>
>>>>> I am using FsStateBackend and AWS EMR YARN to run a CEP job. I got this
>>>>> exception after running for a while. Could anyone give me some help to
>>>>> debug this? I try parallelism 1, and it has the same problem. I also try
>>>>> reimplemented hashcode and equals method. I use UUID as hashcode right
>>>>> now.
>>>>> 2017-08-09 18:15:04,572 INFO
>>>>> org.apache.flink.runtime.executiongraph.ExecutionGraph -
>>>>> KeyedCEPPatternOperator -> Map -> Sink: Unnamed (3/4)
>>>>> (d4749a4c3469732a2a5edf40b83f88
>>>>> d4) switched from RUNNING to FAILED.
>>>>> AsynchronousException{java.
>>>>> lang.Exception: Could not materialize checkpoint 946 for operator
>>>>> KeyedCEPPatternOperator -> Map -> Sink: Unnamed (3/4).}
>>>>> at org.apache.flink.streaming.
>>>>> runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(
>>>>> StreamTask.java:970)
>>>>> at java.util.concurrent.
>>>>> Executors$RunnableAdapter.
>>>>> call(Executors.java:511)
>>>>> at java.util.concurrent.
>>>>> FutureTask.run(FutureTask.
>>>>> java:266)
>>>>> at java.util.concurrent.
>>>>> ThreadPoolExecutor.runWorker(
>>>>> ThreadPoolExecutor.java:1149)
>>>>> at java.util.concurrent.
>>>>> ThreadPoolExecutor$Worker.run(
>>>>> ThreadPoolExecutor.java:624)
>>>>> at java.lang.Thread.run(Thread.
>>>>> java:748)
>>>>> Caused by: java.lang.Exception: Could not materialize checkpoint 946 for
>>>>> operator KeyedCEPPatternOperator -> Map -> Sink: Unnamed (3/4).
>>>>> ... 6 more
>>>>> Caused by: java.util.concurrent.
>>>>> ExecutionException: java.lang.IllegalStateException: Could not find id
>>>>> for entry: SharedBufferEntry(
>>>>> ValueTimeWrapper({}, 1502298303586, 0), [SharedBufferEdge(null, 1)], 1)
>>>>> at java.util.concurrent.
>>>>> FutureTask.report(FutureTask.
>>>>> java:122)
>>>>> at java.util.concurrent.
>>>>> FutureTask.get(FutureTask.
>>>>> java:192)
>>>>> at org.apache.flink.util.
>>>>> FutureUtil.runIfNotDoneAndGet(
>>>>> FutureUtil.java:43)
>>>>> at org.apache.flink.streaming.
>>>>> runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(
>>>>> StreamTask.java:897)
>>>>> ... 5 more
>>>>> Suppressed: java.lang.Exception: Could not properly cancel managed
>>>>> keyed state future.
>>>>> at org.apache.flink.streaming.
>>>>> api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:
>>>>> 90)
>>>>> at org.apache.flink.streaming.
>>>>> runtime.tasks.StreamTask$AsyncCheckpointRunnable.
>>>>> cleanup(StreamTask.java:1023)
>>>>> at org.apache.flink.streaming.
>>>>> runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(
>>>>> StreamTask.java:961)
>>>>> ... 5 more
>>>>>
>>>>
>>>>
>>>> <MilestoneEvent.java><example.java>
>>>
>>
>