Hello Timo,
thanks for the response.

We are still investigating the production system, but in test we now get
this exception, which seems closely related to FLINK-6291.


java.lang.Exception: Could not perform checkpoint 13468 for operator Aggregator -> Sink: HBase (1/1).
        at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:552)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.notifyCheckpoint(BarrierBuffer.java:378)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:281)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:183)
        at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:277)
        at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:91)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not complete snapshot 13468 for operator Aggregator -> Sink: HBase (1/1).
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:407)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1162)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1094)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:654)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:590)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:543)
        ... 8 more
Caused by: java.lang.Exception: Could not write timer service of Aggregator -> Sink: HBase (1/1) to checkpoint state stream.
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:438)
        at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:98)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:385)
        ... 13 more
Caused by: java.lang.NullPointerException
        at org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(HeapInternalTimerService.java:304)
        at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.snapshotStateForKeyGroup(InternalTimeServiceManager.java:121)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:434)
        ... 15 more



On Fri, May 25, 2018 at 3:11 PM Timo Walther <twal...@apache.org> wrote:

> Hi Alberto,
>
> do you get exactly the same exception? Maybe you can share some logs
> with us?
>
> Regards,
> Timo
>
> On 25.05.18 at 13:41, Alberto Mancini wrote:
> > Hello,
> > I think we are experiencing this issue:
> > https://issues.apache.org/jira/browse/FLINK-6291
> >
> > In fact we have a long running job that is unable to complete a
> > checkpoint and so we are unable to create a savepoint.
> >
> > I do not really understand from FLINK-6291 how the timer service was
> > removed in my job and, above all, I cannot find a way to get my job
> > to create a savepoint.
> > We are using Flink 1.3.2.
> >
> > Thanks,
> >    Alberto.
> >
>
>
