[ 
https://issues.apache.org/jira/browse/FLINK-20654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265862#comment-17265862
 ] 

Huang Xingbo commented on FLINK-20654:
--------------------------------------

The test failed again with "ArithmeticException: integer overflow". 
Should we reopen this issue?
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12083&view=logs&j=34f41360-6c0d-54d3-11a1-0292a2def1d9&t=2d56e022-1ace-542f-bf1a-b37dd63243f2]
{code:java}
2021-01-15T03:39:12.3771952Z Caused by: java.lang.ArithmeticException: integer overflow
2021-01-15T03:39:12.3772559Z    at java.lang.Math.toIntExact(Math.java:1011)
2021-01-15T03:39:12.3773309Z    at java.lang.StrictMath.toIntExact(StrictMath.java:813)
2021-01-15T03:39:12.3774293Z    at org.apache.flink.test.checkpointing.UnalignedCheckpointITCase$CountingMapFunction.flatMap(UnalignedCheckpointITCase.java:488)
2021-01-15T03:39:12.3775410Z    at org.apache.flink.test.checkpointing.UnalignedCheckpointITCase$CountingMapFunction.flatMap(UnalignedCheckpointITCase.java:474)
2021-01-15T03:39:12.3776427Z    at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:47)
2021-01-15T03:39:12.3777379Z    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:191)
2021-01-15T03:39:12.3778351Z    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:205)
2021-01-15T03:39:12.3779297Z    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:175)
2021-01-15T03:39:12.3780231Z    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
2021-01-15T03:39:12.3781114Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:399)
2021-01-15T03:39:12.3782078Z    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:190)
2021-01-15T03:39:12.3783079Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:608)
2021-01-15T03:39:12.3783947Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:572)
2021-01-15T03:39:12.3784646Z    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:749)
2021-01-15T03:39:12.3785361Z    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:564)
2021-01-15T03:39:12.3785993Z    at java.lang.Thread.run(Thread.java:748)
{code}

> Unaligned checkpoint recovery may lead to corrupted data stream
> ---------------------------------------------------------------
>
>                 Key: FLINK-20654
>                 URL: https://issues.apache.org/jira/browse/FLINK-20654
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.12.0
>            Reporter: Arvid Heise
>            Assignee: Roman Khachatryan
>            Priority: Blocker
>              Labels: pull-request-available, test-stability
>             Fix For: 1.13.0, 1.12.1
>
>
> The fix for FLINK-20433 exposes potential data corruption after recovery in 
> all variations of UnalignedCheckpointITCase.
> To reproduce, run UnalignedCheckpointITCase a couple hundred times. The issue 
> showed up for me in the following variations, listed in order of decreasing 
> frequency:
> - execute [Parallel union, p = 5]
> - execute [Parallel union, p = 10]
> - execute [Parallel cogroup, p = 5]
> - execute [parallel pipeline with remote channels, p = 5]
> The issue manifests in one of the following ways:
> - stream corrupted exception
> - EOF exception
> - assertion failure in NUM_LOST or NUM_OUT_OF_ORDER
> - (for union) ArithmeticException: integer overflow (because a number that 
> should be in [0;100000] has been mis-deserialized)
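
For readers unfamiliar with the last symptom, here is a minimal sketch of how a mis-deserialized long can produce the "integer overflow" in the stack trace above. It is an illustration under stated assumptions, not the test's actual code: the class name, the method name, and the sample values are hypothetical. Only StrictMath.toIntExact is taken from the trace.

```java
// Hypothetical sketch; the class and method names are not from the Flink code base.
final class ToIntExactSketch {

    /** Returns true if narrowing the value to int throws ArithmeticException. */
    static boolean overflowsInt(long value) {
        try {
            // Same JDK call that appears in the stack trace.
            StrictMath.toIntExact(value);
            return false;
        } catch (ArithmeticException e) {
            // Thrown with the message "integer overflow" when the value does not fit in an int.
            return true;
        }
    }

    public static void main(String[] args) {
        // A value inside the expected range [0;100000] narrows without error.
        long wellFormed = 42L;
        // Assumed stand-in for a mis-deserialized value: any long outside the int range.
        long corrupted = 1L << 40;
        System.out.println("well-formed overflows: " + overflowsInt(wellFormed));
        System.out.println("corrupted overflows: " + overflowsInt(corrupted));
    }
}
```

A corrupted record only triggers this exception if its bytes happen to decode to a value outside the int range, which may be one reason the symptom shows up intermittently alongside the other failure modes.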



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
