[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

Andrey (JIRA) Tue, 18 Apr 2017 07:14:30 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972750#comment-15972750
 ]


Andrey commented on FLINK-4973:
-------------------------------

Proper handling InterruptedException will make it possible to clean and fast 
thread pool shutdown. See except from the link:
{code}
The standard thread pool (ThreadPoolExecutor) worker thread implementation is 
responsive to interruption, so interrupting a task running in a thread pool may 
have the effect of both canceling the task and notifying the execution thread 
that the thread pool is shutting down. If the task were to swallow the 
interrupt request, the worker thread might not learn that an interrupt was 
requested, which could delay the application or service shutdown.
{code}
So it might indirectly affect shutdown sequence. I agree it won't fix the root 
cause, but could make life a bit easier :)

> Flakey Yarn tests due to recently added latency marker
> ------------------------------------------------------
>
>                 Key: FLINK-4973
>                 URL: https://issues.apache.org/jira/browse/FLINK-4973
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 1.2.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency marker on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The occurring exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task       
>               - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task       
>               - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager           
>               - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>       at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>       at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>       at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>       at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>       at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>       at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>       at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>       at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>       at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>       ... 9 more
> {code}
> This exception is clearly related to the shutdown of a stream operator and 
> does not indicate a wrong behaviour. Since the yarn tests simply scan the log 
> for some keywords (including exception) such a case can make them fail.
> Best if we could make sure that the {{LatencyMarksEmitter}} would only emit 
> latency marker if the {{Output}} would still be active. But we could also 
> simply not log exceptions which occurred after the stream operator has been 
> stopped.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/171578846/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

Reply via email to