[ 
https://issues.apache.org/jira/browse/FLINK-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-13275:
-----------------------------------
    Description: 
Travis run https://api.travis-ci.org/v3/job/558897776/log.txt with the race 
condition.

The race condition consists of the following series of steps:

1) the `finishTask` of the source puts the {{POISON_PILL}} in the {{mailbox}}. 
This will be put as first letter in the queue because we call 
{{sendPriorityLetter()}} in the {{MailboxProcessor}}. 

2) then in the {{cancel()}} of the source (called by the {{cancelTask()}} leads 
the source out of the run-loop which throws the exception in the test source. 
The latter, calls again {{sendPriorityLetter()}} for the exception, which means 
that this exception letter may override the previously sent {{POISON_PILL}} 
(because it jumps the line), 

3) if the {{POISON_PILL}} has already been executed, we are good, if not, then 
the test harness sets the exception in the 
{{StreamTaskTestHarness.TaskThread.error}}.

To fix that, I would suggest to follow the same strategy for the root problem 
https://issues.apache.org/jira/browse/FLINK-13124 as in release-1.9.

  was:
Travis run https://api.travis-ci.org/v3/job/558897776/log.txt

The race condition consists of the following series of steps:

1) the `finishTask` of the source puts the {{POISON_PILL}} in the {{mailbox}}. 
This will be put as first letter in the queue because we call 
{{sendPriorityLetter()}} in the {{MailboxProcessor}}. 

2) then in the {{cancel()}} of the source (called by the {{cancelTask()}} leads 
the source out of the run-loop which throws the exception in the test source. 
The latter, calls again {{sendPriorityLetter()}} for the exception, which means 
that this exception letter may override the previously sent {{POISON_PILL}} 
(because it jumps the line), 

3) if the {{POISON_PILL}} has already been executed, we are good, if not, then 
the test harness sets the exception in the 
{{StreamTaskTestHarness.TaskThread.error}}.

To fix that, I would suggest to follow the same strategy for the root problem 
https://issues.apache.org/jira/browse/FLINK-13124 as in release-1.9.


> Race condition in SourceStreamTaskTest.finishingIgnoresExceptions()
> -------------------------------------------------------------------
>
>                 Key: FLINK-13275
>                 URL: https://issues.apache.org/jira/browse/FLINK-13275
>             Project: Flink
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 1.10.0
>            Reporter: Kostas Kloudas
>            Assignee: Kostas Kloudas
>            Priority: Major
>
> Travis run https://api.travis-ci.org/v3/job/558897776/log.txt with the race 
> condition.
> The race condition consists of the following series of steps:
> 1) the `finishTask` of the source puts the {{POISON_PILL}} in the 
> {{mailbox}}. This will be put as first letter in the queue because we call 
> {{sendPriorityLetter()}} in the {{MailboxProcessor}}. 
> 2) then in the {{cancel()}} of the source (called by the {{cancelTask()}} 
> leads the source out of the run-loop which throws the exception in the test 
> source. The latter, calls again {{sendPriorityLetter()}} for the exception, 
> which means that this exception letter may override the previously sent 
> {{POISON_PILL}} (because it jumps the line), 
> 3) if the {{POISON_PILL}} has already been executed, we are good, if not, 
> then the test harness sets the exception in the 
> {{StreamTaskTestHarness.TaskThread.error}}.
> To fix that, I would suggest to follow the same strategy for the root problem 
> https://issues.apache.org/jira/browse/FLINK-13124 as in release-1.9.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to