[ 
https://issues.apache.org/jira/browse/FLINK-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240153#comment-16240153
 ] 

Stefan Richter edited comment on FLINK-7880 at 11/6/17 11:20 AM:
-----------------------------------------------------------------

Yes, but the test seems to assume that waiting for {{CancellationSuccess}} 
includes a successful cleanup, or its author was simply not aware how important 
proper cleanup is with native resources. In any case, I think this problem 
probably originates from using other IT cases as blueprints, and I have seen 
different patterns for this "wait until the job is gone" problem in different 
tests. Many of them might be similar to this one, but they often look correct 
or cause no trouble as long as no native libraries are involved (e.g. the test 
only uses the heap backend). I would suggest that there should be one and only 
one simple, idiomatic way (maybe a single helper class) of waiting for a job to 
go away and release all its resources, used throughout all tests that actually 
want this behaviour. Otherwise, for example, extending an existing test to 
include a different backend can suddenly uncover the improper cleanup and make 
tests randomly fail with a JVM crash. 
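
To illustrate the idea, here is a minimal sketch of what such a shared helper 
might look like. The class {{JobTermination}}, the method {{awaitJobGone}}, and 
the {{jobIsGone}} predicate are hypothetical names and not existing Flink APIs; 
each test would have to supply a predicate that only returns true once the job 
has terminated and released its resources.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper (not part of Flink): one shared way to wait for a job to go away. */
public final class JobTermination {

    private JobTermination() {}

    /**
     * Polls the caller-supplied predicate until it reports that the job is gone.
     * The predicate is an assumption of this sketch: it must only return true
     * after the job has terminated AND its resources have been released.
     *
     * @throws TimeoutException if the predicate is still false when the timeout expires
     */
    public static void awaitJobGone(
            BooleanSupplier jobIsGone, Duration timeout, Duration pollInterval)
            throws InterruptedException, TimeoutException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (!jobIsGone.getAsBoolean()) {
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Job did not terminate within " + timeout);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }
}
```

A test would call this once after cancelling the job and before shutting down 
the cluster, instead of hand-rolling its own wait loop.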

Having a clear way to end IT cases could help to avoid chasing serious-looking 
but misleading test failures that seem to originate from the RocksDB backend 
code, but are actually test problems caused by improper cleanup. What do you 
think?


> flink-queryable-state-java fails with core-dump
> -----------------------------------------------
>
>                 Key: FLINK-7880
>                 URL: https://issues.apache.org/jira/browse/FLINK-7880
>             Project: Flink
>          Issue Type: Bug
>          Components: Queryable State, Tests
>    Affects Versions: 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Kostas Kloudas
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.4.0
>
>
> The {{flink-queryable-state-java}} module fails on Travis with a core dump.
> https://travis-ci.org/tillrohrmann/flink/jobs/289949829



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
