I tried to replace default no-op handler by handler stopping node and failing the test.
I've returned the no-op handler in many classes because critical situations are expected behavior. But PR still have a lot of failed tests and suites. In some tests, I can't understand a failure reason. I'm not finished to check failures, but after several RunAll runs, I see new flaky tests (1 or 2 fails) appeared because of new handler. I think we should keep no-op handler as default, but add new handler for a few classes, where critical situations aren't expected. пт, 21 сент. 2018 г. в 17:03, Andrey Kuznetsov <stku...@gmail.com>: > > Thanks to all for participating the discussion. > > I've updated [1]: now it requires new handler from [2] for completion. > > [1] https://issues.apache.org/jira/browse/IGNITE-9660 > [2] https://issues.apache.org/jira/browse/IGNITE-8227 > > чт, 20 сент. 2018 г. в 21:56, Vladimir Ozerov <voze...@gridgain.com>: > > > Stop node handler is not very good choice. Some test will continue work as > > usual even if some node failed. E.g. SQL queries with backups may continue > > function in some cases, especially if these are test with REPLICATED cache. > > > > New test-scope handler looks like a better candidate to me. > > > > чт, 20 сент. 2018 г. в 21:22, Andrey Kuznetsov <stku...@gmail.com>: > > > > > I meant the first comment in [1]. We are to decide first whether we'll do > > > it or not. > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-8227 > > > < > > > > > https://issues.apache.org/jira/browse/IGNITE-8227?focusedCommentId=16435298&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16435298 > > > > > > > > > > чт, 20 сент. 2018 г. в 21:18, Dmitriy Pavlov <dpavlov....@gmail.com>: > > > > > > > Sorry, incomplete message. > > > > > > > > Why do you think there is no consensus? > > > > > > > > I have no clue what can be a reason for another approach. > > > > By default failure handler should fail all test. > > > > > > > > Failure handlers test will be always a minority of tests, so fail > > handler > > > > call is something abnormal. > > > > > > > > чт, 20 сент. 2018 г. в 21:15, Dmitriy Pavlov <dpavlov....@gmail.com>: > > > > > > > > > Why do you think there is no consensus? > > > > > > > > > > I have no clue that by default failure handler should fail all test. > > > > > > > > > > чт, 20 сент. 2018 г. в 21:10, Andrey Kuznetsov <stku...@gmail.com>: > > > > > > > > > >> I've created [1] to address this. > > > > >> > > > > >> Dmitriy, I like your idea of creating special test-scope handler. > > But > > > > >> there > > > > >> is no consensus about it, so I don't want to rely on that potential > > > > >> handler > > > > >> right now. We can switch to it later, of course. > > > > >> > > > > >> [1] https://issues.apache.org/jira/browse/IGNITE-9660 > > > > >> > > > > >> чт, 20 сент. 2018 г. в 20:03, Maxim Muzafarov <maxmu...@gmail.com>: > > > > >> > > > > >> > Andrey, > > > > >> > > > > > >> > I like your idea. > > > > >> > > > > > >> > After changing the default node failure handler to the new one we > > > > should > > > > >> > carefully review the whole new test failures. For instance, > > calling > > > > this > > > > >> > method in tests should not lead test to the node being stopped: > > > > >> > > > > > >> > <strong>FOR TEST ONLY!!!</strong> > > > > >> > TcpDiscoverySpi#simulateNodeFailure > > > > >> > > > > > >> > BTW, I would like to remove this method at all from production > > code. > > > > >> > > > > > >> > On Thu, 20 Sep 2018 at 19:43 Dmitriy Pavlov < > > dpavlov....@gmail.com> > > > > >> wrote: > > > > >> > > > > > >> > > But the totally ideal situation would be finding a way to fail > > the > > > > >> test > > > > >> > by > > > > >> > > default, not only stopping a node. > > > > >> > > > > > > >> > > Some time ago I've created > > > > >> > > https://issues.apache.org/jira/browse/IGNITE-8227 to > > > > >> > > find out a way to do so. > > > > >> > > > > > > >> > > чт, 20 сент. 2018 г. в 19:40, Dmitriy Pavlov < > > > dpavlov....@gmail.com > > > > >: > > > > >> > > > > > > >> > > > ++1 > > > > >> > > > > > > > >> > > > чт, 20 сент. 2018 г. в 19:39, Andrey Kuznetsov < > > > stku...@gmail.com > > > > >: > > > > >> > > > > > > > >> > > >> Igniters, > > > > >> > > >> > > > > >> > > >> While running tests I see a lot of ignored critical failures > > > > >> caused by > > > > >> > > >> {{NoOpFailureHandler}} that we use by default. In some tests, > > > of > > > > >> > cource, > > > > >> > > >> critical failures are the part of normal workflow, but in the > > > > >> majority > > > > >> > > of > > > > >> > > >> tests they indicate bugs. By using {{NoOpFailureHandler}} we > > > just > > > > >> hide > > > > >> > > >> these bugs from ourselves. > > > > >> > > >> > > > > >> > > >> What do you think about changing default handler to > > > > >> > > >> {{StopNodeFailureHandler}} at {{GridAbstractTest}} level? > > This > > > > >> could > > > > >> > be > > > > >> > > >> overridden in subclasses. > > > > >> > > >> > > > > >> > > >> -- > > > > >> > > >> Best regards, > > > > >> > > >> Andrey Kuznetsov. > > > > >> > > >> > > > > >> > > > > > > > >> > > > > > > >> > -- > > > > >> > -- > > > > >> > Maxim Muzafarov > > > > >> > > > > > >> > > > > >> > > > > >> -- > > > > >> Best regards, > > > > >> Andrey Kuznetsov. > > > > >> > > > > > > > > > > > > > > > > > > -- > > > Best regards, > > > Andrey Kuznetsov. > > > > > > > > -- > Best regards, > Andrey Kuznetsov.