Re: Stop nodes after test by default - IGNITE-6842

Dmitry Pavlov Mon, 19 Mar 2018 05:23:20 -0700

I agree it is important, I'm going to do a review, but do not have time
slot to do.


Who could pick up this review?

Dmitriy G., could I ask you?

пн, 19 мар. 2018 г. в 15:13, Maxim Muzafarov <maxmu...@gmail.com>:

> Dmitry and other igniters,
>
> Will you have time to review this issue?
> I've preperated PR and TC for this, also I've fixed all comments made by
> Andrey Kuznetsov and Vyacheslav Daradur.
>
> I think this is important issue and will make test framework more stable
> and clear.
>
>
> TC: https://ci.ignite.apache.org/viewLog.html?buildId=1138151
> JIRA: https://issues.apache.org/jira/browse/IGNITE-6842
> Upsource: https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
> PR: https://github.com/apache/ignite/pull/3542
>
> чт, 15 мар. 2018 г. в 13:31, Maxim Muzafarov <maxmu...@gmail.com>:
>
>> Dmtry,
>>
>> Can we proceed with this change?
>> I've done with fixing review comments and tests that you mentioned before.
>>
>> TC: https://ci.ignite.apache.org/viewLog.html?buildId=1138151
>> JIRA: https://issues.apache.org/jira/browse/IGNITE-6842
>> Upsource: https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>> PR: https://github.com/apache/ignite/pull/3542
>>
>>
>>
>> вт, 6 мар. 2018 г. в 20:42, Dmitry Pavlov <dpavlov....@gmail.com>:
>>
>>> Ok, thank you.
>>>
>>> Please let me know when we can proceed with review
>>> https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>>>
>>>
>>> вт, 6 мар. 2018 г. в 20:17, Maxim Muzafarov <maxmu...@gmail.com>:
>>>
>>> > Hello Dmitry,
>>> >
>>> > Yes, I've updated test classes as you metioned before.
>>> > Now i'm fixing review comments. Within next few days I'll prepare final
>>> > version of this PR.
>>> >
>>> > вт, 6 мар. 2018 г. в 20:12, Dmitry Pavlov <dpavlov....@gmail.com>:
>>> >
>>> > > Hi Maxim,
>>> > >
>>> > > are there any news on these test fails?
>>> > >
>>> > > Is issue ready for review?
>>> > >
>>> > > Sincerely,
>>> > > Dmitiry Pavlov
>>> > >
>>> > > вт, 27 февр. 2018 г. в 17:12, Dmitry Pavlov <dpavlov....@gmail.com>:
>>> > >
>>> > > > Hi, thank you!
>>> > > >
>>> > > > I've found several suspicious fails: such test fails have rate less
>>> > than
>>> > > > 1%, it is probably new failures.
>>> > > >
>>> > > > It would be great if we can fix it before merge. Could you address
>>> this
>>> > > > fails?
>>> > > >
>>> > > > Sincerely,
>>> > > > Dmitriy Pavlov
>>> > > >
>>> > > > IgniteCacheTestSuite5: IgniteCacheStoreCollectionTest.testStoreMap
>>> > (fail
>>> > > > rate 0,0%)
>>> > > > IgniteCacheTestSuite5:
>>> > > > CacheLateAffinityAssignmentTest.testDelayAssignmentClientJoin (fail
>>> > rate
>>> > > > 0,0%)
>>> > > > IgniteCacheWithIndexingTestSuite:
>>> > > > CacheRandomOperationsMultithreadedTest.testAtomicOffheapEviction
>>> (fail
>>> > > rate
>>> > > > 0,0%)
>>> > > > IgniteCacheWithIndexingTestSuite:
>>> > > >
>>> >
>>> CacheRandomOperationsMultithreadedTest.testAtomicOffheapEvictionIndexing
>>> > > > (fail rate 0,0%)
>>> > > > IgniteCacheWithIndexingTestSuite:
>>> > > > CacheRandomOperationsMultithreadedTest.testTxOffheapEviction (fail
>>> rate
>>> > > > 0,0%)
>>> > > > IgniteCacheWithIndexingTestSuite:
>>> > > >
>>> CacheRandomOperationsMultithreadedTest.testTxOffheapEvictionIndexing
>>> > > (fail
>>> > > > rate 0,0%)
>>> > > >
>>> > > > IgniteBinarySimpleNameMapperCacheFullApiTestSuite:
>>> > > >
>>> > >
>>> >
>>> GridCachePartitionedNearDisabledMultiNodeWithGroupFullApiSelfTest.testWithSkipStoreTx
>>> > > > (fail rate 0,0%)
>>> > > >
>>> > > > вт, 27 февр. 2018 г. в 17:04, Maxim Muzafarov <maxmu...@gmail.com
>>> >:
>>> > > >
>>> > > >> Yep, link may not be correct.
>>> > > >>
>>> > > >> Here is correct version:
>>> > > >> TC: *
>>> > > >>
>>> > >
>>> >
>>> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F3542%2Fhead
>>> > > >> <
>>> > > >>
>>> > >
>>> >
>>> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F3542%2Fhead
>>> > > >> >*
>>> > > >>
>>> > > >>
>>> > > >> вт, 27 февр. 2018 г. в 16:41, Dmitry Pavlov <
>>> dpavlov....@gmail.com>:
>>> > > >>
>>> > > >> > Hi Maxim,
>>> > > >> >
>>> > > >> > could you please provide link to TC run on your PR? It seems
>>> link
>>> > > >> provided
>>> > > >> > points to run of master. In changes field you may select
>>> > > pull/3542/head
>>> > > >> > before starting RunAll.
>>> > > >> >
>>> > > >> > Igniters,
>>> > > >> >
>>> > > >> > this change is related to our test framework, so change may
>>> affect
>>> > > your
>>> > > >> > tests. Please join to review
>>> > > >> > https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>>> > > >> >
>>> > > >> > Sincerely,
>>> > > >> > Dmitriy Pavlov
>>> > > >> >
>>> > > >> > вт, 27 февр. 2018 г. в 16:14, Maxim Muzafarov <
>>> maxmu...@gmail.com>:
>>> > > >> >
>>> > > >> > > Hi all,
>>> > > >> > >
>>> > > >> > > I think, I've done with this issue, what should we do next?
>>> > > >> > >
>>> > > >> > > PR: https://github.com/apache/ignite/pull/3542
>>> > > >> > > Upsource:
>>> > > https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>>> > > >> > > TC:
>>> > > >> > >
>>> > > >> > >
>>> > > >> >
>>> > > >>
>>> > >
>>> >
>>> https://ci.ignite.apache.org/viewModification.html?modId=723895&personal=false&buildTypeId=&tab=vcsModificationTests
>>> > > >> > > JIRA: https://issues.apache.org/jira/browse/IGNITE-6842
>>> > > >> > >
>>> > > >> > >
>>> > > >> > > чт, 22 февр. 2018 г. в 14:12, Dmitry Pavlov <
>>> > dpavlov....@gmail.com
>>> > > >:
>>> > > >> > >
>>> > > >> > > > Hi Maxim,
>>> > > >> > > >
>>> > > >> > > > Thank you.
>>> > > >> > > >
>>> > > >> > > > To my mind stopAllGrids call should be kept in
>>> afterTestsStop().
>>> > > If
>>> > > >> > > > developer forgot to call super(), he will see exception from
>>> > > >> > > > beforeTestsStart()
>>> > > >> > > > added by you.
>>> > > >> > > >
>>> > > >> > > > If you think patch is ready to be reviewed, please change
>>> JIRA
>>> > > >> status
>>> > > >> > to
>>> > > >> > > > "Patch Available".
>>> > > >> > > >
>>> > > >> > > > Optionally you can create Upsource review. It would be
>>> easier
>>> > for
>>> > > >> > > multiple
>>> > > >> > > > reviewers to handle this patch. This test framework is used
>>> by
>>> > all
>>> > > >> > > Igniters
>>> > > >> > > > so Upsource would be good option (
>>> > > >> https://reviews.ignite.apache.org/
>>> > > >> > ).
>>> > > >> > > >
>>> > > >> > > > Sincerely,
>>> > > >> > > > Dmitriy Pavlov
>>> > > >> > > >
>>> > > >> > > > чт, 22 февр. 2018 г. в 13:44, Maxim Muzafarov <
>>> > maxmu...@gmail.com
>>> > > >:
>>> > > >> > > >
>>> > > >> > > > > Hi all,
>>> > > >> > > > >
>>> > > >> > > > > I've made some changes based on our previous discusstions,
>>> > > please
>>> > > >> see
>>> > > >> > > PR
>>> > > >> > > > > [1]:
>>> > > >> > > > > 1) Remove duplicated code for stopGrid (by index and by
>>> name);
>>> > > >> > > > > 2) Add new method that thows exception if not all grids
>>> > haven't
>>> > > >> > stopped
>>> > > >> > > > > correctly;
>>> > > >> > > > > 3)  Change tests that have been affected by this changes;
>>> > > >> > > > >
>>> > > >> > > > > Also, I have some thoughts for clarification:
>>> > > >> > > > > 1) beforeTestsStart() - I expect here in subtests that
>>> grids
>>> > are
>>> > > >> not
>>> > > >> > > > > started yet. Am I right?
>>> > > >> > > > > 2) I think we should call stopAllGrids in tearDown
>>> method. So,
>>> > > if
>>> > > >> in
>>> > > >> > > some
>>> > > >> > > > > cases we'll override afterTestsStop in subclasses and
>>> forgot
>>> > to
>>> > > >> call
>>> > > >> > > > > super() it won't lead us to exception.
>>> > > >> > > > >
>>> > > >> > > > > [1] https://github.com/apache/ignite/pull/3542
>>> > > >> > > > > [2]
>>> > > >> https://ci.ignite.apache.org/viewModification.html?modId=717275
>>> > > >> > > > > [3] https://issues.apache.org/jira/browse/IGNITE-6842
>>> > > >> > > > >
>>> > > >> > > > >
>>> > > >> > > > > ср, 7 февр. 2018 г. в 18:28, Maxim Muzafarov <
>>> > > maxmu...@gmail.com
>>> > > >> >:
>>> > > >> > > > >
>>> > > >> > > > > > Thank you all,
>>> > > >> > > > > >
>>> > > >> > > > > > I'll add this comment's for JIRA ticket, if you don't
>>> mind.
>>> > > >> > > > > >
>>> > > >> > > > > > ср, 7 февр. 2018 г. в 15:16, Dmitry Pavlov <
>>> > > >> dpavlov....@gmail.com
>>> > > >> > >:
>>> > > >> > > > > >
>>> > > >> > > > > >> Yes, this solution allows to cover both cases:
>>> > > >> > > > > >> a) not stopped node from previous test and
>>> > > >> > > > > >> b) allows to remove useless code that stops Ignite
>>> nodes
>>> > from
>>> > > >> each
>>> > > >> > > > test.
>>> > > >> > > > > >>
>>> > > >> > > > > >> ср, 7 февр. 2018 г. в 15:13, Anton Vinogradov <
>>> > > >> > > > avinogra...@gridgain.com
>>> > > >> > > > > >:
>>> > > >> > > > > >>
>>> > > >> > > > > >> > Maxim,
>>> > > >> > > > > >> >
>>> > > >> > > > > >> > We discussed with Dima privately, and decided
>>> > > >> > > > > >> >
>>> > > >> > > > > >> > 1) We have to assert that there is no alive nodes at
>>> > > >> > > > > GridAbstractTest's
>>> > > >> > > > > >> > beforeTestsStarted
>>> > > >> > > > > >> > 2) We have to kill all alive nodes (without force) at
>>> > > >> > > > > GridAbstractTest's
>>> > > >> > > > > >> > afterTestsStopped
>>> > > >> > > > > >> > 3) In case of any exceptions at #2 we have to see
>>> test
>>> > > error
>>> > > >> > > > > >> > 4) We can get rid of all useless stopAllGrids at
>>> > > >> > > GridAbstractTest's
>>> > > >> > > > > >> > subclasses.
>>> > > >> > > > > >> >
>>> > > >> > > > > >> >
>>> > > >> > > > > >> >
>>> > > >> > > > > >> > On Wed, Feb 7, 2018 at 2:32 PM, Dmitry Pavlov <
>>> > > >> > > > dpavlov....@gmail.com>
>>> > > >> > > > > >> > wrote:
>>> > > >> > > > > >> >
>>> > > >> > > > > >> > > > Let's just add stopAllGrids(flase) to
>>> > GridAbstractTest
>>> > > 's
>>> > > >> > > > > >> > > afterTestsStopped
>>> > > >> > > > > >> > > method body.
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> > > Can't agree with it becase implicit silent
>>> shutdown of
>>> > > >> nodes
>>> > > >> > > from
>>> > > >> > > > > test
>>> > > >> > > > > >> > > framework may cause a lot of bugs introduced to
>>> Ignite.
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> > > I think stop+test fail should occur.
>>> > > >> > > > > >> > > In that case author of incorrect test or not
>>> consistent
>>> > > >> Ignite
>>> > > >> > > > > >> shutdown
>>> > > >> > > > > >> > > will see problem.
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> > > 'Fail fast' if always better than hidding real
>>> problem
>>> > > >> under
>>> > > >> > > > > automatic
>>> > > >> > > > > >> > > framework feature.
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> > > ср, 7 февр. 2018 г. в 14:05, Anton Vinogradov <
>>> > > >> > > > > >> avinogra...@gridgain.com
>>> > > >> > > > > >> > >:
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> > > > Hi all,
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > > I've made some research about this problem and
>>> i
>>> > > think
>>> > > >> > that
>>> > > >> > > in
>>> > > >> > > > > >> > general
>>> > > >> > > > > >> > > we
>>> > > >> > > > > >> > > > > should move stopAllGrids method in
>>> GridAbstractTest
>>> > > >> class
>>> > > >> > to
>>> > > >> > > > > >> > > > > afterTestsStopped method with some changes. Am
>>> I
>>> > > right?
>>> > > >> > > > > >> > > > Let's just add stopAllGrids(flase) to
>>> > GridAbstractTest
>>> > > 's
>>> > > >> > > > > >> > > > afterTestsStopped method
>>> > > >> > > > > >> > > > body.
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > > I have a question about stopAllGrids(boolean
>>> > cancel)
>>> > > >> this
>>> > > >> > > > > "cancel"
>>> > > >> > > > > >> > > > That's just a flag means "do not wait for any
>>> > > operations
>>> > > >> > > finish"
>>> > > >> > > > > >> > > > See no reason to set it true in that case.
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > > If you delegate closing to afterTestsStopped
>>> this
>>> > > will
>>> > > >> > > affect
>>> > > >> > > > > only
>>> > > >> > > > > >> > > > > last test (method).
>>> > > >> > > > > >> > > > The idea is to stop all nodes at last test's
>>> method
>>> > > >> finish.
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > >  Nodes that survive between tests can affect
>>> > > successive
>>> > > >> > > > > >> > > > tests.
>>> > > >> > > > > >> > > > Thats ok. we have a lot tests where nodes
>>> reusable
>>> > > >> between
>>> > > >> > > > test's
>>> > > >> > > > > >> > > methods.
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > On Wed, Feb 7, 2018 at 1:20 PM, Dmitry Pavlov <
>>> > > >> > > > > >> dpavlov....@gmail.com>
>>> > > >> > > > > >> > > > wrote:
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > > > > Hi Igniters,
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > IMO nodes that survive between tests is not
>>> general
>>> > > >> > practice
>>> > > >> > > > and
>>> > > >> > > > > >> I'm
>>> > > >> > > > > >> > > not
>>> > > >> > > > > >> > > > > sure is a best practice. I suggest to mark such
>>> > tests
>>> > > >> with
>>> > > >> > > > some
>>> > > >> > > > > >> > method
>>> > > >> > > > > >> > > > > overriden from AbstractTest.
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > About cancel flag, please note
>>> stopAllGrids(boolean
>>> > > >> > cancel)
>>> > > >> > > > > >> > > cancel=false
>>> > > >> > > > > >> > > > > allows to wait of checkpoint ends in case
>>> > persistence
>>> > > >> > > enabled.
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > I still suggest to avoid stopping any nodes by
>>> > test,
>>> > > >> but
>>> > > >> > > > > validate
>>> > > >> > > > > >> not
>>> > > >> > > > > >> > > > > stopped node exist and fail test instead of
>>> siltent
>>> > > >> > implicit
>>> > > >> > > > > >> actions.
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > Sincerely,
>>> > > >> > > > > >> > > > > Dmitriy Pavlov
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > ср, 7 февр. 2018 г. в 13:04, Andrey Kuznetsov <
>>> > > >> > > > > stku...@gmail.com
>>> > > >> > > > > >> >:
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > > > > Hi Maxim,
>>> > > >> > > > > >> > > > > >
>>> > > >> > > > > >> > > > > > Regarding your first question, the use of
>>> > > >> > > afterTestsStopped
>>> > > >> > > > is
>>> > > >> > > > > >> not
>>> > > >> > > > > >> > > > enough
>>> > > >> > > > > >> > > > > > to stop all nodes, since each individual test
>>> > > >> (method)
>>> > > >> > can
>>> > > >> > > > > start
>>> > > >> > > > > >> > > custom
>>> > > >> > > > > >> > > > > set
>>> > > >> > > > > >> > > > > > of notes during its operation, and this very
>>> test
>>> > > >> should
>>> > > >> > > > stop
>>> > > >> > > > > >> all
>>> > > >> > > > > >> > > those
>>> > > >> > > > > >> > > > > > nodes. If you delegate closing to
>>> > afterTestsStopped
>>> > > >> this
>>> > > >> > > > will
>>> > > >> > > > > >> > affect
>>> > > >> > > > > >> > > > only
>>> > > >> > > > > >> > > > > > last test (method). Nodes that survive
>>> between
>>> > > tests
>>> > > >> can
>>> > > >> > > > > affect
>>> > > >> > > > > >> > > > > successive
>>> > > >> > > > > >> > > > > > tests.
>>> > > >> > > > > >> > > > > >
>>> > > >> > > > > >> > > > > > 2018-02-07 1:10 GMT+03:00 Maxim Muzafarov <
>>> > > >> > > > maxmu...@gmail.com
>>> > > >> > > > > >:
>>> > > >> > > > > >> > > > > >
>>> > > >> > > > > >> > > > > > > Hello,
>>> > > >> > > > > >> > > > > > >
>>> > > >> > > > > >> > > > > > > I've made some research about this problem
>>> and
>>> > i
>>> > > >> think
>>> > > >> > > > that
>>> > > >> > > > > in
>>> > > >> > > > > >> > > > general
>>> > > >> > > > > >> > > > > we
>>> > > >> > > > > >> > > > > > > should move stopAllGrids method in
>>> > > GridAbstractTest
>>> > > >> > > class
>>> > > >> > > > to
>>> > > >> > > > > >> > > > > > > afterTestsStopped method with some
>>> changes. Am
>>> > I
>>> > > >> > right?
>>> > > >> > > > > >> > > > > > >
>>> > > >> > > > > >> > > > > > > Also, I have a question about
>>> > > stopAllGrids(boolean
>>> > > >> > > cancel)
>>> > > >> > > > > >> this
>>> > > >> > > > > >> > > > > "cancel"
>>> > > >> > > > > >> > > > > > > argument. Why in some cases we should
>>> interrupt
>>> > > >> > > ComputeJob
>>> > > >> > > > > >> and in
>>> > > >> > > > > >> > > > some
>>> > > >> > > > > >> > > > > > > cases shouldn't? For example here:
>>> > > >> > > > > >> > > > > > >
>>> > > >> IgniteBaselineAffinityTopologyActivationTest#afterTest
>>> > > >> > > > > >> > > > > > > we call method stopAllGrids(false) this
>>> way.
>>> > Why
>>> > > >> not
>>> > > >> > > > "true"
>>> > > >> > > > > >> > > argument
>>> > > >> > > > > >> > > > > > > instead?
>>> > > >> > > > > >> > > > > > >
>>> > > >> > > > > >> > > > > > >
>>> > > >> > > > > >> > > > > > > --
>>> > > >> > > > > >> > > > > > Best regards,
>>> > > >> > > > > >> > > > > >   Andrey Kuznetsov.
>>> > > >> > > > > >> > > > > >
>>> > > >> > > > > >> > > > >
>>> > > >> > > > > >> > > >
>>> > > >> > > > > >> > >
>>> > > >> > > > > >> >
>>> > > >> > > > > >>
>>> > > >> > > > > >
>>> > > >> > > > >
>>> > > >> > > >
>>> > > >> > >
>>> > > >> >
>>> > > >>
>>> > > >
>>> > >
>>> >
>>>
>>

Re: Stop nodes after test by default - IGNITE-6842

Reply via email to