Hi,

Thanks for bringing this up, Michael.
I want to elaborate on the impetus for this (or at least my take on it).

When 8099 merged, the one thing that must never happen for our process to work happened: we introduced a large enough number of test failures that it was difficult to tell whether you had introduced a regression. At the time we thought we could exclude the test failures that predated 8099 and that the failures introduced by 8099 would get addressed promptly.

What has happened instead is that the number of failures has snowballed to the point that you can hardly tell if you broke anything, even if you compare test by test with trunk. You have to go into the history on trunk for each test and go back several pages to really be sure.

If you don’t have consistently passing CI, you can’t keep ongoing work from adding new test failures that slip in masked by known failures.

The artery is severed, we’re bleeding out, and we’re going to have to lose the leg. I’m sure the prosthetic, when it comes, will be just as good, but the rehab is going to suck. There, that’s my analogy.

I think the utests are in pretty good shape, but the pig tests are a problem. They extend the job time a lot, cause aborts, and fail randomly.

Ariel

> On Aug 14, 2015, at 3:16 PM, Michael Shuler <mich...@pbandjelly.org> wrote:
>
> This is a prompt for Cassandra developers to discuss the alternatives and let
> Test Engineering know what you desire.
>
> As discussed a few times in person, on irc, etc., there are a couple
> different ways we can run tests in Jenkins, particularly cassandra-dtest. The
> Cassandra developers are the committers to unit tests, so Test Engineering
> runs whatever is in the branch. If you'd like to make changes to unit tests
> to make things blue, just commit those!
>
> Currently, we run dtests as 1), but we could do 2):
>
> 1) Run all dtests that don't catastrophically hang a server, pass or fail,
> and report the results.
> 2) Run only known passing dtests, skipping anything that fails - make it all
> blue on the main branch builds.
>
> The biggest benefit is that dev branch builds should be easily recognizable
> as able to merge, if the dtest run is passing and blue. There is no
> comparison with the main branch build needing interpretation.
>
> Test Eng has recently added the ability to run *only* the skipped tests and has
> a prototype job, trunk_dtest-skipped-with-require, to dig through. This
> could be set up for all main branch builds, moving anything that doesn't pass
> 100% to the -skipped job. This is perhaps the drawback with 2) above: we're
> simply not going to run all the dtests on your dev branch. I don't think it
> makes sense to set up a -skipped dtest job on your dev branches. In addition,
> there's another job result set to go look at to properly evaluate the true
> state of a Cassandra branch or release. There may be other side effects -
> feel free to chime in.
>
> I'm on a "disconnected" holiday until Monday Aug 24, so I won't have a chance
> to check in until then - the Test Eng team can field questions or
> clarifications, if needed.
>
> --
> Warm regards,
> Michael