> On Feb 27, 2017, at 3:33 PM, Jean-Paul Calderone
> <exar...@twistedmatrix.com> wrote:
>
> On Mon, Feb 27, 2017 at 6:00 PM, Tristan Seligmann
> <mithra...@mithrandi.net> wrote:
> On Mon, 27 Feb 2017 at 21:54 Glyph Lefkowitz <gl...@twistedmatrix.com>
> wrote:
> That said, it has been improving and if it keeps improving at the rate
> it has been, I expect that we'd be able to put that coverage blocker
> back in in another 3-4 months. Perhaps something to talk about at PyCon.
>
> I think at least one problem that we're suffering from here is our
> fault, rather than Codecov's: the coverage of the test suite is not
> stable due to non-determinism in the test suite. That is, the lines
> executed during a test run are not the same every time due to things
> like ordering / timing races / etc. This means that "changes" to
> coverage may show up for a particular PR even though nothing in that PR
> is actually responsible.
>
>
> Changes to Twisted code which are only sometimes covered by the test
> suite sound like they would violate a 100% coverage rule. But I guess
> the experience of looking at a codecov report is so bad/confusing that
> it's not surprising authors/reviewers might fail to see what's going on
> and fix the non-determinism.
>
> Particularly for code that requires coverage measurements on multiple
> platforms (ie, you basically can't do it locally), it seems like it
> would be easier (though, to be clear, bad) to just forget about it and
> hope everything is covered...
>
> A tool that pointed out coverage differences between multiple runs of
> the same version of the code would be a useful thing to start pointing
> out where these flaws in the Twisted test suite lie, right? And then
> each area could be given deterministic test coverage instead...

While this is certainly an issue, I don't think it's the issue we're
discussing here.

Unreliability of coverage is largely mitigated by the fact that the main
thing we pay attention to is "patch coverage", which can be seen to
fluctuate from commit to commit on a branch if the new test coverage is
non-deterministic (and a PR is rarely a single commit). This is opposed
to "coverage delta", which only compares coverage before and coverage
after, and is indeed somewhat unpredictable due to old / bad tests.

So I can say that when I've had to overrule Codecov, it's almost never
been because of flapping coverage lines outside of the patch under
consideration (and the patches under consideration either have
deterministic tests, or I ask the author to add them).

General improvements to build reliability often reduce coverage
unreliability as well, so as we've been using GitHub more, which makes
build status and mergeability more visible to reviewers, we've been
fixing lots of little build-reliability issues and this problem
continues to get smaller.

-glyph
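
To make the distinction between the two metrics concrete, here is a
minimal sketch in Python. It is a toy model, not Codecov's actual
computation: the function names (ratio, patch_coverage, coverage_delta,
flapping_lines), the file names and the line numbers are all made up for
illustration. Lines are modelled as (filename, line number) pairs. The
last function is a rough take on the "differences between multiple runs
of the same version of the code" idea from the quoted message.

```python
"""Toy model of "patch coverage" versus "coverage delta" (illustrative only)."""


def ratio(covered, executable):
    """Fraction of the executable lines that were covered (1.0 if there are none)."""
    if not executable:
        return 1.0
    return len(covered & executable) / len(executable)


def patch_coverage(head_covered, patch_lines):
    """Coverage restricted to the lines added or changed by the patch."""
    return ratio(head_covered, patch_lines)


def coverage_delta(base_covered, base_executable, head_covered, head_executable):
    """Whole-project coverage after the patch minus whole-project coverage before."""
    return ratio(head_covered, head_executable) - ratio(base_covered, base_executable)


def flapping_lines(run_one, run_two):
    """Lines covered in exactly one of two runs of the same revision."""
    return run_one ^ run_two


# Hypothetical data: ten old lines, all covered on the base revision.
base_executable = {("old.py", n) for n in range(1, 11)}
base_covered = set(base_executable)

# The patch adds four new lines with deterministic tests.
patch_lines = {("new.py", n) for n in range(1, 5)}
head_executable = base_executable | patch_lines

# Two runs of the same head revision; an old, racy test misses old.py:7 in run B.
run_a = set(head_executable)
run_b = head_executable - {("old.py", 7)}

for name, run in (("run A", run_a), ("run B", run_b)):
    delta = coverage_delta(base_covered, base_executable, run, head_executable)
    print(name, "patch coverage:", patch_coverage(run, patch_lines),
          "coverage delta:", round(delta, 4))

print("flapping lines:", sorted(flapping_lines(run_a, run_b)))
```

In this model the flapping line outside the patch moves the coverage
delta between the two runs but leaves patch coverage untouched, which is
the property described above. Patch coverage would only fluctuate if the
patch's own tests were non-deterministic.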

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python