> On Feb 27, 2017, at 3:33 PM, Jean-Paul Calderone <exar...@twistedmatrix.com> 
> wrote:
> 
> On Mon, Feb 27, 2017 at 6:00 PM, Tristan Seligmann <mithra...@mithrandi.net 
> <mailto:mithra...@mithrandi.net>> wrote:
> On Mon, 27 Feb 2017 at 21:54 Glyph Lefkowitz <gl...@twistedmatrix.com 
> <mailto:gl...@twistedmatrix.com>> wrote:
> That said, it has been improving and if it keeps improving at the rate it has 
> been, I expect that we'd be able to put that coverage blocker back in in 
> another 3-4 months.  Perhaps something to talk about at PyCon.
> 
> I think at least one problem that we're suffering from here is our fault, 
> rather than Codecov's: the coverage of the test suite is not stable due to 
> non-determinism in the test suite. That is, the lines executed during a test 
> run are not the same every time due to things like ordering / timing races / 
> etc. This means that "changes" to coverage may show up for a particular 
> PR even though nothing in that PR is actually responsible.
> 
> 
> Changes to Twisted code which are only sometimes covered by the test suite 
> sound like they would violate a 100% coverage rule.  But I guess the 
> experience of looking at a codecov report is so bad/confusing that it's not 
> surprising authors/reviewers might fail to see what's going on and fix the 
> non-determinism.
> 
> Particularly for code that requires coverage measurements on multiple 
> platforms (ie, you basically can't do it locally), it seems like it would be 
> easier (though, to be clear, bad) to just forget about it and hope everything 
> is covered...
> 
> A tool that pointed out coverage differences between multiple runs of the 
> same version of the code would be a useful thing to start pointing out where 
> these flaws in the Twisted test suite lie, right?  And then each area could 
> be given deterministic test coverage instead...
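
A minimal sketch of such a tool, assuming each run's coverage has already been
reduced to a mapping from filename to executed line numbers. The function name
and the data shape are hypothetical, not part of any existing coverage tool:

```python
def flapping_lines(runs):
    """Find lines executed in some runs of the same code, but not in all.

    `runs` is a list of dicts mapping filename -> iterable of executed
    line numbers (an assumed format; how it is extracted from the
    coverage tool is left out of this sketch).
    Returns a dict mapping filename -> sorted list of unstable lines.
    """
    # Every file seen in at least one run.
    files = set().union(*(run.keys() for run in runs))
    unstable = {}
    for name in sorted(files):
        # A file missing from a run counts as "no lines executed".
        per_run = [set(run.get(name, ())) for run in runs]
        always = set.intersection(*per_run)  # executed in every run
        ever = set.union(*per_run)           # executed in at least one run
        if ever - always:
            unstable[name] = sorted(ever - always)
    return unstable
```

Any line reported here was covered by at least one run and missed by at least
one other, so it points at a test whose coverage depends on ordering or timing.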

While this is certainly an issue, I don't think it's the issue we're discussing 
here.  Unreliable coverage is largely mitigated by the fact that the main thing 
we pay attention to is "patch coverage": if the new tests in a PR are 
non-deterministic, it visibly fluctuates from commit to commit on the branch 
(and a PR is rarely a single commit).  This is in contrast to "coverage delta", 
which only compares overall coverage before and after, and is indeed somewhat 
unpredictable due to old / bad tests.
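
To make the distinction concrete, here is a simplified model of the two
numbers. It is a sketch under stated assumptions, not Codecov's actual
computation; the function names and inputs are hypothetical:

```python
def patch_coverage(changed_lines, covered_lines):
    """Fraction of the lines touched by a patch that the tests executed.

    Both arguments are iterables of line numbers (assumed inputs).
    Returns None when the patch touches no measurable lines.
    """
    changed = set(changed_lines)
    if not changed:
        return None
    return len(changed & set(covered_lines)) / len(changed)


def coverage_delta(before, after):
    """Change in overall coverage; each argument is (covered, total)."""
    return after[0] / after[1] - before[0] / before[1]


# A patch touching lines 10-13, of which the tests execute 10-12.
ratio = patch_coverage({10, 11, 12, 13}, {1, 2, 10, 11, 12})

# Identical code before and after, but one old test flapped:
# the delta is non-zero even though the patch is not responsible.
drift = coverage_delta((900, 1000), (899, 1000))
```

In this model a flapping line outside the patch moves `coverage_delta` but
leaves `patch_coverage` untouched, since only changed lines are counted.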

So I can say that when I've had to overrule codecov, it's almost never been 
because of flapping coverage lines outside of the patch under consideration 
(and the patches under consideration either have deterministic tests, or I ask 
the author to add them).

General improvements to build reliability often reduce coverage unreliability 
as well.  As we've been using GitHub more, which makes build status and 
mergeability more visible to reviewers, we've been fixing lots of little 
build-reliability issues, and this problem continues to get smaller.

-glyph

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
