Thanks for the feedback and the suggestions. As Stephan said, the "we have to fix it asap" approach usually does not work well. I think blocking master is not an option, for exactly the reasons that Fabian and Till outlined.
From the comments so far, I don't get the impression that we are eager to adopt a disable policy. I still think it is the better policy. I think we actually don't decrease test coverage by disabling a flaky test, but increase it. For example, the KafkaITCase is in one of the modules that is tested in the middle of a build. If it fails (as it does sometimes), a lot of later tests don't run. I'm not sure if we have the time (or discipline) to trigger a 1hr build again when a known-to-fail test is failing and 4 of the other builds are succeeding.

– Ufuk

On 04 Jun 2015, at 09:25, Till Rohrmann <till.rohrm...@gmail.com> wrote:

> I'm also in favour of quickly fixing the failing test cases but I think
> that blocking the master is a kind of drastic measure. IMO this creates a
> culture of blaming someone whereas I would prefer a more proactive
> approach. When you see a failing test case and know that someone recently
> worked on it, then ping him because maybe he can quickly fix it or knows
> about it. If he's not available, e.g. holidays, busy with other stuff,
> etc., then maybe one can investigate the problem oneself and fix it.
>
> But this is basically our current approach and I don't know how to enforce
> this policy by some means. Maybe it's making people more aware of it and
> motivating people to have a stable master.
>
> Cheers,
> Till
>
> On Thu, Jun 4, 2015 at 9:06 AM, Matthias J. Sax <
> mj...@informatik.hu-berlin.de> wrote:
>
>> I think, people should be forced to fix failing tests asap. One way to
>> go, could be to lock the master branch until the test is fixed. If
>> nobody can push to the master, pressure is very high for the responsible
>> developer to get it done asap. Not sure if this is Apache compatible.
>>
>> Just a thought (from industry experience).
>>
>>
>> On 06/04/2015 08:10 AM, Aljoscha Krettek wrote:
>>> I tend to agree with Ufuk, although it would be nice to fix them very
>>> quickly.
>>>
>>> On Thu, Jun 4, 2015 at 1:26 AM, Stephan Ewen <se...@apache.org> wrote:
>>>> @matthias: That is the implicit policy right now. Seems not to work...
>>>>
>>>> On Thu, Jun 4, 2015 at 12:40 AM, Matthias J. Sax <
>>>> mj...@informatik.hu-berlin.de> wrote:
>>>>
>>>>> I basically agree that the current policy is not optimal. However, I
>>>>> would rather give failing tests "top priority" to get fixed (if possible
>>>>> within one/a-few days) and not disable them.
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 06/04/2015 12:32 AM, Ufuk Celebi wrote:
>>>>>> Hey all,
>>>>>>
>>>>>> we have certain test cases, which are failing regularly on Travis. In all
>>>>>> cases I can think of we just keep the test activated.
>>>>>>
>>>>>> I think this makes it very hard for regular contributors to take these
>>>>>> failures seriously. I think the following situation is not unrealistic with
>>>>>> the current policy: I know that test X is failing. I don't know that person
>>>>>> Y fixed this test. I see test X failing (again for a different reason) and
>>>>>> think that it is a "known issue".
>>>>>>
>>>>>> I think a better policy is to just disable the test, assign someone to fix
>>>>>> it, and then only enable it again after someone has fixed it.
>>>>>>
>>>>>> Is this reasonable? Or do we have good reasons to keep such tests (there
>>>>>> are currently one or two) activated?
>>>>>>
>>>>>> – Ufuk
>>>>>>
>>>>>
>>>>>
>>>
>>
>>