I'm not totally sure yet here. Right now there is a generic category mechanism, but categories can only be set via a dot-file in a test directory, or overridden via the test case class method getCategories().
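For anyone who hasn't poked at those hooks, here is roughly what the two existing mechanisms look like. I'm writing this from memory, so treat the exact file name, import spelling, and return type as approximate rather than gospel:

    # Mechanism (1): a dot-file in a test directory (".categories", if I
    # remember the name right) tags everything under that directory with the
    # categories listed in it, e.g. a file whose contents are just:
    #
    #   flakey
    #
    # Mechanism (2): a test case class overrides getCategories() and returns
    # its own category list.

    import lldbtest  # assumed import; adjust to however your test pulls in lldbtest

    class SometimesFailingTestCase(lldbtest.TestBase):

        def getCategories(self):
            # Every test method in this class is treated as belonging to
            # these categories.
            return ["flakey"]

        def test_sometimes_fails(self):
            pass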
I think I'd want a more general decorator that allows you to tag a method itself with categories. So, something like:

    class TestRaise:
        @lldbtest.category("flakey")
        # or maybe better, since it uses a constant:
        # @lldbtest.category(lldbtest.CATEGORY_FLAKEY)
        def test_something_that_does_not_always_pass(self):
            pass

Then a method has a collection of categories, and checking membership can look at the class instance (to hook into the existing mechanism). A rough sketch of how that decorator could be wired up appears after the quoted thread below.

-Todd

On Mon, Oct 19, 2015 at 4:40 PM, Zachary Turner <ztur...@google.com> wrote:

> Yea, I definitely agree with you there.
>
> Is this going to end up with an @expectedFlakeyWindows, @expectedFlakeyLinux, @expectedFlakeyDarwin, @expectedFlakeyAndroid, @expectedFlakeyFreeBSD?
>
> It's starting to get a little crazy. At some point I think we just need something that we can use like this:
>
>     @test_status(status=flaky, host=[win, linux, android, darwin, bsd], target=[win, linux, android, darwin, bsd], compiler=[gcc, clang], debug_info=[dsym, dwarf, dwo])
>
> On Mon, Oct 19, 2015 at 4:35 PM Todd Fiala <todd.fi...@gmail.com> wrote:
>
>> My initial proposal was an attempt to not entirely skip running them on our end and still get them to generate actionable signals without conflating them with unexpected successes (which they absolutely are not in a semantic sense).
>>
>> On Mon, Oct 19, 2015 at 4:33 PM, Todd Fiala <todd.fi...@gmail.com> wrote:
>>
>>> Nope, I have no issue with what you said. We don't want to run them over here at all, because we don't see enough useful info come out of them. You need time series data for that to be somewhat useful, and even then it is only useful if you see a sharp change in it after a specific change.
>>>
>>> So I really don't want to be running flaky tests at all, as their signals are not useful on a per-run basis.
>>>
>>> On Mon, Oct 19, 2015 at 4:16 PM, Zachary Turner <ztur...@google.com> wrote:
>>>
>>>> Don't get me wrong, I like the idea of running flakey tests a couple of times and seeing if one passes (Chromium does this as well, so it's not without precedent). If I sounded harsh, it's because I *want* to be harsh on flaky tests. Flaky tests indicate literally the *worst* kind of bugs, because you don't even know what kind of problems they're causing in the wild, so by increasing the amount of pain they cause people (test suite running longer, etc.) the hope is that it will motivate someone to fix them.
>>>>
>>>> On Mon, Oct 19, 2015 at 4:04 PM Todd Fiala <todd.fi...@gmail.com> wrote:
>>>>
>>>>> Okay, so I'm not a fan of the flaky tests myself, nor of test suites taking longer to run than needed.
>>>>>
>>>>> Enrico is going to add a new 'flakey' category to the test categorization.
>>>>>
>>>>> Scratch all the other complexity I offered up. What we're going to ask is: if a test is flakey, please add it to the 'flakey' category. We won't do anything different with the category by default, so everyone will still get flakey tests running the same way they do now. However, on our test runners, we will be disabling the category entirely using the skipCategories mechanism, since those tests are generating too much noise.
>>>>>
>>>>> We may need to add a per-test-method category mechanism, since right now our only mechanisms to add categories are to (1) specify a dot-file in the directory to have everything in it get tagged with a category, or (2) override the categorization via the TestCase getCategories() mechanism.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Mon, Oct 19, 2015 at 1:03 PM, Zachary Turner <ztur...@google.com> wrote:
>>>>>
>>>>>> On Mon, Oct 19, 2015 at 12:50 PM Todd Fiala via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'd like unexpected successes (i.e. tests marked as expected failure that in fact pass) to retain the actionable meaning that something is wrong. The wrong part is that either (1) the test now passes consistently and the author of the fix just missed updating the test definition (or perhaps was unaware of the test), or (2) the test is not covering the condition it is testing completely, and some change to the code just happened to make the test pass (due to the test not being comprehensive enough). Either of those requires some sort of adjustment by the developers.
>>>>>>
>>>>>> I'd add #3: the test is actually flaky but is tagged incorrectly.
>>>>>>
>>>>>>> We have a category of test known as "flaky" or "flakey" (both are valid spellings, for those who care: http://www.merriam-webster.com/dictionary/flaky, although flaky is considered the primary). Flaky tests are tests that we can't get to pass 100% of the time. This might be because it is extremely difficult to write the test as such and deemed not worth the effort, or it is a condition that is just not going to present itself successfully 100% of the time.
>>>>>>
>>>>>> IMO if it's not worth the effort to write the test correctly, we should delete the test. Flaky is useful as a temporary status, but if nobody ends up fixing the flakiness, I think the test should be deleted (more reasons follow).
>>>>>>
>>>>>>> These are tests we still want to exercise, but we don't want to have them start generating test failures if they don't pass 100% of the time. Currently the flaky test mechanism requires a test to pass one in two times. That is okay for a test that exhibits a slim degree of flakiness. For others, that is not a large enough sample of runs to elicit a successful result. Those tests get marked as XFAIL, and generate a non-actionable "unexpected success" result when they do happen to pass.
>>>>>>>
>>>>>>> GOAL
>>>>>>>
>>>>>>> * Enhance the expectedFlakey* test decorators. Allow specification of the number of times a flaky test should be run to be expected to pass at least once. Call that MAX_RUNS.
>>>>>>
>>>>>> I think it's worth considering whether it's a good idea to include the date at which they were declared flakey. After a certain amount of time has passed, if it's still flakey it can be relegated to a hard failure. I don't think flakey should be a permanent state.
>>>>>>
>>>>>>> * When running a flaky test, run it up to MAX_RUNS times. The first time it passes, mark it as a successful test completion. The test event system will be given the number of times it was run before passing.
>>>>>>> Whether we consume this info or not is TBD (and falls into the purview of the test results formatter).
>>>>>>>
>>>>>>> * If the test does not pass within MAX_RUNS times, mark it as a flaky fail. For purposes of the standard output, this can look like FAIL: (flaky) or something similar, so fail scanners still see it. (Note it's highly likely I'll do the normal output counts with the TestResults formatter-based output at the same time, so we get accurate test method counts and the like.)
>>>>>>
>>>>>> The concern I have here (and the reason I would like to delete flakey tests if the flakiness isn't removed after a certain amount of time) is that some of our tests are slow. Repeating them many times is going to have an impact on how long the test suite takes to run. It has already tripled over the past 3 weeks, and I think we need to be careful to keep out things that have the potential to lead to significant slowness of the test suite runner.
>>>>>
>>>>> --
>>>>> -Todd
>>>
>>> --
>>> -Todd
>>
>> --
>> -Todd
>
--
-Todd
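P.S. As mentioned near the top, here is a rough sketch of how the per-method category decorator could be wired up. None of this exists yet; category(), CATEGORY_FLAKEY, and the merging helper below are just one possible shape, not actual lldbtest API:

    CATEGORY_FLAKEY = "flakey"

    def category(*cats):
        """Tag the decorated test method with one or more category names."""
        def decorator(func):
            existing = getattr(func, "categories", [])
            func.categories = existing + list(cats)
            return func
        return decorator

    # The runner (or TestBase itself) could then merge per-method tags with
    # the class/directory-level categories, roughly like this:
    def categories_for(test_instance, method_name):
        method = getattr(test_instance, method_name)
        # Fall back to an empty list if the class doesn't define getCategories().
        get_class_cats = getattr(test_instance, "getCategories", None)
        class_cats = get_class_cats() if get_class_cats else []
        method_cats = getattr(method, "categories", [])
        return sorted(set(class_cats) | set(method_cats))

With something along those lines, the existing category filtering (including the skipCategories mechanism mentioned above) wouldn't need to change; it would just see a larger per-test category set.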
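P.P.S. For completeness, the MAX_RUNS retry behavior from my original proposal (which I've since scratched, per the quoted thread) would look roughly like this. The decorator name and the reporting hook are purely illustrative, not the real expectedFlakey implementation:

    import functools

    MAX_RUNS_DEFAULT = 3  # arbitrary illustrative value

    def expectedFlakey(max_runs=MAX_RUNS_DEFAULT):
        """Re-run the decorated test up to max_runs times; the first pass wins."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(self, *args, **kwargs):
                for attempt in range(1, max_runs + 1):
                    try:
                        func(self, *args, **kwargs)
                    except Exception:
                        if attempt == max_runs:
                            # Out of attempts: let the failure propagate so it
                            # shows up as a flaky FAIL in the results.
                            raise
                    else:
                        # Passed on this attempt; a results formatter could be
                        # told how many runs it took (attempt).
                        return
            return wrapper
        return decorator

Usage would just be @expectedFlakey(max_runs=5) above a test method.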