Very sorry for the slow response, been EBUSY with real-life.

On Sun, May 22, 2011 at 11:42 PM, Stefano Lattarini
<stefano.lattar...@gmail.com> wrote:
> On Sunday 22 May 2011, Ralf Wildenhues wrote:
>> Hi Stefano, and sorry for the long delay,
>>
> No problem, you had warned me in due time about such possible delays this
> month; so there's really no need to apologize.
>
>> * Stefano Lattarini wrote on Fri, Apr 29, 2011 at 11:21:06AM CEST:
>> > Now that my GSoC application "automake - Interfacing with a test protocol
>> > like TAP or subunit" has been officially accepted, I'd like to start
>> > discussing with the community some early, high-level design and interface
>> > decisions.
>>
>>
>> > 1. Reuse parallel-tests "framework"
>> > -----------------------------------
>> >
>> > The new TAP/SubUnit support should reuse as much of the current
>> > parallel-tests implementation and semantics as possible. In particular,
>> > it should be able to run different test scripts in parallel, generate a
>> > `.log' file for each test script and a "summarizing" `test-suite.log'
>> > file, honour the make variables AM_TESTS_ENVIRONMENT, TESTS_ENVIRONMENT
>> > and AM_COLOR_TESTS and the environment variable VERBOSE, and support
>> > different extensions for the test scripts, with extension-specific "log
>> > compilers" and flags (the stuff enabled by TESTS_EXTENSIONS,
>> > <ext>_LOG_COMPILER, etc.).
>>
>> Sounds all sane.
>>
>> > The XFAIL_TESTS variable might be still supported for the sake of
>> > backward-compatibility (see below for the details), but it should be
>> > deprecated, since TAP and SubUnit offer better and more granular ways
>> > to express expected failures.
>>
>> OK.
>>
>> In another mail:
>> > Thinking again about this, it might be worth trying to be even more
>> > consistent with the existing parallel-tests functionality, and use an
>> > `ext_TEST_PROTOCOL' variable (or similar) instead of a global
>> > `tests-protocol' option.
>> > With some tweaking to the post-processing of `.log' files done in
>> > `lib/am/check.am' (to generate `$(TEST_SUITE_LOG)'), this might allow
>> > greater code reuse and a more consistent API.
>> >
>> > I've started experimenting with this idea, and I'm not seeing any obvious
>> > shortcoming right now. I'm hoping I'll be able to post some experimental
>> > patches soon enough.
>>
>> Allowing to specify that per-test is a good idea for transitioning test
>> suites.
>>
> About this, in my first two "tentative" patches:
>  <http://lists.gnu.org/archive/html/automake-patches/2011-05/msg00093.html>
> I've taken an even more general approach, allowing the developer to define
> and use his own program(s) to:
>  1. launch his test scripts,
>  2. interpret their results,
>  3. display these results on screen, and
>  4. format and generate the log files.
> All of this is attainable simply by assigning a variable `LOG_WRAPPER'
> (and extension-specific counterparts of it), and, well, obviously
> providing a real "driver" script that obeys a minimal command-line
> interface (so that it can grasp the options the Automake-generated
> Makefiles pass to it). Then we will hopefully be able to implement
> our TAP/SubUnit parsers on top of this feature (thus making it
> indirectly more tested, which is always good for a new feature).

If the subunit parser just gets all the output from the test script,
you might want to use the subunit parser itself: folk in the
bootstrap-set couldn't (obviously), but most other environments can
use high-level languages freely. (I don't object to more parser
implementations existing; I'm just thinking about reuse where
possible.)

>> I hope to look into the posted patches later today.
>>
> About this, please note that I might be AFK until this evening. So have
> no haste.
>
>> > 2. New automake option `tests-protocol'
>> > ---------------------------------------
>> >
>> > The Tap/SubUnit support in the Automake-generated testsuite drivers
>> > should be enabled by a new (argument-requiring) option `tests-protocol',
>> > that will be used to specify the level of support for, detection of, and
>> > enforcing of SubUnit/TAP streams.
>> >
>> > The possible values for `tests-protocol' will be:
>> >  - tests-protocol=tap
>> >      All test scripts are expected to use the TAP protocol.
>> >  - tests-protocol=subunit
>> >      All test scripts are expected to use the SubUnit protocol.
>> >  - tests-protocol=adaptive
>>
>> The way you describe "adaptive", it sounds like it should rather be
>> named something like "detect" or "detected" or so.
>>
> I'd like to withdraw this proposal now that we can define per-extension test
> protocols. Having our hypothetical "client developer" rename his test scripts
> as they get converted to use TAP/SubUnit is IMO better than us having to
> implement in Automake a probably non-trivial "metaparser" that could end up
> being scarcely used anyway. WDYT?
>
>> >      Each test script is expected to print on its first line of output
>> >      which protocol it uses (the exact format of this special line is
>> >      still to be determined); if this line is unrecognized, the driver
>> >      should assume that the test script uses no protocol. Also, in
>> >      this case, we should continue to honour XFAIL_TESTS. All of this
>> >      should help to maximize backward-compatibility.
>>
>>
>>
>> > 3. Console output from the test driver
>> > ---------------------------------------
>> >
>> > This output should remain as close as possible to the one already
>> > provided by the current parallel-tests driver. The following example
>> > should help to clarify what I mean.
>> >
>> > Assume we are using `tests-protocol=adaptive', and let TESTS be defined
>> > to "pass.test skip.test subunit.test tap.test". Here, `pass.test' and
>> > `skip.test' are test scripts that use no protocol (and that exit with
>> > status `0' and `77' respectively), `subunit.test' is a test script using
>> > the SubUnit protocol and containing four testcases (one passing, one
>> > failing, one skipped and one which incurs an expected failure), and
>> > `tap.test' is a test script using the TAP protocol which runs two
>> > successful testcases, then encounters an internal error and bails out
>> > (using the "Bail out!" directive).
>> >
>> > With such a setup, this is the output I'd expect from "make check":
>> >   PASS: pass.test
>> >   SKIP: skip.test
>> >   PASS: subunit.test [testcase name/description]
>> >   FAIL: subunit.test [testcase name/description]
>> >   SKIP: subunit.test [testcase name/description] [reason for skipping]
>> >   XFAIL: subunit.test [testcase name/description] [failure reason]
>> >   PASS: tap.test [testcase name/description]
>> >   PASS: tap.test [testcase name/description]
>> >   ERROR: tap.test [reason for the bailout]
>> >
>> > Of course, the `color-tests' option should make the above output properly
>> > colorized; the attached html file shows what colors I'd expect.
>>
>> Wrt. the details of the output, I wouldn't fix things too early.
>> Two things to consider here:
>> - if tests are run in parallel, you want to avoid intermixing output
>>   from different tests as much as possible,
>>
> I disagree. Once the test script name is printed in each line which reports
> the result of one of its test cases, I see no issue with intermixed lines.
> I.e., I don't believe that this:
>
>   PASS: pass.test
>   PASS: subunit.test [foo]
>   PASS: tap.test [t1]
>   SKIP: skip.test
>   PASS: tap.test [t2]
>   FAIL: subunit.test [bar]
>   SKIP: subunit.test [baz] [reason for skipping]
>   ERROR: tap.test [reason for the bailout]
>   XFAIL: subunit.test [quux] [failure reason]
>
> is any less clear than this:
>
>   PASS: pass.test
>   SKIP: skip.test
>   PASS: subunit.test [testcase name/description]
>   FAIL: subunit.test [testcase name/description]
>   SKIP: subunit.test [testcase name/description] [reason for skipping]
>   XFAIL: subunit.test [testcase name/description] [failure reason]
>   PASS: tap.test [testcase name/description]
>   PASS: tap.test [testcase name/description]
>   ERROR: tap.test [reason for the bailout] XXX
>
>>   but even more than that, you want to avoid intermixing within lines.
>>
> OTOH, I do believe this is a real concern, to be carefully addressed and
> tested for. Thanks for bringing this up.

For both TAP and subunit, the test script running needs to feed into a
single parser: the issue with intermingling then is all about how the
parsers structure their output. Ideally you would use a threadsafe
structure. For subunit, *if* you were working in Python you could use
the existing multithreaded logic testrepository uses, which can mingle
multiple streams correctly, preserving datestamps - a single UI parser
then handles all the combined output.

>>   Gathering output and
>>   printing it all at once can help here, but the downside is possibly
>>   losing output if tests are interrupted, or the driver doesn't finish
>>   correctly.
>> - some sort of bypass of the (currently hidden) output is helpful,
>>   some gnulib-using packages do that already by redirecting stderr
>>   before running the test, and printing a reason for skipping from
>>   within the test.
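
To make the "no intermixing within lines" concern concrete, here is a
minimal Python sketch. It is only an illustration under stated
assumptions, not how testrepository or Automake handles this: the
`LineReporter` class and the `run_fake_script` helper are hypothetical
names made up for the example. Each parser thread builds a complete
result line and writes it while holding one shared lock, so lines from
different test scripts may interleave but a line is never split.

```python
import sys
import threading


class LineReporter:
    """Hypothetical helper: emit whole result lines from many threads."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def report(self, status, script, detail=""):
        # Build the full line first, then emit it with a single write
        # under the lock, so two threads never interleave mid-line.
        line = f"{status}: {script}"
        if detail:
            line += f" [{detail}]"
        with self._lock:
            self._stream.write(line + "\n")
            self._stream.flush()


def run_fake_script(reporter, script, results):
    # Stand-in for one parser thread consuming one test script's output.
    for status, name in results:
        reporter.report(status, script, name)


if __name__ == "__main__":
    reporter = LineReporter(sys.stdout)
    jobs = {
        "tap.test": [("PASS", "t1"), ("PASS", "t2")],
        "subunit.test": [("PASS", "foo"), ("FAIL", "bar")],
    }
    threads = [
        threading.Thread(target=run_fake_script, args=(reporter, s, r))
        for s, r in jobs.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The order of lines across scripts is not deterministic here, which is
the trade-off discussed above: nothing is buffered per script, so no
output is lost on interruption, but results from different scripts mix.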
>>
> [thinking about this put on the TODO list]
>
>> I think the basic premise to take into account is that the test driver
>> author (you) may not know all requirements a future test author may
>> have.
>>
> That's why I like my current idea of allowing the client developers
> to provide their own testsuite driver.
>
>> > 4. RST support and HTML generation: should be dropped?
>> > ------------------------------------------------------
>>
>> Good question. You could do a poll on the automake list (in a separate
>> mail & thread, to gain visibility). Or just defer the implementation
>> for later.
>>
>> In another mail:
>> > BTW, I also hope that new interface I'm planning will make it easier to
>> > implement HTML report generation also for TAP and SubUnit tests, with a
>> > consistent reuse of the existing code. In which case my considerations
>> > above will become moot.
>>
>> Cool.
>>
> The downside of this is that it might place additional burdens on the
> writer of test drivers. Hmmm... maybe a new option "no-html-test" or
> similar is warranted?

At least for subunit, and I believe TAP aficionados will agree for TAP,
you'd generally run subunit2html [made-up name] to generate an HTML
report. An example of such a formatter is subunit2junithtml, and I
believe samba have a perl based html formatter for subunit already
(Jelmer will know more about that).

-Rob
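
For reference, a rough Python sketch of how TAP result lines could map
onto the console format proposed earlier in this thread
("STATUS: script [description] [reason]"). This is an assumption-laden
illustration, not the planned Automake driver: `format_tap_line` is a
hypothetical name, the mapping of a passing TODO test to XPASS is my
guess, and escaped `#` characters in descriptions are not handled.

```python
import re


def format_tap_line(script, line):
    """Return a console line for one TAP line, or None if it is not a result."""
    line = line.strip()
    if line.startswith("Bail out!"):
        reason = line[len("Bail out!"):].strip()
        return f"ERROR: {script} [{reason}]" if reason else f"ERROR: {script}"

    m = re.match(r"(not )?ok\b(.*)", line)
    if not m:
        return None  # plan line ("1..N"), diagnostics, or other output

    failed = m.group(1) is not None
    desc, _, directive = m.group(2).partition("#")
    # Drop the optional test number and the " - " separator.
    desc = re.sub(r"^\s*\d*\s*-?\s*", "", desc).strip()

    words = directive.strip().split(None, 1)
    kind = words[0].upper() if words else ""
    reason = words[1] if len(words) > 1 else ""

    if kind.startswith("SKIP"):
        status = "SKIP"
    elif kind == "TODO":
        status = "XFAIL" if failed else "XPASS"
    else:
        status = "FAIL" if failed else "PASS"
        reason = ""  # an unknown directive is not treated as a reason

    out = f"{status}: {script}"
    if desc:
        out += f" [{desc}]"
    if reason:
        out += f" [{reason}]"
    return out


if __name__ == "__main__":
    sample = [
        "1..3",
        "ok 1 - first",
        "not ok 2 - second # TODO not implemented",
        "Bail out! database down",
    ]
    for tap_line in sample:
        formatted = format_tap_line("tap.test", tap_line)
        if formatted is not None:
            print(formatted)
```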