I was interested to watch the video of the DejaGnu BOF at the Cauldron. A few issues with DejaGnu for toolchain testing that I've noted but I don't think were covered there include:
* DejaGnu has a lot of hardcoded logic to try to find various files in a toolchain build directory. Much of it is actually for very old toolchain versions (using GCC version 2 or older, for example). The first issue is that this doesn't belong in DejaGnu: the toolchain should be free to rearrange its build directories without needing changes to DejaGnu itself (which in practice means there's lots of such logic in the toolchain's own testsuites *as well*, duplicating the DejaGnu code to a greater or lesser extent). The second issue is that "make install" already knows where to find files in the build directory, and it would be better to move towards build-tree testing installing the toolchain in a staging directory and running tools from there, rather than needing any logic in the testsuites at all to enable bits of uninstalled tools to find other bits of uninstalled tools. (There might still be a few bits like setting LD_LIBRARY_PATH required. But the compiler command lines would be much simpler and much closer to how users actually use the compiler in practice.)

* Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds lots of its own prunes; it's not clear hardcoding this in DejaGnu is a particularly good idea either.

* Another piece of unfortunate hardcoding in DejaGnu is how remote-host testing uses "-o a.out" when running tools on the remote host - such a difference from how they are run on a local host results in lots of issues where a tool cares about the output file name in some way (e.g. to generate other output files).

* A key feature of QMTest that I like but I don't think got mentioned is that you can *statically enumerate the set of tests* without running them.
That is, a testsuite has a well-defined set of tests, and that set does not depend on what the results of the tests are - whereas it's very easy and common for a DejaGnu test to have test names (the text after PASS: or FAIL:) that depend on whether the test passed or failed, or on how it passed or failed (no doubt the testsuite authors had reasons for doing this, but it conflicts with any automatic comparison of results). The QMTest model isn't wonderfully well-matched to toolchain testing: in toolchain testing, a single indivisible test execution (e.g. compiling a file) typically produces results for a large number of test assertions (tests for warnings on particular lines of that file), whereas QMTest expects one indivisible test execution to produce one result. But a model where a test can contain multiple assertions, where both tests and their assertions can be statically enumerated independent of their results, and where the results can be annotated by the testsuite (to deal with the purposes for which testsuites stick extra text on the PASS/FAIL line) certainly seems better than one that makes it likely the set of test assertions will vary in unpredictable ways.

* People in the BOF seemed happy with expect. I think expect has caused quite a few problems for toolchain testing. In particular, there are or have been too many places where expect likes to throw away input whose size exceeds some arbitrary limit, and you need to hack around those by increasing the limits in some way. GCC tests can generate and test for very large numbers of diagnostics from a single test, and some binutils tests can generate megabytes of output from a tool (which are then matched against regular expressions etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com
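The tests-with-multiple-assertions model described above can be sketched roughly as follows. This is a hypothetical illustration, not QMTest's (or DejaGnu's) actual API: names like TestCase and enumerate_assertions are invented here. The point is that the full set of (test, assertion) identifiers is known before anything runs, and result annotations live alongside a result rather than being spliced into the assertion name:

```python
# Sketch of a test model where both tests and their assertions are
# statically enumerable, independent of results.  Illustrative only.

from dataclasses import dataclass


@dataclass
class TestCase:
    """One indivisible execution (e.g. compiling a file) that yields
    results for a fixed, statically declared set of assertions."""
    name: str
    assertions: list[str]  # declared up front; does not depend on results

    def run(self) -> dict[str, str]:
        # A real implementation would invoke the compiler and match
        # expected diagnostics; here every assertion trivially passes.
        # Any variant text (line numbers, options) would be attached as
        # an annotation on the result, not appended to the assertion name.
        return {a: "PASS" for a in self.assertions}


def enumerate_assertions(suite: list[TestCase]) -> list[str]:
    """Full set of test:assertion IDs, computable without running anything."""
    return [f"{t.name}:{a}" for t in suite for a in t.assertions]


suite = [
    TestCase("warn-unused.c", ["warning-line-3", "warning-line-7"]),
    TestCase("link-basic", ["compile", "link", "execute"]),
]

static_ids = enumerate_assertions(suite)   # stable across runs
results = {t.name: t.run() for t in suite}
```

Because static_ids is the same on every run regardless of outcomes, two runs' results can be compared mechanically, assertion by assertion - exactly what result-dependent PASS:/FAIL: text makes unreliable.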