I was interested to watch the video of the DejaGnu BOF at the Cauldron.  A 
few issues with DejaGnu for toolchain testing that I've noted but I don't 
think were covered there include:

* DejaGnu has a lot of hardcoded logic to try to find various files in a 
toolchain build directory.  A lot of it is actually for very old toolchain 
versions (using GCC version 2 or older, for example).  The first issue 
with this is that it doesn't belong in DejaGnu: the toolchain should be 
free to rearrange its build directories without needing changes to DejaGnu 
itself (which in practice means there's lots of such logic in the 
toolchain's own testsuites *as well*, duplicating the DejaGnu code to a 
greater or lesser extent).  The second issue is that "make install" 
already knows where to find files in the build directory, and it would be 
better to move towards build-tree testing that installs the toolchain into 
a staging directory and runs tools from there, rather than needing any 
logic in the testsuites at all to let bits of uninstalled tools find 
other bits of uninstalled tools.  (There might still be a few bits like 
setting LD_LIBRARY_PATH required.  But the compiler command lines would be 
much simpler and much closer to how users actually use the compiler in 
practice.)
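A rough self-contained sketch of what I mean (every name and path here is 
invented for illustration - a real version would run "make install 
DESTDIR=..." and let the build system place the files):

```python
# Sketch: instead of teaching the harness where uninstalled pieces live
# inside the build tree, "install" into a scratch prefix and run the
# tool from there, the way a user would.
import os, subprocess, tempfile

build = tempfile.mkdtemp(prefix="build-")
stage = tempfile.mkdtemp(prefix="stage-")

# Stand-in for a freshly built tool somewhere inside the build tree;
# the gcc/xgcc location is a build-internal detail.
tool_src = os.path.join(build, "gcc", "xgcc")
os.makedirs(os.path.dirname(tool_src))
with open(tool_src, "w") as f:
    f.write('#!/bin/sh\necho staged-compiler "$@"\n')
os.chmod(tool_src, 0o755)

# Stand-in for "make install DESTDIR=$stage": the build system, not the
# test harness, knows that xgcc installs as bin/gcc.
bindir = os.path.join(stage, "usr", "local", "bin")
os.makedirs(bindir)
staged_tool = os.path.join(bindir, "gcc")
with open(tool_src) as src, open(staged_tool, "w") as dst:
    dst.write(src.read())
os.chmod(staged_tool, 0o755)

# The testsuite now just puts the staged bindir on PATH; no knowledge
# of the build tree's layout is needed.
env = dict(os.environ, PATH=bindir + os.pathsep + os.environ.get("PATH", ""))
out = subprocess.run(["gcc", "-c", "t.c"], env=env,
                     capture_output=True, text=True).stdout.strip()
print(out)
```

The point is that the command line the testsuite runs is the same one a 
user would type against an installed toolchain.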

* Similarly, DejaGnu has a hardcoded prune_warnings procedure - and again 
GCC adds lots of its own prunes on top; it's not clear hardcoding this in 
DejaGnu is a particularly good idea either.
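The pattern in question looks roughly like this (the regexes and layer 
names below are invented examples, not DejaGnu's or GCC's actual lists):

```python
# Sketch of the two-layer pruning pattern: the harness filters output
# lines it considers harmless, then the toolchain's own testsuite
# filters again, before anything is matched against expectations.
import re

HARNESS_PRUNES = [r"ld: warning: .* linked from archive"]   # hypothetical
TESTSUITE_PRUNES = [r"^.*: In function '.*':$"]             # hypothetical

def prune(output, patterns):
    keep = []
    for line in output.splitlines():
        if not any(re.search(p, line) for p in patterns):
            keep.append(line)
    return "\n".join(keep)

raw = ("t.c: In function 'f':\n"
       "t.c:3: warning: unused variable 'x'\n"
       "ld: warning: foo.o linked from archive\n")
cleaned = prune(prune(raw, HARNESS_PRUNES), TESTSUITE_PRUNES)
print(cleaned)
```

With both layers maintaining such lists, it's hard to say which prunes are 
actually load-bearing and which are fossils for long-dead tool versions.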

* Another piece of unfortunate hardcoding in DejaGnu is how remote-host 
testing uses "-o a.out" when running tools on the remote host - such a 
difference from how they are run on a local host results in lots of issues 
where a tool cares about the output file name in some way (e.g. to 
generate other output files).

* A key feature of QMTest that I like but I don't think got mentioned is 
that you can *statically enumerate the set of tests* without running them.  
That is, a testsuite has a well-defined set of tests, and that set does 
not depend on what the results of the tests are - whereas it's very easy 
and common for a DejaGnu test to have test names (the text after PASS: or 
FAIL:) depending on whether the test passed or failed, or how the test 
passed or failed (no doubt the testsuite authors had reasons for doing 
this, but it conflicts with any automatic comparison of results).  The 
QMTest model isn't wonderfully well-matched to toolchain testing - in 
toolchain testing, you can typically do a single indivisible test 
execution (e.g. compiling a file), which produces results for a large 
number of test assertions (tests for warnings on particular lines of that 
file), and QMTest expects one indivisible test execution to produce one 
result.  But a model where a test can contain multiple assertions, and 
both tests and their assertions can be statically enumerated independent 
of their result, and where the results can be annotated by the testsuite 
(to deal with the purposes for which testsuites stick extra text on the 
PASS/FAIL line) certainly seems better than one that makes it likely the 
set of test assertions will vary in unpredictable ways.
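As a toy sketch of that model (all names here are invented; this is not 
QMTest's or DejaGnu's API): tests and their assertions are enumerable up 
front, and results attach to that fixed set rather than being named on 
the fly as tests run.

```python
# Sketch: a suite whose tests *and* per-test assertions can be listed
# without executing anything, so result comparison across runs is
# well-defined.
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    test_id: str
    assert_id: str          # stable name, independent of the outcome

@dataclass
class Test:
    test_id: str
    assertions: list

def enumerate_assertions(tests):
    # Static enumeration: no test is executed here.
    return [a for t in tests for a in t.assertions]

suite = [
    Test("warn-unused.c", [Assertion("warn-unused.c", "line-3-warning"),
                           Assertion("warn-unused.c", "line-7-warning")]),
    Test("link-basic.c",  [Assertion("link-basic.c", "links")]),
]

# One indivisible execution (e.g. compiling warn-unused.c) later yields
# one result per assertion, with any extra text the testsuite wants kept
# as an annotation rather than spliced into the assertion's name.
results = {(a.test_id, a.assert_id): ("PASS", {"note": ""})
           for a in enumerate_assertions(suite)}
print(len(results))   # 3 assertions, known before any test ran
```

The key property is that the set of keys in the results is the same 
whatever the outcomes are.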

* People in the BOF seemed happy with expect.  I think expect has caused 
quite a few problems for toolchain testing.  In particular, there are or 
have been too many places where expect likes to throw away input whose 
size exceeds some arbitrary limit and you need to hack around those by 
increasing the limits in some way.  GCC tests can generate and test for 
very large numbers of diagnostics from a single test, and some binutils 
tests can generate megabytes of output from a tool (that are then matched 
against regular expressions etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com
