On Fri, 11 Jan 2013, Michael Zolotukhin wrote:

> > Personally I'd think a natural starting point on the compiler side would
> > be to write a reasonably thorough and systematic testsuite for such
> > issues.  That would cover all operations, for all floating-point types
> > (including ones such as __float128 and __float80), and conversions between
> > all pairs of floating-point types and either way between each
> > floating-point type and each integer type (including __int128 / unsigned
> > __int128), with operands being any of (constants, non-volatile variables
> > initialized with constants, volatile variables, vectors) and results being
> > (discarded, stored in non-volatile variables, stored in volatile
> > variables), in all the rounding modes, testing both results and exceptions
> > and confirming proper results when an operation is repeated after changes
> > of rounding mode or clearing exceptions.
> 
> We mostly have problems when there is an 'interaction' between
> different rounding modes - so a ton of tests checking correctness
> of a single operation in a specific rounding mode won't catch it.
> We could place all such tests in one file/function so that the
> compiler would transform it as it does now, and we'd catch the
> failure - but in that case we don't need many tests.

Tests should generally be small to make it easier for people to track down 
the failures.  As you note, interactions are relevant - but that means 
tests would do an operation in one rounding mode, check results, repeat in 
another rounding mode, check results (which would catch the compiler 
wrongly reusing the first results), repeat again for each mode.  Tests for 
each separate operation and type can still be separate.

> So, generally I like the idea of having tests covering all the cases
> and then fixing them one-by-one, but I didn't quite see what these
> tests would be beyond the ones from the trackers - it seems useless
> to have a bunch of tests, each of which contains a single operation
> and compares the result, even if we have a version of such a test
> for all datatypes and rounding modes.

I'm thinking in terms of full FENV_ACCESS test coverage, for both 
exceptions and rounding modes, where there are many more things that can 
go wrong for single operations (such as the operation being wrongly 
discarded because the result isn't used, even though the exceptions are 
tested, or a libgcc implementation of a function raising excess 
exceptions).  But even just for rounding modes, there are still various 
uses for systematically covering different permutations.

* Tests should cover both building with -frounding-math (without the 
FENV_ACCESS pragma) and building with the pragma but without that option, 
once the pragma is implemented.

* There's clearly some risk that implementations of __float128 using 
soft-fp have bugs in how they interact with hardware exceptions and 
rounding modes.  These are part of libgcc; there should be test coverage 
for such issues to provide confidence that GCC is handling exceptions and 
rounding modes correctly.  This also helps detect soft-fp bugs generally.

* Some architectures may well have rounding mode bugs in operations 
defined in their .md files.  E.g., conversion of integer 0 to 
floating-point in round-downwards mode on older 32-bit powerpc wrongly 
produces -0.0 instead of +0.0.  One purpose of tests for an issue with 
significant machine dependencies is to allow people testing on an 
architecture other than that originally used to develop the feature to 
tell whether there are architecture-specific bugs.  There are reasonably 
thorough tests of conversions between floating-point and integers 
(gcc.dg/torture/fp-int-convert-*) in the testsuite, which caught several 
bugs when added (especially as regards conversions to/from TImode), and 
sometimes continue to do so - but they only cover round-to-nearest.

* Maybe a .md file wrongly enables vector operations without -ffast-math 
even though they do not handle all floating-point cases correctly.  Since 
this is a case where a risk of problems is reasonably predictable (it's 
common for processors to define vector instructions in ways that do not 
have the full IEEE semantics with rounding modes, exceptions, subnormals 
etc., which means they shouldn't be used for vectorization on such 
processors without appropriate -ffast-math options), verifying that vector 
operations (GNU C generic vectors) handle floating-point correctly is also 
desirable.


Thus, while adding testcases from specific bugs would ensure that those 
very specific tests remained fixed, I don't think it would provide much 
confidence that the overall FENV_ACCESS implementation is at all reliable, 
only that a limited subset of bugs that people had actually reported had 
been fixed (especially, areas such as conversions from TImode to float, 
that people less frequently use, would be at high risk of remaining bugs), 
and it wouldn't be of much use for someone trying to do the 
architecture-specific parts of reliable FENV_ACCESS support for another 
architecture.  As I see it, testcases for individual bugs are the right 
approach where the bug is of the form "GCC does this transformation, for 
this particular combination of operations in an expression, that isn't 
valid in this mode" - as tests for individual operations can't sensibly 
cover all combinations for which there might be a bogus transformation.  
But for the likely many issues regarding individual operations (whether in 
.md files or in libgcc), systematic coverage of those operations is also 
important.

(Bug 27682 is another illustration of how there are FENV_ACCESS bugs in 
individual operations, not just in optimizations.)

-- 
Joseph S. Myers
jos...@codesourcery.com
