"Maciej W. Rozycki" <ma...@linux-mips.org> writes:
> On Wed, 14 Jan 2015, Richard Sandiford wrote:
>
>> >  Taking care that the default compilation mode does not conflict (e.g. 
>> > MIPS16, incompatible) and taking any exceptions into account (e.g. n64, 
>> > unsupported) I presume, right?
>> 
>> mips.exp sorts that out for you.  Adding "-mmicromips" or "(-mmicromips)"
>> to dg-options forces (or at least is supposed to force) the overall flags
>> to be compatible with microMIPS.
>> 
>> The aim of mips.exp is to avoid skipping tests wherever possible.  If
>> someone runs the testsuite with -mips16 and we have a -mmicromips test,
>> it's better to remove -mips16 for that test than to skip the test entirely.
>
>  OK, good to know, thanks; that works for compilation tests only, though.  
> For execution tests, however, what if the target hardware used is 
> incompatible, or there is no suitable C library (multilib configuration) 
> available for the option requested?  E.g. any hardware supporting MIPS16 
> won't run microMIPS code, n64 tests won't link if there's no n64 multilib, etc.

In those cases it just does a compile-only test, again on the basis that
it's better than skipping the test entirely.  See the big comment at the
beginning of mips.exp if you're interested in the specific details of
how this works and what is supported.
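
For example (untested, and the target board is only a placeholder), you
could run the MIPS-specific tests with -mips16 in the default flags and
let mips.exp adjust them per test:

    # Hypothetical sketch: run gcc.target/mips with -mips16 as the
    # default.  For a test carrying { dg-options "-mmicromips" },
    # mips.exp is expected to drop -mips16 and add -mmicromips, and to
    # downgrade the test to compile-only if it cannot be executed.
    make check-gcc RUNTESTFLAGS="mips.exp --target_board='unix{-mips16}'"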

>> >  Please always try to test changes reasonably, i.e. at least o32, 
>> > o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable, 
>> > plus any options that may be relevant, unless it is absolutely clear 
>> > that ABI/ISA variations do not matter for the change proposed.
>> 
>> TBH this seems a bit much.  On the one hand it's more testing than you'd
>> get for almost any other target, but on the other it leaves out important
>> differences like MIPS I vs. MIPS II vs. MIPS32, MIPS III vs. MIPS IV vs. MIPS64,
>> r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.
>
>  I disagree; I listed what I consider the base set of configurations for 
> the MIPS target, spanning the major target variations:
>
> - MIPS/MIPS16/microMIPS can be treated almost as distinct processor 
>   architectures: the instruction sets have much in common in spirit, but 
>   there are enough pitfalls and traps,
>
> - n32 covers a substantially different calling convention plus (for Linux) 
>   SVR4 PIC code that normally isn't used for executables with o32 these 
>   days,
>
> - n64 covers all that n32 does plus a 64-bit target.
>
> I realise ELF testing may be difficult for some people due to the hassle 
> of setting up the runtime, so skipping an ELF target can be justified; 
> otherwise I think it makes sense to run such testing for at least one 
> configuration from the list above, for good measure.  The same goes for 
> running some of them with big and some with little endianness.
>
>  You've got a point with architecture levels or processor models.  I think 
> r6 should be treated as a distinct architecture and tested as the fourth 
> variant alongside MIPS/MIPS16/microMIPS, but such a test environment may not 
> yet be available to many.  The rest I'm afraid will mostly matter for 
> changes made to the middle end rather than the MIPS backend, in which case 
> chances are MIPS testing will not be run at all.  A test bot (similar to 
> JBG's build bot, but extended to run testing too) can help in this case; I 
> don't know if anyone runs one.
>
>  As to DSP, MSA, hard-float, soft-float, 2008-NaN, etc., I'd only expect 
> to run appropriate testing (i.e. with `-mdsp', etc.) across the 
> configurations named above whenever relevant code is changed; some of this 
> stuff is irrelevant or unavailable for some of the configurations above 
> (e.g. n64 DSP, IIRC), or may have no influence (e.g. the NaN encoding), in 
> which case it may be justified to skip them.

But soft vs. hard float in particular is a significant difference in
terms of the ABI, especially when it comes to MIPS16 interworking
(but even apart from that).

>> I think we just have to accept that there are so many possible
>> combinations that we can't test everything that's potentially relevant.
>> I think it's more useful to be flexible than to prescribe a particular list.
>
>  Of course flexibility is needed, I absolutely agree.  I consider the list 
> I quoted the base set; I've used it for all recent submissions.  Then for 
> each individual change I've asked myself: does it make sense to run all 
> this testing?  If, for example, a change touched `if (TARGET_MICROMIPS)' 
> code only, then clearly running any non-microMIPS testing adds no value.  
> And then: will this testing provide enough coverage?  If not, then what 
> else needs to be covered?
>
>  As I say, testing is cheap: you can fire off a bunch of test suites in 
> parallel under Linux running on QEMU in system-emulation mode.  From my 
> experience, on decent x86 hardware whole GCC/G++ testing across the five 
> configurations named will complete in just a few hours, which you can 
> spend doing something else.  And if any issues are found, then the patch 
> submitter, who's often the actual author and knows his code best, is 
> in the best position to understand what happened.
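
For reference, that kind of parallel run is easy to script; a rough
sketch, where the per-configuration build trees and QEMU board names
are hypothetical:

    # Test several configurations in parallel, one configured build
    # tree per ABI/ISA variant, each pointed at its own QEMU-hosted
    # DejaGnu board.  All names here are placeholders.
    for cfg in o32 o32-mips16 o32-micromips n32 n64; do
      (cd build-$cfg &&
       make -k check RUNTESTFLAGS="--target_board=qemu-mips-$cfg") &
    done
    wait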
>
>  OTOH chasing down a problem later on is expensive (difficult): first it 
> has to be narrowed down, often based on a user bug report rather than the 
> discovery of a test-suite regression.  Even making a reproducible test 
> case from such a report may be tough.  And then you have the choice of 
> either debugging the problem from scratch, or (if you have an easy way to 
> figure out it is a regression, such as by passing the test case through an 
> older version of the compiler whose binary you already have handy) 
> bisecting the tree to find the offending commit (not readily available 
> with SVN AFAIK, though I have done it manually in the past) and 
> starting from there.  Both ways are tedious and time-consuming.
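
For what it's worth, the manual bisection can be scripted easily
enough.  A rough sketch, where the revision range and the
build-and-test helper are hypothetical:

    # Binary-search SVN revisions for the first bad commit, assuming
    # the bug is absent at r$good, present at r$bad, and that
    # ./build-and-test.sh rebuilds the compiler and exits 0 on success.
    good=215000; bad=219000
    while [ $((bad - good)) -gt 1 ]; do
      mid=$(((good + bad) / 2))
      svn update -r $mid
      if ./build-and-test.sh; then good=$mid; else bad=$mid; fi
    done
    echo "first bad revision: r$bad"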
>
>> Having everyone test the same multilib combinations on the same target
>> isn't necessarily a good thing anyway.  Diversity in testing (between
>> developers) is useful too.
>
>  Sure, people will undoubtedly use different default options; folks at 
> Cavium will compile for Octeon rather than the base architecture, for 
> example.  Other people may have DSP enabled.  Etc., etc.  That IMHO 
> does not preclude testing across more than just a single configuration.

Yeah, but that's just the way it goes.  By trying to get everyone to
test with the options that matter to you, you're reducing the amount of
work you have to do when tracking regressions on those targets, but you're
saying that people who care about Octeon or the opposite float ABI have
to go through the process you describe as "tedious and time-consuming".

And you don't avoid that process anyway, since people making changes
to target-independent parts of GCC are just as likely to introduce
a regression as those making changes to MIPS-only code.  If testing
is cheap and takes only a small number of hours, and if you want to
make it less tedious to track down a regression, continuous testing
would give you a narrow window for each regression.

Submitters should be free to test on what matters to them rather than
have to test a canned set of multilibs on specific configurations.

Thanks,
Richard
