Re: duplicate arm test results?

Martin Sebor via Gcc Wed, 23 Sep 2020 08:34:14 -0700

On 9/23/20 2:54 AM, Christophe Lyon wrote:

On Wed, 23 Sep 2020 at 01:47, Martin Sebor <mse...@gmail.com> wrote:


On 9/22/20 9:15 AM, Christophe Lyon wrote:

On Tue, 22 Sep 2020 at 17:02, Martin Sebor <mse...@gmail.com> wrote:


Hi Christophe,

While checking recent test results I noticed many posts with results
for various flavors of arm that, at a high level, look like duplicates
of one another.

For example, the batch below all have the same title, but not all
of the contents are the same.  The details (such as test failures)
on some of the pages are different.

Can you help explain the differences?  Is there a way to avoid
the duplication?


Sure, I am aware that many results look the same...


If you look at the top of the report (~line 5), you'll see:
Running target myarm-sim
Running target myarm-sim/-mthumb/-mcpu=cortex-m3/-mfloat-abi=soft/-march=armv7-m
Running target myarm-sim/-mthumb/-mcpu=cortex-m0/-mfloat-abi=soft/-march=armv6s-m
Running target myarm-sim/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd
Running target myarm-sim/-mthumb/-mcpu=cortex-m7/-mfloat-abi=hard/-march=armv7e-m+fp.dp
Running target myarm-sim/-mthumb/-mcpu=cortex-m4/-mfloat-abi=hard/-march=armv7e-m+fp
Running target myarm-sim/-mthumb/-mcpu=cortex-m33/-mfloat-abi=hard/-march=armv8-m.main+fp+dsp
Running target myarm-sim/-mcpu=cortex-a7/-mfloat-abi=soft/-march=armv7ve+simd
Running target myarm-sim/-mthumb/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd

For all of these, the first line of the report is:
LAST_UPDATED: Tue Sep 22 09:39:18 UTC 2020 (revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c)
TARGET=arm-none-eabi CPU=default FPU=default MODE=default

I have other combinations where I override the configure flags, e.g.:
LAST_UPDATED: Tue Sep 22 11:25:12 UTC 2020 (revision
r9-8928-gb3043e490896ea37cd0273e6e149c3eeb3298720)
TARGET=arm-none-linux-gnueabihf CPU=cortex-a9 FPU=neon-fp16 MODE=thumb
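
A minimal sketch of how the two kinds of lines quoted above could be read
mechanically, assuming they always have the shape shown in this message.
The helper names parse_config_line and parse_running_target are made up
for illustration and are not part of any existing GCC script.

```python
import re

def parse_config_line(line):
    """Return the KEY=value pairs (TARGET, CPU, FPU, MODE) as a dict."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

def parse_running_target(line):
    """Split a 'Running target' line into the board name and its flags.

    Returns None when the line is not a 'Running target' line.
    """
    prefix = "Running target "
    if not line.startswith(prefix):
        return None
    board, *flags = line[len(prefix):].strip().split("/")
    return {"board": board, "flags": flags}

cfg = parse_config_line(
    "TARGET=arm-none-linux-gnueabihf CPU=cortex-a9 FPU=neon-fp16 MODE=thumb"
)
tgt = parse_running_target(
    "Running target myarm-sim/-mthumb/-mcpu=cortex-m3/-mfloat-abi=soft/-march=armv7-m"
)
print(cfg["CPU"], tgt["board"], tgt["flags"])
```

With both pieces parsed, two reports that share a subject line can be told
apart by comparing the flags list and the CPU/FPU/MODE values.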

I tried to fit something into the subject line, but that didn't seem
convenient (it would be too long, and I fear modifying the awk
script...)


Without some indication of a difference in the title there's no way
to know which result to look at, and checking all of them isn't really
practical.  The duplication (and the sheer number of results) also
makes it more difficult to find results for targets other than arm-*.
There are about 13,000 results for September and over 10,000 of those
for arm-* alone.  It's good to have data but when there's this much
of it, and when the only form of presentation is as a running list,
it's too cumbersome to work with.


To help me track & report regressions, I build higher-level reports like:
https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/0latest/report-build-info.html
where it's more obvious what configurations are tested.


That looks awesome!  The regression indicator looks especially
helpful.  I really wish we had an overview like this for all
results.  I've been thinking about writing a script to scrape
gcc-testresults and format an HTML table kind of like this for
years.  With that, the number of posts sent to the list wouldn't
be a problem (at least not for those using the page).  But it
would require settling on a standard format for the basic
parameters of each run.
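
A rough sketch of the table-formatting half of such a script, assuming
each run has already been reduced to a dict of basic parameters. The
function name runs_to_html and the column names are hypothetical, and
scraping the list archive itself is not shown.

```python
from html import escape

def runs_to_html(runs, columns):
    """Render a list of run dicts as a plain HTML table, one row per run."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    rows = []
    for run in runs:
        # Missing parameters become empty cells rather than errors.
        cells = "".join(
            f"<td>{escape(str(run.get(c, '')))}</td>" for c in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n<tr>" + head + "</tr>\n" + "\n".join(rows) + "\n</table>"

runs = [
    {"target": "arm-none-eabi", "cpu": "default", "mode": "default"},
    {"target": "arm-none-linux-gnueabihf", "cpu": "cortex-a9", "mode": "thumb"},
]
print(runs_to_html(runs, ["target", "cpu", "fpu", "mode"]))
```

The sketch only works if every run exposes the same parameter names, which
is the standard-format requirement mentioned above.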


Each line of such reports can send a message to gcc-testresults.

I can control when such emails are sent, independently for each line:
- never
- for daily bump
- for each validation

So, I can easily reduce the number of emails (by disabling them for
some configurations), but that won't make the subject more informative.
I included the short revision (rXX-YYYY) in the title to make it clearer.

The number of configurations has grown over time because we regularly
found regressions in configurations not tested previously.

I can probably easily add the values of --with-cpu, --with-fpu,
--with-mode and RUNTESTFLAGS as part of the
[<branch> revision rXX-YYYY-ZZZZZ] string in the title; would that help?
I fear that's going to make for very long subject lines.

It would probably be cleaner to update test_summary such that it adds
more info as part of $host (as in "... testsuite on $host"), so that it
grabs useful configure parameters and runtestflags; however, this would
be more controversial.


Until a way to present summaries is available, would grouping
the results of multiple runs in the same "basic configuration"
(for some definition of basic) in the same post work for you?

Martin
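
One way to picture the grouping suggested above, as a sketch only: here
"basic configuration" is assumed to mean the (target, revision) pair, and
each run is assumed to be a dict with "target", "revision" and "flags"
keys. Both the function name group_runs and that definition are
hypothetical.

```python
from collections import defaultdict

def group_runs(runs, key_fields=("target", "revision")):
    """Collect the flag sets of all runs that share a basic configuration."""
    groups = defaultdict(list)
    for run in runs:
        key = tuple(run[field] for field in key_fields)
        groups[key].append(run["flags"])
    return dict(groups)

runs = [
    {"target": "arm-none-eabi", "revision": "r11-3343", "flags": ""},
    {"target": "arm-none-eabi", "revision": "r11-3343",
     "flags": "-mthumb/-mcpu=cortex-m3/-mfloat-abi=soft/-march=armv7-m"},
    {"target": "arm-none-linux-gnueabihf", "revision": "r9-8928", "flags": ""},
]
for key, flag_sets in group_runs(runs).items():
    print(key, "->", len(flag_sets), "run(s) in one post")
```

Each key would then correspond to a single post whose body lists the
results for every flag set in the group.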


Christophe

Martin


I think HJ generates several "running targets" in the same log; I run
them separately to benefit from the compute farm I have access to.

Christophe

Thanks
Martin

Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON

       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
       Results for 11.0.0 20200922 (experimental) [master revision
r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
arm-none-eabi   Christophe LYON
