On Wed, 23 Sep 2020 at 17:50, Christophe Lyon <christophe.l...@linaro.org> wrote:
>
> On Wed, 23 Sep 2020 at 17:33, Martin Sebor <mse...@gmail.com> wrote:
> >
> > On 9/23/20 2:54 AM, Christophe Lyon wrote:
> > > On Wed, 23 Sep 2020 at 01:47, Martin Sebor <mse...@gmail.com> wrote:
> > >>
> > >> On 9/22/20 9:15 AM, Christophe Lyon wrote:
> > >>> On Tue, 22 Sep 2020 at 17:02, Martin Sebor <mse...@gmail.com> wrote:
> > >>>>
> > >>>> Hi Christophe,
> > >>>>
> > >>>> While checking recent test results I noticed many posts with results
> > >>>> for various flavors of arm that at high level seem like duplicates
> > >>>> of one another.
> > >>>>
> > >>>> For example, the batch below all have the same title, but not all
> > >>>> of the contents are the same. The details (such as test failures)
> > >>>> on some of the pages are different.
> > >>>>
> > >>>> Can you help explain the differences? Is there a way to avoid
> > >>>> the duplication?
> > >>>>
> > >>>
> > >>> Sure, I am aware that many results look the same...
> > >>>
> > >>>
> > >>> If you look at the top of the report (~line 5), you'll see:
> > >>> Running target myarm-sim
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-m3/-mfloat-abi=soft/-march=armv7-m
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-m0/-mfloat-abi=soft/-march=armv6s-m
> > >>> Running target
> > >>> myarm-sim/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-m7/-mfloat-abi=hard/-march=armv7e-m+fp.dp
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-m4/-mfloat-abi=hard/-march=armv7e-m+fp
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-m33/-mfloat-abi=hard/-march=armv8-m.main+fp+dsp
> > >>> Running target
> > >>> myarm-sim/-mcpu=cortex-a7/-mfloat-abi=soft/-march=armv7ve+simd
> > >>> Running target
> > >>> myarm-sim/-mthumb/-mcpu=cortex-a7/-mfloat-abi=hard/-march=armv7ve+simd
> > >>>
> > >>> For all of these, the first line of the report is:
> > >>> LAST_UPDATED: Tue Sep 22 09:39:18 UTC 2020 (revision
> > >>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c)
> > >>> TARGET=arm-none-eabi CPU=default FPU=default MODE=default
> > >>>
> > >>> I have other combinations where I override the configure flags, eg:
> > >>> LAST_UPDATED: Tue Sep 22 11:25:12 UTC 2020 (revision
> > >>> r9-8928-gb3043e490896ea37cd0273e6e149c3eeb3298720)
> > >>> TARGET=arm-none-linux-gnueabihf CPU=cortex-a9 FPU=neon-fp16 MODE=thumb
> > >>>
> > >>> I tried to see if I could fit something in the subject line, but that
> > >>> didn't seem convenient (would be too long, and I fear modifying the
> > >>> awk script....)
> > >>
> > >> Without some indication of a difference in the title there's no way
> > >> to know what result to look at, and checking all of them isn't really
> > >> practical. The duplication (and the sheer number of results) also
> > >> make it more difficult to find results for targets other than arm-*.
> > >> There are about 13,000 results for September and over 10,000 of those
> > >> for arm-* alone. It's good to have data but when there's this much
> > >> of it, and when the only form of presentation is as a running list,
> > >> it's too cumbersome to work with.
> > >>
> > >
> > > To help me track & report regressions, I build higher level reports like:
> > > https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/0latest/report-build-info.html
> > > where it's more obvious what configurations are tested.
> >
> > That looks awesome! The regression indicator looks especially
> > helpful. I really wish we had an overview like this for all
> > results. I've been thinking about writing a script to scrape
> > gcc-testresults and format an HTML table kind of like this for
> > years. With that, the number of posts sent to the list wouldn't
> > be a problem (at least not for those using the page). But it
> > would require settling on a standard format for the basic
> > parameters of each run.
> >
>
> It's probably easier to detect regressions and format reports from the
> .sum files rather than extracting them from the mailing-list.
> But your approach has the advantage that you can detect regressions
> from reports sent by other people, not only by you.
>
>
> > >
> > > Each line of such reports can send a message to gcc-testresults.
> > >
> > > I can control when such emails are sent, independently for each line:
> > > - never
> > > - for daily bump
> > > - for each validation
> > >
> > > So, I can easily reduce the amount of emails (by disabling them for
> > > some configurations),
> > > but that won't make the subject more informative.
> > > I included the short revision (rXX-YYYY) in the title to make it clearer.
> > >
> > > The number of configurations has grown over time because we regularly
> > > found regressions
> > > in configurations not tested previously.
> > >
> > > I can probably easily add the values of --with-cpu, --with-fpu,
> > > --with-mode and RUNTESTFLAGS
> > > as part of the [<branch> revision rXX-YYYY-ZZZZZ] string in the title,
> > > would that help?
> > > I fear that's going to make very long subject lines.
> > >
> > > It would probably be cleaner to update test_summary such that it adds
> > > more info as part of $host
> > > (as in "... testsuite on $host"), so that it grabs useful configure
> > > parameters and runtestflags, however
> > > this would be more controversial.
> >
> > Until a way to present summaries is available, would grouping
> > the results of multiple runs in the same "basic configuration"
> > (for some definition of basic) in the same post work for you?
> >
>
> That's not convenient for me at the moment: each build+make check runs
> on a different server in a scratch area. It sends its results, saves
> the logs and everything else is discarded.
> After that I have a pass to compute regressions once all .sum are
> available, and that's when I build the HTML reports you saw.
> It's not terribly hard to reorganize, but it does require some work
> and probably some disruption. I tend to try to make sure the reports
> and results are still generated while I make changes to the scripts
> :-)
>
> In the meantime, I am updating the title format following the
> suggestions from Richard & Jakub. Hopefully this will be in place
> quite soon, after the currently-running validations have completed.
>

I have updated my scripts, twice, because I discovered that an empty
DEV-PHASE had a special meaning when constructing the revision string....

So a few reports last night had just:
Results for 11.0.0 (GCC) testsuite on XXXX
as title, which can now be as long as:
Results for 8.4.1 [r8-10521 DEFMODE=arm DEFCPU=cortex-a9 DEFFPU=neon-fp16
TESTFLAGS=-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard]
(GCC) testsuite on XXX

The first revision number (11.0.0 and 8.4.1 above) comes from BASEVER,
but I didn't try to replace it with an empty string, as it seems many
things depend on it.

Does it help you?

Thanks,

Christophe

> Thanks,
>
> Christophe
>
> > Martin
> >
> > >
> > > Christophe
> > >
> > >> Martin
> > >>
> > >>>
> > >>> I think HJ generates several "running targets" in the same log, I run
> > >>> them separately to benefit from the compute farm I have access to.
> > >>>
> > >>> Christophe
> > >>>
> > >>>> Thanks
> > >>>> Martin
> > >>>>
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>>
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>>> Results for 11.0.0 20200922 (experimental) [master revision
> > >>>> r11-3343-g44135373fcdbe4019c5524ec3dff8e93d9ef113c] (GCC) testsuite on
> > >>>> arm-none-eabi Christophe LYON
> > >>
> >
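
For anyone who wants to consume the new subject lines from a script (for
instance to build an overview table like the one discussed in this thread),
here is a minimal sketch in Python of how such a title might be split into
fields. The layout is inferred only from the two example titles given in
this message, so treat it as an assumption rather than a specification.
TITLE_RE and parse_title are hypothetical names; they are not part of
test_summary or of any existing script.

```python
import re

# Assumed layout, inferred from the two example titles in this message:
#   Results for <BASEVER> [<revision> KEY=VALUE ...] (GCC) testsuite on <host>
# The bracketed part is optional (it was missing in a few reports).
TITLE_RE = re.compile(
    r"^Results for (?P<basever>\S+)"
    r"(?: \[(?P<bracket>[^\]]*)\])?"
    r" \(GCC\) testsuite on (?P<host>\S+)"
)


def parse_title(title):
    """Return a dict of the title's fields, or None if it does not match."""
    m = TITLE_RE.match(title)
    if m is None:
        return None
    info = {"basever": m.group("basever"), "host": m.group("host")}
    for token in (m.group("bracket") or "").split():
        # Split on the first '=' only, since TESTFLAGS itself contains '='.
        key, sep, value = token.partition("=")
        if sep:
            info[key] = value
        else:
            # A bare token (e.g. r8-10521) is taken to be the short revision.
            info.setdefault("revision", token)
    return info


if __name__ == "__main__":
    example = (
        "Results for 8.4.1 [r8-10521 DEFMODE=arm DEFCPU=cortex-a9 "
        "DEFFPU=neon-fp16 TESTFLAGS=-mthumb/-march=armv8-a/"
        "-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard] (GCC) testsuite on XXX"
    )
    for key, value in parse_title(example).items():
        print(key, value)
```

Titles in the older format (with the date and "(experimental)" between the
version and the bracket) do not match this pattern and yield None, so a real
script would need a second pattern for them.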