Karl Berry <k...@freefriends.org>, Mon Apr 17 2023 22:16:38 GMT+0200
(Central European Summer Time)
Hi Bogdan,
Then, I analysed the files and added the trick from t/backcompat2.sh
(if possible) and/or removed the extra calls to $ACLOCAL (if possible).
Thanks much for looking into this.
Short version: after a few hours of testing and modifications, I
*may* have saved up to 1 minute and 12 seconds of testing...
Well, at least you get kudos for doing all the research :).
:)
You may look at the attached patch as the result of the investigation
and then ... you're free to completely ignore it :). It works for me,
but I wonder whether it might cause more confusion than it's worth...
I agree. Not worth the complications.
t/backcompat-acout.sh: 35 -> 24s
That seems to me like the only one that might be worth applying the
patch for. Quite a bit more savings than anything else in the list.
Yes. Aclocal is called in a loop here, always with the same set of
(Automake) macros in configure.ac, so it probably always generates the
same aclocal.m4 (no external macros called either). This duplication
can be avoided. Strange that the trick doesn't work in all cases, but
at least it works here.
# A trick to make the test run muuuch faster, by avoiding repeated
# runs of aclocal (one order of magnitude improvement in speed!).
echo 'AC_INIT(x,0) AM_INIT_AUTOMAKE' > configure.ac
Alternatively, I wonder how much this is really saving. Maybe the trick
should not be used anywhere.
The gain is 5 seconds, I just checked: from about 17.5s to about
12.5s, for the whole test of course. So, in this particular case, the
saving was meaningful (it cut 25-30% of the time, even if that is just
5 wallclock seconds).
On the other hand, the comparison is between calling aclocal 7 times
and calling it once, so the 6 skipped aclocal calls give about 5s of
savings, or about 1s per aclocal call. In a loop with even tens of
iterations the saving would be visible, but otherwise it's just single
seconds, or even literally 1 second, as some of my results show.
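Just to make the idea concrete, the trick boils down to roughly this
(a rough sketch of my own, not the actual test code - the loop and the
grep check are made up for illustration, and $ACLOCAL/$AUTOCONF are
the usual test-suite variables):

  # Write a minimal configure.ac and run aclocal just once; every
  # later configure.ac uses the same set of Automake macros, so the
  # generated aclocal.m4 can be reused as-is.
  echo 'AC_INIT(x,0) AM_INIT_AUTOMAKE' > configure.ac
  $ACLOCAL

  for v in 0 1 2; do                       # hypothetical iterations
    echo "AC_INIT([x],[$v]) AM_INIT_AUTOMAKE" > configure.ac
    $AUTOCONF                              # no extra $ACLOCAL run here
    grep "PACKAGE_VERSION='$v'" configure  # hypothetical check
  done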
- having 1277 .sh files in 't/' means that even if each one runs in only
30 seconds, you have over 10 hours of testing just from the sheer number
of tests,
Indeed. The only practical way to run make check is in parallel. I
discovered that early on :). It still takes painfully long for me
(10-15min at best, on a new and fast machine).
I have 4 vcores and I'm afraid the full set would literally take
hours to complete on my machine. 10-15 minutes is a luxury! :)
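For completeness, the parallel run we are talking about is just the
standard harness usage, something along these lines (if I remember the
harness options correctly - adjust -j to your core count, and TESTS
narrows the run to a subset):

  make -j8 check
  # or only a subset of the tests:
  make check TESTS='t/backcompat-acout.sh t/pr401b.sh'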
- it may be better to determine if there are duplicate tests
Sounds awfully hard to do.
I agree. You would have to compare each test against every other one -
sometimes within "a reasonable group" (like tests with the same name
and just a number appended), sometimes against all the other tests
(like the ones named after a problem ID). That is on the order of n^2
comparisons :).
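The byte-for-byte duplicates would be the cheap part - a rough sketch
like this would already list them:

  # list any t/*.sh files whose contents hash identically
  # (md5sum prints a 32-char hash, so compare only those columns)
  md5sum t/*.sh | sort | uniq -w32 -D

But the interesting cases are the "nearly identical" ones, and those
are exactly what a checksum cannot see - hence the n^2.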
My impression is that the (vast?) majority of tests are the direct
result of bug reports. I would not be inclined to tweak, remove, merge,
or change them. Even if two tests are nearly identical, that "nearly"
can make all the difference.
Not sure about the "majority" (I didn't read each and every file),
but I totally agree with the last part. Sometimes some simple change
in one of the files is what makes the problem appear and the test
succeed (or fail, if that is what is expected), and e.g. not porting
that change to a new "merged" test may mean we lose the test for an
actual problem without realizing it (in other words, we lose coverage
for some part of the code).
Furthermore, more "atomic" tests let you check single pieces of
functionality and give more "atomic" results - you immediately narrow
down which functionality is failing, vs. having one
"t/test-everything.sh" file, browsing through the log on each failure,
and restarting the (hours-long) test after each fix. We don't want that.
That's the "balance slider" here: more tests = more time (and no
guarantee that aiming for fewer tests would actually merge anything or
take less time), but also more tests = more comfort.
- as you see above, t/pr401b.sh takes 1m42s to run. I wonder if e.g.
running the 'distcheck' target in tests would be the main factor
Sounds very likely to me. Distcheck is inherently a lengthy process. I
can't imagine how it can be sped up. Although I agree that 1:42 seems
rather long for a trivial package like those in the tests.
It runs '$MAKE distcheck' 3 times plus one '$MAKE check' + '$MAKE
distclean' pair, so fortunately it's not a single 'distcheck' that is
taking so long. At 25-30s per 'distcheck' there isn't a whole minute
left to chop off anywhere, but - again - maybe just single seconds.
25-30s doesn't sound so bad compared to 1:42...
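A rough breakdown from those numbers (my estimate, not measured per
step):

  3 x distcheck      ~ 75-90s
  check + distclean  ~ the remaining 12-27s (plus setup)
  total              ~ 1:42 (102s)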
Same case for t/pr401c.sh and t/pr401.sh, although shorter times.
At a glance, I see required='cc libtoolize' in 401b, whereas 401c and
401 only have cc. Testing libtool really is different, and really does
take time. So I'm not sure there's any low-hanging fruit here.
I took a quick look at those 3, and you're probably right. All the
'distcheck' runs are done on different configurations, so they most
probably have to stay as-is.
Thanks again for doing all this work,
:)
--
Regards - Bogdan ('bogdro') D. (GNU/Linux & FreeDOS)
X86 assembly (DOS, GNU/Linux): http://bogdro.evai.pl/index-en.php
Soft(EN): http://bogdro.evai.pl/soft http://bogdro.evai.pl/soft4asm
www.Xiph.org www.TorProject.org www.LibreOffice.org www.GnuPG.org