On Fri, 20 Oct 2023 at 18:55, Alex Herbert <alex.d.herb...@gmail.com> wrote: > > The chi-square critical value (13.82) is correct: > > >>> from scipy.stats import chi2 > >>> chi2(2).isf(0.001) > 13.815510557964274 > > The test seems to fail with the expected frequency when run locally. I > annotated with: > > @RepeatedTest(value = 100000) > > I observe 93 failures (just under 1 in 1000). So it is strange this > fails a lot on the GH CI build. > > We could just use a fixed Random argument to the call that is > ultimately performing the random string generation: > > random(count, 0, chars.length, false, false, chars, random()); > > Switch the test to: > > Random rng = new Random(0xdeadbeef) > > gen = RandomStringUtils.random(6, 0, 3, false, false, chars, rng); > > You will see a drop in coverage by not exercising the public API. > > The alternative is to change the chi-square critical value: > > 1 in 10,000: 18.420680743952364 > 1 in 100,000: 23.025850929940457 > 1 in 1,000,000: 27.631021115928547 > > Alex
Also note that although the test fails 1 in 1000 times, we run 4 builds in CI (coverage + 3 JDKS). So we expect to see failure with a p-value of: 1 - (1 - 0.001)^4 = 0.00399 This is the probability of not seeing a failure subtracted from 1. It is approximately 1 in 250. I did check the computation of the chi-square statistic and AFAIK it is correct. My suggestion for a first change is to bump the critical value to the level for 1 in 100,000. We should then see failures every 25,000 GH builds. If the frequency is more than that then we have to assume that the ThreadLocalRandom instance is not uniformly sampling from the set of size 3. I find this unlikely as the underlying algorithm for ThreadLocalRandom is good [1]. Alex [1] IIRC it passes the Test U01 BigCrush test for random generators --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org