On 01/13/2015 09:01 PM, Phil Steitz wrote:
> On 1/12/15 3:21 PM, Thomas Neidhart wrote:
>> On 01/12/2015 11:17 PM, Phil Steitz wrote:
>>> On 1/12/15 2:30 PM, Thomas Neidhart wrote:
>>>> On 01/12/2015 10:26 PM, Thomas Neidhart wrote:
>>>>> On 01/12/2015 08:09 PM, Phil Steitz wrote:
>>>>>> On 1/12/15 11:37 AM, sebb wrote:
>>>>>>> On 12 January 2015 at 18:11, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>> On 1/12/15 10:50 AM, sebb wrote:
>>>>>>>>> On 11 January 2015 at 22:10, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>>>> On 1/11/15 11:19 AM, Phil Steitz wrote:
>>>>>>>>>>> On 1/10/15 10:49 PM, Phil Steitz wrote:
>>>>>>>>>>>> On 1/9/15 6:09 PM, sebb wrote:
>>>>>>>>>>>>> On 10 January 2015 at 01:01, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>>>>>>>> On 1/9/15 5:32 PM, sebb wrote:
>>>>>>>>>>>>>>> On 9 January 2015 at 23:48, sebb <seb...@gmail.com> wrote:
>>>>>>>>>>>>>>>> Of the last 6 runs, only 1 had a problem with unit test failures.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> All the builds ran on ubuntu3, apart from the failure which ran on H10.
>>>>>>>>>>>>>>>> This may have some bearing on the result; I don't yet know.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I had a quick look at 2 tests that failed:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfect
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfectNegative
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Although the test case has some instance data, these particular tests
>>>>>>>>>>>>>>>> do not use any, so it does not look like a concurrency issue in the
>>>>>>>>>>>>>>>> unit test itself.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The SimpleRegression class has mutable instance data, but the test
>>>>>>>>>>>>>>>> cases create their own instance.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't know anything about the math functions involved, but it looks
>>>>>>>>>>>>>>>> as though Infinity might result from getSignificance() if
>>>>>>>>>>>>>>>> getSlopeStdErr() returns 0, as the latter is used as a divisor. Or if
>>>>>>>>>>>>>>>> the field sumXX is 0 because that is also used as a divisor.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe the H10 host has different floating point hardware?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'll try running some more tests on H10.
>>>>>>>>>>>>>>> the build failed again on H10; exactly the same tests failed as before:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This test:
>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math%20H10/1/console
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Previous failure:
>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math/14/console
>>>>>>>>>>>>>> This is actually a bug. Thanks, sebb (and Jenkins)!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Has been here since 1.x. What is going on is that the data sets
>>>>>>>>>>>>>> used in the test cases are set up to be perfect linear
>>>>>>>>>>>>>> relationships, which should in fact lead to mean square error (and
>>>>>>>>>>>>>> hence slope standard error) equal to 0. The Jenkins box must be
>>>>>>>>>>>>>> getting exact 0. The funny thing is the test is there to validate
>>>>>>>>>>>>>> correct performance for models like this. Its success unfortunately
>>>>>>>>>>>>>> depends on poor precision.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will open a JIRA for this. I don't think it is a release blocker
>>>>>>>>>>>>>> for 3.4.1, as I am sure you would get the same thing in any earlier
>>>>>>>>>>>>>> version of [math].
>>>>>>>>>>>>> OK good to know.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll leave the H10 Jenkins job for now to make it easy to retest.
>>>>>>>>>>>> My first guess here was wrong.
>>>>>>>>>>>> The infinities are being handled correctly for the JDKs I have.
>>>>>>>>>>>> Something must be going awry in the t distribution cumulative
>>>>>>>>>>>> probability computation for +INF on the box that is failing. Is
>>>>>>>>>>>> there a way to find out exactly what JDK and OS version are being used?
>>>>>>>>>>> I just committed a test that tests the t distribution computations
>>>>>>>>>>> directly. It seems to have run clean; but the other test ran clean
>>>>>>>>>>> too. Is there any way to force the build to use the host that fails?
>>>>>>>>>> I can't make any sense of what is going on with the Jenkins builds.
>>>>>>>>>> Clean runs and then lots of errors. This one explains the
>>>>>>>>>> SimpleRegression "problem" (which is not a problem with that class
>>>>>>>>>> at least)
>>>>>>>>>>
>>>>>>>>>> testCumulativeProbablilityExtremes(org.apache.commons.math3.distribution.TDistributionTest)
>>>>>>>>>> Time elapsed: 0.001 sec <<< FAILURE!
>>>>>>>>>> java.lang.AssertionError: expected:<1.0> but was:<-Infinity>
>>>>>>>>>>     at org.junit.Assert.fail(Assert.java:88)
>>>>>>>>>>     at org.junit.Assert.failNotEquals(Assert.java:743)
>>>>>>>>>>     at org.junit.Assert.assertEquals(Assert.java:494)
>>>>>>>>>>     at org.junit.Assert.assertEquals(Assert.java:592)
>>>>>>>>>>     at org.apache.commons.math3.distribution.TDistributionTest.testCumulativeProbablilityExtremes(TDistributionTest.java:109)
>>>>>>>>>>
>>>>>>>>>> In earlier runs this ran clean. There is nothing non-deterministic
>>>>>>>>>> about this test (or quite a few of the others that randomly seem to
>>>>>>>>>> fail).
>>>>>>>>>>
>>>>>>>>>> I wonder if we have a bad CPU or something somewhere.
>>>>>>>>> AFAICS all the failed builds ran on H10.
>>>>>>>>>
>>>>>>>>> IMO it is consistent; the apparent randomness comes from the fact that
>>>>>>>>> there are several Ubuntu hosts, including H10.
>>>>>>>> Am I reading it / looking at the wrong one, or did this one succeed?
>>>>>>>>
>>>>>>>> https://builds.apache.org/view/All/job/Commons%20Math%20H10/6/
>>>>>>>>
>>>>>>>> That one was right after I added tests confirming that the t
>>>>>>>> distribution cum prob handles INFs correctly.
>>>>>>> That did run on H10 and did succeed; I'd not noticed that one before.
>>>>>>>
>>>>>>> I think it is still true that the failures have only occurred on H10.
>>>>>>>
>>>>>>> However, the latest one is failing:
>>>>>>>
>>>>>>> https://builds.apache.org/job/Commons%20Math/24/console
>>>>>>>
>>>>>>> This is on H11 - I think that's the first time H11 has been used.
>>>>>>>
>>>>>>> I suppose it's possible that H10 and H11 have a common failing, but it
>>>>>>> seems less likely.
>>>>>>>
>>>>>>> I added a bit more debug - showing the value of sumXX - but that seems
>>>>>>> OK on H11.
>>>>>>>
>>>>>>> I just added a bit more debug.
>>>>>> I am pretty sure the SimpleRegressionTest failure is actually caused
>>>>>> by the same thing causing the t-distribution test to fail (the
>>>>>> reason I added that one).
>>>>>>
>>>>>> One that is more straightforward to chase is this one, which fails
>>>>>> pretty consistently when "bad things happen"
>>>>>>
>>>>>> testExpInf(org.apache.commons.math3.complex.ComplexTest) Time elapsed:
>>>>>> 0.001 sec <<< FAILURE!
>>>>>> java.lang.AssertionError: expected:<0.0> but was:<Infinity>
>>>>>>     at org.junit.Assert.fail(Assert.java:88)
>>>>>>     at org.junit.Assert.failNotEquals(Assert.java:743)
>>>>>>     at org.junit.Assert.assertEquals(Assert.java:494)
>>>>>>     at org.junit.Assert.assertEquals(Assert.java:592)
>>>>>>     at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:76)
>>>>>>     at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:84)
>>>>>>     at org.apache.commons.math3.complex.ComplexTest.testExpInf(ComplexTest.java:788)
>>>>>>
>>>>>> I would wager that what is going on here is 0.0 * -INF = INF.
>>>>> The output returned by the debug statements added by sebb is:
>>>>>
>>>>> expReal=Infinity
>>>>> cosImag=0.5403023058681398
>>>>> sinImag=0.8414709848078965
>>>>> result=(Infinity, Infinity)
>>>>>
>>>>> while expReal should be -Infinity.
>>>>>
>>>>> of course, Math.exp(Infinity) = Infinity.
>>>> oh stupid mistake, please forget my last post.
>>>> I mixed up expReal with the actual real value.
>>> But it should be 0, since expReal should be exp(-INF)
>> just added some more debug output to the test and the result is:
>>
>> real=-Infinity
>> -real=2147483647
>> expReal=Infinity
>>
>> according to FastMath.exp(), with these values, the code path should be
>> as follows:
>>
>>     if (x < 0.0) {
>>         intVal = (int) -x;
>>
>>         if (intVal > 746) {
>>             if (hiPrec != null) {
>>                 hiPrec[0] = 0.0;
>>                 hiPrec[1] = 0.0;
>>             }
>> -->         return 0.0;
>>         }
>>
>>
>> but obviously it doesn't do this. I guess we can only inspect the
>> generated class files for a potential compiler bug.
>
> I did a little more poking about last night in the failed tests and
> the ones I spot-checked could all have had to do with incorrect
> computations of exp(-INF). What is strange is that the cast you
> show above is working correctly (compliant with JLS) and the code
> path should be as you have it there. It seems very strange that
> just this one code path is sporadically having problems.
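
For anyone who wants to check the expected behaviour in isolation, here is a minimal sketch of the branch quoted above. It is not the actual FastMath source: the class name ExpNegInfSketch and the method expEarlyExit are made up for illustration, and everything outside the x < 0 shortcut simply defers to Math.exp(). The imaginary part 1.0 is inferred from the cosImag / sinImag debug values.

```java
// Sketch only: mimics the early-exit branch quoted from FastMath.exp(),
// not the real implementation. Names here are hypothetical.
class ExpNegInfSketch {

    static double expEarlyExit(double x) {
        if (x < 0.0) {
            // JLS 5.1.3: narrowing +Infinity to int gives Integer.MAX_VALUE
            int intVal = (int) -x;
            if (intVal > 746) {
                // exp(x) underflows to zero for such large negative x
                return 0.0;
            }
        }
        // Assumption for this sketch: fall back to the JDK elsewhere
        return Math.exp(x);
    }

    public static void main(String[] args) {
        double real = Double.NEGATIVE_INFINITY;
        double expReal = expEarlyExit(real);

        System.out.println("-real as int = " + ((int) -real));
        System.out.println("expReal      = " + expReal);
        // exp(a + bi) = exp(a) * (cos b + i sin b), with b = 1.0
        System.out.println("result       = (" + expReal * Math.cos(1.0)
                + ", " + expReal * Math.sin(1.0) + ")");
    }
}
```

On a JLS-compliant JVM this should print 2147483647 for the cast, 0.0 for expReal and (0.0, 0.0) for the result, which is what testExpInf expects. Getting Infinity instead would mean the branch above was not taken as written.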

You can see the results of the various test builds here:

https://builds.apache.org/job/Commons%20Math%20H10/

Every time I added more debug output to FastMath.exp(), the tests
succeeded.

I also set up a Jenkins instance with the same Maven / JDK versions to
build commons-math, but have not been able to reproduce the error so far.

Without direct access to one of the failing servers, I doubt that we
will be able to find / fix this problem.

Thomas