Thank you Alex! The PR is now merged.
Gary On Fri, Oct 20, 2023, 4:09 AM Alex Herbert <alex.d.herb...@gmail.com> wrote: > I've implemented a fix for this in PR 467 [1]. Builds now pass on JDK 21. > > This does not change the DoubleFormat class. It adds javadoc to state > that the formatting is dependent on Double.toString. > > The tests have been updated to use a different range for the random > doubles so the tests across formats should be testing roughly the same > precision. They should not test the full precision output (i.e. 17 > digits). > > Note that this change identified another mismatch between the > reference DecimalFormat and the DoubleFormat when a zero trails the > decimal point. The DecimalFormat class can omit this trailing zero and > the decimal point: > > new DecimalFormat("##0.0##E0").format(1.1299999e-4) => 113E-6 (not > 113.0E-6) > > The inelegant solution I used was to remove .0 from the DecimalFormat > output and recheck the string is a match. This allows the above case > to pass. There are no other cases in the 1000 random numbers that hit > this mismatch case. > > A new test has been added for full precision output. The test requires > that the output format can be parsed back to the same number, i.e. no > loss of precision occurs when formatting. This seemed simpler than > writing custom code to check against the digits output from > Double.toString to ensure all digits are present. > > Alex > > [1] https://github.com/apache/commons-text/pull/467 > > On Mon, 16 Oct 2023 at 18:37, Alex Herbert <alex.d.herb...@gmail.com> > wrote: > > > > TLDR; The Double.toString(double) value is different for > > -9.354004711977437E17 on JDK 21 and earlier JDKs: > > > > JDK 21: -9.354004711977437E17 > > JDK 17: -9.3540047119774374E17 > > > > The DoubleFormat class is built upon Double.toString. So the test > > fails due to this change. > > > > --- > > > > More details: > > > > On JDK 21 Double.toString is dropping the last digit (a 4) as it is > > not required to uniquely represent the double from the next double up > > or down. Note the javadoc for toString states: "This decimal is > > (almost always) the shortest one that rounds to m according to the > > round to nearest rounding policy of IEEE 754 floating-point > > arithmetic." So this is not the closest decimal representation of the > > double, just the closest required for rounding to the double. > > > > I do not think this is a bug. But we should document that the > > DoubleFormat class is built upon the Double.toString representation of > > a double. This string may not be the closest decimal representation of > > the double. Thus final digit errors can be observed when using the > > entire character output of Double.toString compared to other > > formatting classes such as java.lang.DecimalFormat or the numerical > > string representation provided by java.lang.BigDecimal (see examples > > below). > > > > Q. How to fix the test? > > > > The DoubleFormatTest is using 1000 random double values with a base-2 > > exponent of -100 to 100. Then testing against a plain text format of > > 0.00##. The exponent of 9.35e17 is 59. It is so large there are no > > digits after the decimal point. So the test is checking if > > DoubleFormat is accurate to 17 decimal digits of precision, which it > > is not in this case. The other tests using the random numbers test the > > formats: > > > > #,##0.0## > > 0.0##E0 > > ##0.0##E0 > > > > So the scientific and engineering tests are only checking 4 and 6 > > decimal digits. But the plain formats are checking up to 17 digits due > > to the large exponent of the random numbers. This does not seem very > > fair. > > > > Fix 1: A simple fix would be to reduce the exponent range for random > > numbers so that the plain numbers should always have at least 4 digits > > after the decimal point. > > > > However, changing the test like this would reduce its power. It would > > not have alerted us to this failure. > > > > Fix 2: An alternative would be to change the test assertion to perform > > the current check, and if it fails perform a test that all digits from > > Double.toString are present in the formatted string in the same order > > (i.e. the class has used the entire output from Double.toString and so > > is at its limit). This is non-trivial as the double value is tested in > > all the installed locales which may change the digits and other > > formatting characters. The assertion would have to create a reverse > > parsing method based on the locale. > > > > Note: I did try using the DecimalFormat used to specify formatting the > > expected string to parse the string with DecimalFormat.parse. It > > worked on the original root locale but it failed on locale "he"; this > > may be another bug in DoubleFormat since the locale can parse its own > > output: > > > > jshell> var df = new java.text.DecimalFormat("0.0##", > > java.text.DecimalFormatSymbols.getInstance(Locale.forLanguageTag("he"))) > > jshell> x > > x ==> -9.3540047119774374E17 > > jshell> df.parse(df.format(x)).doubleValue() > > $32 ==> -9.3540047119774374E17 > > > > Fix 3: Implement fix 1, plus add a test for full length precision > > using only the root locale. This can quickly test the output is at the > > limit by checking the string creates the original input double using > > Double.parseDouble, i.e. the DoubleFormat has captured the entire > > information required to uniquely recreate the original double. > > > > Alex > > > > --- > > > > Here is the failure: > > > > [ERROR] Failures: > > [ERROR] > > DoubleFormatTest.testPlain_localeFormatComparison:491->checkLocalizedFormats:127->checkLocalizedFormat:121->assertLocalized > > FormatsAreEqual:40 Unexpected output for locale [und] and double value > > -9.354004711977437E17 ==> expected: <-935400471197743740.0> bu > > t was: <-935400471197743700.0> > > > > The Double.toString value is the minimum string required to uniquely > > identify the double value; this is the string required to round trip > > the number to a string and back via Double.parseDouble(String). > > However, it may not be the closest decimal representation of the > > value. > > > > In the failed test the double value is -9.354004711977437E17 which can > > be interpreted differently. I obtained the value that failed using > > Double.doubleToLongBits. This has different representations depending > > on how it is output: > > > > jshell > > | Welcome to JShell -- Version 17.0.6 > > | For an introduction type: /help intro > > > > jshell> double x = Double.longBitsToDouble(-4347673023486037259L) > > x ==> -9.3540047119774374E17 > > > > jshell> new BigDecimal(x) > > $2 ==> -935400471197743744 > > > > // Using the DecimalFormat from the failing test: > > jshell> new java.text.DecimalFormat("0.0##", > > java.text.DecimalFormatSymbols.getInstance(Locale.ROOT)).format(x) > > $3 ==> "-935400471197743740.0" > > > > jshell> Math.nextUp(x) > > $4 ==> -9.3540047119774362E17 > > > > jshell> Math.nextDown(x) > > $5 ==> -9.3540047119774387E17 > > > > > > Note the difference for JDK 21: > > > > jshell > > | Welcome to JShell -- Version 21 > > | For an introduction type: /help intro > > > > jshell> double x = Double.longBitsToDouble(-4347673023486037259L) > > x ==> -9.354004711977437E17 > > > > jshell> Double.toString(x) > > $3 ==> "-9.354004711977437E17" > > > > jshell> Math.nextUp(x) > > $4 ==> -9.354004711977436E17 > > > > jshell> Math.nextDown(x) > > $5 ==> -9.354004711977439E17 > > > > > > > > > > > > > > On Sun, 15 Oct 2023 at 18:57, Gary Gregory <garydgreg...@gmail.com> > wrote: > > > > > > Hi all, > > > > > > Can someone explain the 1 test failure on Java 21 in DoubleFornatTest? > A > > > one digit difference? > > > > > > Gary > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >