Re: [Text] Java 21 failure with a double

Gary Gregory Fri, 20 Oct 2023 03:36:47 -0700

Thank you Alex!

The PR is now merged.


Gary

On Fri, Oct 20, 2023, 4:09 AM Alex Herbert <alex.d.herb...@gmail.com> wrote:

> I've implemented a fix for this in PR 467 [1]. Builds now pass on JDK 21.
>
> This does not change the DoubleFormat class. It adds javadoc to state
> that the formatting is dependent on Double.toString.
>
> The tests have been updated to use a different range for the random
> doubles so the tests across formats should be testing roughly the same
> precision. They should not test the full precision output (i.e. 17
> digits).
>
> Note that this change identified another mismatch between the
> reference DecimalFormat and the DoubleFormat when a zero trails the
> decimal point. The DecimalFormat class can omit this trailing zero and
> the decimal point:
>
> new DecimalFormat("##0.0##E0").format(1.1299999e-4) => 113E-6    (not
> 113.0E-6)
>
> The inelegant solution I used was to remove .0 from the DecimalFormat
> output and recheck the string is a match. This allows the above case
> to pass. There are no other cases in the 1000 random numbers that hit
> this mismatch case.
>
> A new test has been added for full precision output. The test requires
> that the output format can be parsed back to the same number, i.e. no
> loss of precision occurs when formatting. This seemed simpler than
> writing custom code to check against the digits output from
> Double.toString to ensure all digits are present.
>
> Alex
>
> [1] https://github.com/apache/commons-text/pull/467
>
> On Mon, 16 Oct 2023 at 18:37, Alex Herbert <alex.d.herb...@gmail.com>
> wrote:
> >
> > TLDR; The Double.toString(double) value is different for
> > -9.354004711977437E17 on JDK 21 and earlier JDKs:
> >
> > JDK 21: -9.354004711977437E17
> > JDK 17: -9.3540047119774374E17
> >
> > The DoubleFormat class is built upon Double.toString. So the test
> > fails due to this change.
> >
> > ---
> >
> > More details:
> >
> > On JDK 21 Double.toString is dropping the last digit (a 4) as it is
> > not required to uniquely represent the double from the next double up
> > or down. Note the javadoc for toString states: "This decimal is
> > (almost always) the shortest one that rounds to m according to the
> > round to nearest rounding policy of IEEE 754 floating-point
> > arithmetic." So this is not the closest decimal representation of the
> > double, just the closest required for rounding to the double.
> >
> > I do not think this is a bug. But we should document that the
> > DoubleFormat class is built upon the Double.toString representation of
> > a double. This string may not be the closest decimal representation of
> > the double. Thus final digit errors can be observed when using the
> > entire character output of Double.toString compared to other
> > formatting classes such as java.lang.DecimalFormat or the numerical
> > string representation provided by java.lang.BigDecimal (see examples
> > below).
> >
> > Q. How to fix the test?
> >
> > The DoubleFormatTest is using 1000 random double values with a base-2
> > exponent of -100 to 100. Then testing against a plain text format of
> > 0.00##. The exponent of 9.35e17 is 59. It is so large there are no
> > digits after the decimal point. So the test is checking if
> > DoubleFormat is accurate to 17 decimal digits of precision, which it
> > is not in this case. The other tests using the random numbers test the
> > formats:
> >
> > #,##0.0##
> > 0.0##E0
> > ##0.0##E0
> >
> > So the scientific and engineering tests are only checking 4 and 6
> > decimal digits. But the plain formats are checking up to 17 digits due
> > to the large exponent of the random numbers. This does not seem very
> > fair.
> >
> > Fix 1: A simple fix would be to reduce the exponent range for random
> > numbers so that the plain numbers should always have at least 4 digits
> > after the decimal point.
> >
> > However, changing the test like this would reduce its power. It would
> > not have alerted us to this failure.
> >
> > Fix 2: An alternative would be to change the test assertion to perform
> > the current check, and if it fails perform a test that all digits from
> > Double.toString are present in the formatted string in the same order
> > (i.e. the class has used the entire output from Double.toString and so
> > is at its limit). This is non-trivial as the double value is tested in
> > all the installed locales which may change the digits and other
> > formatting characters. The assertion would have to create a reverse
> > parsing method based on the locale.
> >
> > Note: I did try using the DecimalFormat used to specify formatting the
> > expected string to parse the string with DecimalFormat.parse. It
> > worked on the original root locale but it failed on locale "he"; this
> > may be another bug in DoubleFormat since the locale can parse its own
> > output:
> >
> > jshell> var df = new java.text.DecimalFormat("0.0##",
> > java.text.DecimalFormatSymbols.getInstance(Locale.forLanguageTag("he")))
> > jshell> x
> > x ==> -9.3540047119774374E17
> > jshell> df.parse(df.format(x)).doubleValue()
> > $32 ==> -9.3540047119774374E17
> >
> > Fix 3: Implement fix 1, plus add a test for full length precision
> > using only the root locale. This can quickly test the output is at the
> > limit by checking the string creates the original input double using
> > Double.parseDouble, i.e. the DoubleFormat has captured the entire
> > information required to uniquely recreate the original double.
> >
> > Alex
> >
> > ---
> >
> > Here is the failure:
> >
> > [ERROR] Failures:
> > [ERROR]
>  
> DoubleFormatTest.testPlain_localeFormatComparison:491->checkLocalizedFormats:127->checkLocalizedFormat:121->assertLocalized
> > FormatsAreEqual:40 Unexpected output for locale [und] and double value
> > -9.354004711977437E17 ==> expected: <-935400471197743740.0> bu
> > t was: <-935400471197743700.0>
> >
> > The Double.toString value is the minimum string required to uniquely
> > identify the double value; this is the string required to round trip
> > the number to a string and back via Double.parseDouble(String).
> > However, it may not be the closest decimal representation of the
> > value.
> >
> > In the failed test the double value is -9.354004711977437E17 which can
> > be interpreted differently. I obtained the value that failed using
> > Double.doubleToLongBits. This has different representations depending
> > on how it is output:
> >
> > jshell
> > |  Welcome to JShell -- Version 17.0.6
> > |  For an introduction type: /help intro
> >
> > jshell> double x = Double.longBitsToDouble(-4347673023486037259L)
> > x ==> -9.3540047119774374E17
> >
> > jshell> new BigDecimal(x)
> > $2 ==> -935400471197743744
> >
> > // Using the DecimalFormat from the failing test:
> > jshell> new java.text.DecimalFormat("0.0##",
> > java.text.DecimalFormatSymbols.getInstance(Locale.ROOT)).format(x)
> > $3 ==> "-935400471197743740.0"
> >
> > jshell> Math.nextUp(x)
> > $4 ==> -9.3540047119774362E17
> >
> > jshell> Math.nextDown(x)
> > $5 ==> -9.3540047119774387E17
> >
> >
> > Note the difference for JDK 21:
> >
> > jshell
> > |  Welcome to JShell -- Version 21
> > |  For an introduction type: /help intro
> >
> > jshell> double x = Double.longBitsToDouble(-4347673023486037259L)
> > x ==> -9.354004711977437E17
> >
> > jshell> Double.toString(x)
> > $3 ==> "-9.354004711977437E17"
> >
> > jshell> Math.nextUp(x)
> > $4 ==> -9.354004711977436E17
> >
> > jshell> Math.nextDown(x)
> > $5 ==> -9.354004711977439E17
> >
> >
> >
> >
> >
> >
> > On Sun, 15 Oct 2023 at 18:57, Gary Gregory <garydgreg...@gmail.com>
> wrote:
> > >
> > > Hi all,
> > >
> > > Can someone explain the 1 test failure on Java 21 in DoubleFornatTest?
> A
> > > one digit difference?
> > >
> > > Gary
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

Re: [Text] Java 21 failure with a double

Reply via email to