I've implemented a fix for this in PR 467 [1]. Builds now pass on JDK 21.

This does not change the DoubleFormat class. It adds javadoc to state
that the formatting depends on Double.toString.

The tests have been updated to use a different range for the random
doubles so that the tests across formats exercise roughly the same
precision. They should not test the full precision output (i.e. 17
digits).
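For illustration, restricting the base-2 exponent of the random doubles
can be sketched along these lines (this is not the actual test code; the
class, method and seed are mine):

```java
import java.util.SplittableRandom;

/**
 * Sketch of generating random doubles with a restricted base-2 exponent,
 * so that plain-format tests keep digits after the decimal point.
 * Illustrative only; not the code used in the actual test.
 */
public class RandomDoubleSketch {

    /** Random double with a base-2 exponent in [minExp, maxExp]. */
    static double randomDouble(SplittableRandom rng, int minExp, int maxExp) {
        // Significand in [1, 2), scaled by a random power of two.
        double significand = 1.0 + rng.nextDouble();
        int exp = rng.nextInt(minExp, maxExp + 1);
        return Math.scalb(significand, exp);
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom(42L);
        for (int i = 0; i < 3; i++) {
            System.out.println(randomDouble(rng, -10, 10));
        }
    }
}
```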

Note that this change identified another mismatch between the
reference DecimalFormat and the DoubleFormat when a zero trails the
decimal point. The DecimalFormat class can omit this trailing zero and
the decimal point:

new DecimalFormat("##0.0##E0").format(1.1299999e-4) => 113E-6    (not 113.0E-6)

The inelegant solution I used was to remove the .0 from the DecimalFormat
output and recheck that the strings match. This allows the above case
to pass. No other cases among the 1000 random numbers hit this
mismatch.
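The recheck amounts to something like the following (a sketch; the method
names are mine and the PR may implement the normalization differently):

```java
/**
 * Sketch of the trailing-zero recheck: if the two formatted strings
 * differ, strip a redundant ".0" before the exponent marker and compare
 * again. Illustrative only; not necessarily how the PR implements it.
 */
public class TrailingZeroSketch {

    /** Strip a redundant ".0" before the exponent, e.g. "113.0E-6" -> "113E-6". */
    static String stripTrailingZero(String s) {
        return s.replace(".0E", "E");
    }

    /** Equality that tolerates an omitted trailing ".0" on either side. */
    static boolean lenientEquals(String expected, String actual) {
        return expected.equals(actual)
            || stripTrailingZero(expected).equals(stripTrailingZero(actual));
    }

    public static void main(String[] args) {
        // DecimalFormat("##0.0##E0") can omit the ".0": 113E-6 vs 113.0E-6
        System.out.println(lenientEquals("113.0E-6", "113E-6")); // true
    }
}
```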

A new test has been added for full precision output. The test requires
that the formatted output can be parsed back to the same number, i.e. no
precision is lost when formatting. This seemed simpler than writing
custom code to check that all the digits from Double.toString are
present in the output.
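The round-trip requirement can be sketched like this, using
Double.toString as a stand-in for a full-precision DoubleFormat (the
actual test formats with DoubleFormat; the class and method names here
are mine):

```java
/**
 * Sketch of the full-precision round-trip check: the formatted string
 * must capture enough information that parsing it recreates the exact
 * original double. Double.toString stands in for DoubleFormat here.
 */
public class RoundTripSketch {

    static boolean roundTrips(double value) {
        String formatted = Double.toString(value); // stand-in formatter
        // Compare bit patterns so -0.0 and NaN are handled exactly.
        return Double.doubleToLongBits(Double.parseDouble(formatted))
            == Double.doubleToLongBits(value);
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(-9.354004711977437E17)); // true
        System.out.println(roundTrips(Math.PI));               // true
    }
}
```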

Alex

[1] https://github.com/apache/commons-text/pull/467

On Mon, 16 Oct 2023 at 18:37, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
> TL;DR: The Double.toString(double) value is different for
> -9.354004711977437E17 on JDK 21 and earlier JDKs:
>
> JDK 21: -9.354004711977437E17
> JDK 17: -9.3540047119774374E17
>
> The DoubleFormat class is built upon Double.toString. So the test
> fails due to this change.
>
> ---
>
> More details:
>
> On JDK 21 Double.toString is dropping the last digit (a 4) as it is
> not required to uniquely represent the double from the next double up
> or down. Note the javadoc for toString states: "This decimal is
> (almost always) the shortest one that rounds to m according to the
> round to nearest rounding policy of IEEE 754 floating-point
> arithmetic." So this is not necessarily the closest decimal
> representation of the double, just the shortest one that rounds
> to the double.
>
> I do not think this is a bug. But we should document that the
> DoubleFormat class is built upon the Double.toString representation of
> a double. This string may not be the closest decimal representation of
> the double. Thus final digit errors can be observed when using the
> entire character output of Double.toString compared to other
> formatting classes such as java.lang.DecimalFormat or the numerical
> string representation provided by java.lang.BigDecimal (see examples
> below).
>
> Q. How to fix the test?
>
> The DoubleFormatTest is using 1000 random double values with a base-2
> exponent of -100 to 100. Then testing against a plain text format of
> 0.00##. The base-2 exponent of 9.35e17 is 59. It is so large that there are no
> digits after the decimal point. So the test is checking if
> DoubleFormat is accurate to 17 decimal digits of precision, which it
> is not in this case. The other tests using the random numbers test the
> formats:
>
> #,##0.0##
> 0.0##E0
> ##0.0##E0
>
> So the scientific and engineering tests are only checking 4 and 6
> significant digits. But the plain formats are checking up to 17 digits
> due to the large exponent of the random numbers. This does not seem
> very fair.
>
> Fix 1: A simple fix would be to reduce the exponent range for random
> numbers so that the plain numbers should always have at least 4 digits
> after the decimal point.
>
> However, changing the test like this would reduce its power. It would
> not have alerted us to this failure.
>
> Fix 2: An alternative would be to change the test assertion to perform
> the current check, and if it fails perform a test that all digits from
> Double.toString are present in the formatted string in the same order
> (i.e. the class has used the entire output from Double.toString and so
> is at its limit). This is non-trivial as the double value is tested in
> all the installed locales, which may change the digits and other
> formatting characters. The assertion would have to create a reverse
> parsing method based on the locale.
>
> Note: I did try using the DecimalFormat that specifies the expected
> format to parse the string back with DecimalFormat.parse. It worked in
> the root locale but failed for locale "he"; this may be another bug in
> DoubleFormat, since the locale can parse its own output:
>
> jshell> var df = new java.text.DecimalFormat("0.0##",
> java.text.DecimalFormatSymbols.getInstance(Locale.forLanguageTag("he")))
> jshell> double x = Double.longBitsToDouble(-4347673023486037259L)
> x ==> -9.3540047119774374E17
> jshell> df.parse(df.format(x)).doubleValue()
> $32 ==> -9.3540047119774374E17
>
> Fix 3: Implement fix 1, plus add a test for full length precision
> using only the root locale. This can quickly test the output is at the
> limit by checking the string creates the original input double using
> Double.parseDouble, i.e. the DoubleFormat has captured the entire
> information required to uniquely recreate the original double.
>
> Alex
>
> ---
>
> Here is the failure:
>
> [ERROR] Failures:
> [ERROR]   DoubleFormatTest.testPlain_localeFormatComparison:491
> ->checkLocalizedFormats:127->checkLocalizedFormat:121->assertLocalizedFormatsAreEqual:40
> Unexpected output for locale [und] and double value -9.354004711977437E17
> ==> expected: <-935400471197743740.0> but was: <-935400471197743700.0>
>
> The Double.toString value is the minimum string required to uniquely
> identify the double value; this is the string required to round trip
> the number to a string and back via Double.parseDouble(String).
> However, it may not be the closest decimal representation of the
> value.
>
> In the failed test the double value is -9.354004711977437E17 which can
> be interpreted differently. I obtained the value that failed using
> Double.doubleToLongBits. This has different representations depending
> on how it is output:
>
> jshell
> |  Welcome to JShell -- Version 17.0.6
> |  For an introduction type: /help intro
>
> jshell> double x = Double.longBitsToDouble(-4347673023486037259L)
> x ==> -9.3540047119774374E17
>
> jshell> new BigDecimal(x)
> $2 ==> -935400471197743744
>
> // Using the DecimalFormat from the failing test:
> jshell> new java.text.DecimalFormat("0.0##",
> java.text.DecimalFormatSymbols.getInstance(Locale.ROOT)).format(x)
> $3 ==> "-935400471197743740.0"
>
> jshell> Math.nextUp(x)
> $4 ==> -9.3540047119774362E17
>
> jshell> Math.nextDown(x)
> $5 ==> -9.3540047119774387E17
>
>
> Note the difference for JDK 21:
>
> jshell
> |  Welcome to JShell -- Version 21
> |  For an introduction type: /help intro
>
> jshell> double x = Double.longBitsToDouble(-4347673023486037259L)
> x ==> -9.354004711977437E17
>
> jshell> Double.toString(x)
> $3 ==> "-9.354004711977437E17"
>
> jshell> Math.nextUp(x)
> $4 ==> -9.354004711977436E17
>
> jshell> Math.nextDown(x)
> $5 ==> -9.354004711977439E17
>
>
> On Sun, 15 Oct 2023 at 18:57, Gary Gregory <garydgreg...@gmail.com> wrote:
> >
> > Hi all,
> >
> > Can someone explain the 1 test failure on Java 21 in DoubleFormatTest? A
> > one digit difference?
> >
> > Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
