On Sat, 23 Jul 2022 20:03:39 GMT, Raffaello Giulietti <d...@openjdk.org> wrote:
>> src/java.base/share/classes/java/lang/Float.java line 1122:
>>
>>> 1120:         // binary16 (when rounding is done, could still round up)
>>> 1121:         int exp = Math.getExponent(f);
>>> 1122:         assert -25 <= exp && exp <= 15;
>>
>> I think that both the subnormal and the normal case can be unified if we pay
>> closer attention to the positions of the lsb, round and sticky bits in
>> subnormals.
>>
>>
>>     // Clamp exp to the [-15, 15] range while retaining the
>>     // difference between the original value and -15 on clamping.
>>     // This is the excess shift value in addition to 13.
>>     int expdelta = Math.max(0, -15 - exp);
>>     exp += expdelta;
>>     assert -15 <= exp && exp <= 15;
>>
>>     int f_signif_bits = doppel & 0x007f_ffff; // original significand
>>     // Significand bits as if using rounding to zero (truncation).
>>     short signif_bits = (short)(f_signif_bits >> (13 + expdelta));
>>
>>     // For round to nearest even, determining whether or
>>     // not to round up (in magnitude) is a function of the
>>     // least significant bit (LSB), the next bit position
>>     // (the round position), and the sticky bit (whether
>>     // there are any nonzero bits in the exact result to
>>     // the right of the round digit). An increment occurs
>>     // in three cases:
>>     //
>>     //   LSB   Round   Sticky
>>     //    0      1       1
>>     //    1      1       0
>>     //    1      1       1
>>     // See "Computer Arithmetic Algorithms," Koren, Table 4.9
>>
>>     int lsb    = f_signif_bits & (1 << 13 + expdelta);
>>     int round  = f_signif_bits & (1 << 12 + expdelta);
>>     int sticky = f_signif_bits & ((1 << 12 + expdelta) - 1);
>>
>>     if (round != 0 && ((lsb | sticky) != 0 )) {
>>         signif_bits++;
>>     }
>>
>>     // No bits set in significand beyond the *first* exponent
>>     // bit, not just the significand; quantity is added to the
>>     // exponent to implement a carry out from rounding the
>>     // significand.
>>     assert (0xf800 & signif_bits) == 0x0;
>>
>>     return (short)(sign_bit | ( ((exp + 15) << 10) + signif_bits ) );
>
> I didn't test this variant, will do tomorrow when also reviewing the tests
> themselves.

The correct variant below passes the tests.

    // For binary16 subnormals, beside forcing exp to -15,
    // retain the difference expdelta = E_min - exp.
    // This is the excess shift value, in addition to 13, to be used
    // in the computations below.
    // Further the (hidden) msb with value 1 in f must be involved as well.
    int expdelta = 0;
    int msb = 0x0000_0000;
    if (exp < -14) {
        expdelta = -14 - exp;
        exp = -15;
        msb = 0x0080_0000;
    }
    int f_signif_bits = doppel & 0x007f_ffff | msb;

    // Significand bits as if using rounding to zero (truncation).
    short signif_bits = (short)(f_signif_bits >> (13 + expdelta));

    // For round to nearest even, determining whether or
    // not to round up (in magnitude) is a function of the
    // least significant bit (LSB), the next bit position
    // (the round position), and the sticky bit (whether
    // there are any nonzero bits in the exact result to
    // the right of the round digit). An increment occurs
    // in three cases:
    //
    //   LSB   Round   Sticky
    //    0      1       1
    //    1      1       0
    //    1      1       1
    // See "Computer Arithmetic Algorithms," Koren, Table 4.9

    int lsb    = f_signif_bits & (1 << 13 + expdelta);
    int round  = f_signif_bits & (1 << 12 + expdelta);
    int sticky = f_signif_bits & ((1 << 12 + expdelta) - 1);

    if (round != 0 && ((lsb | sticky) != 0 )) {
        signif_bits++;
    }

    // No bits set in significand beyond the *first* exponent
    // bit, not just the significand; quantity is added to the
    // exponent to implement a carry out from rounding the
    // significand.
    assert (0xf800 & signif_bits) == 0x0;

    return (short)(sign_bit | ( ((exp + 15) << 10) + signif_bits ) );

-------------

PR: https://git.openjdk.org/jdk/pull/9422
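
For anyone who wants to try the corrected variant outside the JDK sources, here is a minimal standalone sketch. The class name, the method name finiteToBinary16, the range check and the way doppel and sign_bit are derived are my assumptions, since the snippet above does not show the enclosing method. NaN, infinities, zeros and anything with an exponent outside [-25, 15] are rejected rather than handled.

```java
// Hypothetical harness; class and method names are illustrative only.
final class Binary16RoundingSketch {

    // Converts a finite float with unbiased exponent in [-25, 15] to
    // binary16 bits using round to nearest even.
    static short finiteToBinary16(float f) {
        // Assumption: doppel is the raw bit pattern, sign_bit the sign moved to bit 15.
        int doppel = Float.floatToRawIntBits(f);
        short sign_bit = (short) ((doppel & 0x8000_0000) >> 16);

        int exp = Math.getExponent(f);
        if (exp < -25 || exp > 15) {
            // Assumption: the caller deals with NaN, infinity, zero, overflow, underflow.
            throw new IllegalArgumentException("not covered by this sketch: " + f);
        }

        // binary16 subnormals: force exp to -15, remember the extra shift,
        // and make the hidden leading 1 of f explicit.
        int expdelta = 0;
        int msb = 0x0000_0000;
        if (exp < -14) {
            expdelta = -14 - exp;
            exp = -15;
            msb = 0x0080_0000;
        }
        int f_signif_bits = doppel & 0x007f_ffff | msb;

        // Truncated significand, then round to nearest even.
        short signif_bits = (short) (f_signif_bits >> (13 + expdelta));
        int lsb    = f_signif_bits & (1 << 13 + expdelta);
        int round  = f_signif_bits & (1 << 12 + expdelta);
        int sticky = f_signif_bits & ((1 << 12 + expdelta) - 1);
        if (round != 0 && ((lsb | sticky) != 0)) {
            signif_bits++;
        }

        // A carry out of the significand is absorbed by the exponent field.
        return (short) (sign_bit | (((exp + 15) << 10) + signif_bits));
    }

    static String hex(short bits) {
        return String.format("0x%04x", bits & 0xffff);
    }

    public static void main(String[] args) {
        float[] samples = {
            1.0f,          // normal
            0x1.0p-14f,    // smallest binary16 normal
            0x1.0p-24f,    // smallest binary16 subnormal
            0x1.8p-24f,    // tie between 1 and 2 ulps, goes to even
            0x1.ffep-15f   // rounds up into the smallest normal
        };
        for (float f : samples) {
            System.out.println(Float.toHexString(f) + " -> " + hex(finiteToBinary16(f)));
        }
    }
}
```

The last sample is the interesting one: the increment carries out of the 10-bit significand and lands in the exponent field, which is why the final expression adds signif_bits instead of OR-ing it in.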