Roland Scheidegger <srol...@vmware.com> writes: > Am 21.07.2014 17:53, schrieb Richard Sandiford: >> lp_build_iround has a fallback case that tries to emulate a round-to-nearest >> float->int conversion by adding 0.5 and using a truncating fptosi. For odd >> numbers in the range [2^23, 2^24], which are already integral, this has >> the effect of adding 1 to the integer, since the result of adding 0.5 is >> exactly half way between two representable values and the tie goes to even. >> This includes the important special case of (float)0xffffff, which is the >> maximum depth in a z24s8 format. I.e. calling iround on (float)0xffffff >> gives 0x1000000 rather than 0xffffff when the fallback is used. >> >> This patch only uses the x+0.5 trick if the ulp of the input is fractional. >> This still doesn't give ties-to-even semantics, but that doesn't seem to >> matter much in practice. >> >> Signed-off-by: Richard Sandiford <rsand...@linux.vnet.ibm.com> >> --- >> src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 >> ++++++++++++++++++----------- >> 1 file changed, 27 insertions(+), 16 deletions(-) >> >> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c >> b/src/gallium/auxiliary/gallivm/lp_bld_arit.c >> index 9ef8628..82ddb5a 100644 >> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c >> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c >> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld, >> res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST); >> } >> else { >> - LLVMValueRef half; >> + struct lp_type int_type = lp_int_type(type); >> + LLVMTypeRef vec_type = bld->vec_type; >> + unsigned mantissa = lp_mantissa(type); >> + unsigned expbits = type.width - mantissa - 1; >> + unsigned bias = (1 << (expbits - 1)) - 1; >> + LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type, >> + (unsigned long long)1 << (type.width - 1)); >> + /* Smallest value with an integral ulp */ >> + LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type, >> + (bias + mantissa) << mantissa); >> + LLVMValueRef half, sign, absa, fractulp; >> >> half = lp_build_const_vec(bld->gallivm, type, 0.5); >> >> - if (type.sign) { >> - LLVMTypeRef vec_type = bld->vec_type; >> - LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type, >> - (unsigned long long)1 << (type.width - >> 1)); >> - LLVMValueRef sign; >> - >> - /* get sign bit */ >> - sign = LLVMBuildBitCast(builder, a, int_vec_type, ""); >> - sign = LLVMBuildAnd(builder, sign, mask, ""); >> - >> - /* sign * 0.5 */ >> - half = LLVMBuildBitCast(builder, half, int_vec_type, ""); >> - half = LLVMBuildOr(builder, sign, half, ""); >> - half = LLVMBuildBitCast(builder, half, vec_type, ""); >> - } >> + assert(type.sign); > You can't assert here. It is quite legal to have floats without sign > type - used to skip things like here when we know the input is positive. > There might not be all that many callers using such build contexts, > though some quick glance identified at least two > (lp_build_coord_repeat_npot_linear_int, > lp_build_clamped_float_to_unsigned_norm). Though I would have thought > the latter is where the bug you're seeing is coming from...
Ah, sorry, I hadn't realised that sort of modification could happen. I just looked at the types that lp_bld_type.h created. > Also, I'm not a fan of making this more complex in general. > lp_build_itrunc is most often used for converting texture coords, or (lp_build_iround rather than lp_build_itrunc?) > other stuff like srgb conversion, and none of these callers care about > this, so maybe should distinguish the cases. As an extra boolean argument? Yeah, I can try that, although I'll probably need help deciding what the right argument is for some cases. Or were you thinking of something else? >> + >> + /* get sign bit */ >> + sign = LLVMBuildBitCast(builder, a, int_vec_type, ""); >> + sign = LLVMBuildAnd(builder, sign, mask, ""); >> + >> + /* get "ulp is fractional" */ >> + absa = lp_build_abs(bld, a); >> + absa = LLVMBuildBitCast(builder, absa, int_vec_type, ""); >> + fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, >> absa, abslimit); >> + >> + /* (sign * 0.5) & intulp */ >> + half = LLVMBuildBitCast(builder, half, int_vec_type, ""); >> + half = LLVMBuildOr(builder, sign, half, ""); >> + half = LLVMBuildAnd(builder, half, fractulp, ""); >> + half = LLVMBuildBitCast(builder, half, vec_type, ""); >> >> res = LLVMBuildFAdd(builder, a, half, ""); >> } >> > > Though if the caller is lp_build_round, it actually seems not too > complicated - the abs() from both will be the same, and the comparison > could be made (nearly) the same too (lp_build_round and friends are > using 2^24, though my comments are wrong there and it could just use > 2^23 instead). > Or would something like adding "largest float smaller than 0.5" first, > then add the remainder to 0.5 later work too? I guess though if you have > type.sign that's not just another "fadd" but also another "or". Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because what we have now isn't ties-to-even anyway) then we could just add "largest float smaller than 0.5" instead of 0.5. I think adding the remainder would only make a difference for an input of 0.5 or -0.5. > Oh and this is not "util" code as the commit message mentions. Doh. Thanks, Richard _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev