On Sat, Nov 14, 2015 at 4:03 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote: > On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.lepp...@gmail.com> wrote: >> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote: >>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbul...@gmail.com> >>> wrote: >>>> Hi, >>>> >>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>> wrote: >>>> >>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>> wrote: >>>>> > Hi, >>>>> > >>>>> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>> > wrote: >>>>> > >>>>> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>> >> wrote: >>>>> >> > Hi, >>>>> >> > >>>>> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde < >>>>> gajja...@mit.edu> >>>>> >> > wrote: >>>>> >> > >>>>> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje < >>>>> rsbul...@gmail.com> >>>>> >> >> wrote: >>>>> >> >> > Hi Ganesh, >>>>> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" < >>>>> >> gajjanaga...@gmail.com> >>>>> >> >> > wrote: >>>>> >> >> >> >>>>> >> >> >> The rationale for this function is reflected in the documentation >>>>> for >>>>> >> >> >> it, and is copied here: >>>>> >> >> >> >>>>> >> >> >> Clip a double value into the long long amin-amax range. >>>>> >> >> >> This function is needed because conversion of floating point to >>>>> >> integers >>>>> >> >> > when >>>>> >> >> >> it does not fit in the integer's representation does not >>>>> necessarily >>>>> >> >> > saturate >>>>> >> >> >> correctly (usually converted to a cvttsd2si on x86) which >>>>> saturates >>>>> >> >> > numbers >>>>> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as >>>>> >> >> undefined >>>>> >> >> >> behavior, allowing this sort of mathematically bogus conversions. >>>>> >> This >>>>> >> >> > provides >>>>> >> >> >> a safe alternative that is slower obviously but assures safety >>>>> >> >> >> and >>>>> >> >> better >>>>> >> >> >> mathematical behavior. >>>>> >> >> >> API: >>>>> >> >> >> @param a value to clip >>>>> >> >> >> @param amin minimum value of the clip range >>>>> >> >> >> @param amax maximum value of the clip range >>>>> >> >> >> @return clipped value >>>>> >> >> >> >>>>> >> >> >> Note that a priori if one can guarantee from the calling side >>>>> >> >> >> that >>>>> >> the >>>>> >> >> >> double is in range, it is safe to simply do an explicit/implicit >>>>> >> cast, >>>>> >> >> >> and that will be far faster. However, otherwise this function >>>>> should >>>>> >> be >>>>> >> >> >> used. >>>>> >> >> >> >>>>> >> >> >> avutil minor version is bumped. >>>>> >> >> >> >>>>> >> >> >> Reviewed-by: Ronald S. Bultje <rsbul...@gmail.com> >>>>> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> >>>>> >> >> >> --- >>>>> >> >> >> libavutil/common.h | 30 ++++++++++++++++++++++++++++++ >>>>> >> >> >> libavutil/version.h | 2 +- >>>>> >> >> >> 2 files changed, 31 insertions(+), 1 deletion(-) >>>>> >> >> >> >>>>> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h >>>>> >> >> >> index 6f0f582..f4687ab 100644 >>>>> >> >> >> --- a/libavutil/common.h >>>>> >> >> >> +++ b/libavutil/common.h >>>>> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double >>>>> >> >> > av_clipd_c(double a, double amin, double >>>>> >> >> >> else return a; >>>>> >> >> >> } >>>>> >> >> >> >>>>> >> >> >> +/** >>>>> >> >> >> + * Clip and convert a double value into the long long amin-amax >>>>> >> range. >>>>> >> >> >> + * This function is needed because conversion of floating point >>>>> to >>>>> >> >> > integers when >>>>> >> >> >> + * it does not fit in the integer's representation does not >>>>> >> necessarily >>>>> >> >> > saturate >>>>> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which >>>>> >> saturates >>>>> >> >> > numbers >>>>> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions >>>>> as >>>>> >> >> > undefined >>>>> >> >> >> + * behavior, allowing this sort of mathematically bogus >>>>> conversions. >>>>> >> >> > This provides >>>>> >> >> >> + * a safe alternative that is slower obviously but assures >>>>> >> >> >> safety >>>>> >> and >>>>> >> >> > better >>>>> >> >> >> + * mathematical behavior. >>>>> >> >> >> + * @param a value to clip >>>>> >> >> >> + * @param amin minimum value of the clip range >>>>> >> >> >> + * @param amax maximum value of the clip range >>>>> >> >> >> + * @return clipped value >>>>> >> >> >> + */ >>>>> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double >>>>> a, >>>>> >> >> > int64_t amin, int64_t amax) >>>>> >> >> >> +{ >>>>> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) && >>>>> >> ASSERT_LEVEL >>>>> >> >> >>= 2 >>>>> >> >> >> + if (amin > amax) abort(); >>>>> >> >> >> +#endif >>>>> >> >> >> + // INT64_MAX+1,INT64_MIN are exactly representable as IEEE >>>>> >> doubles >>>>> >> >> >> + if (a >= 9223372036854775808.0 || llrint(a) >= amax) >>>>> >> >> >> + return amax; >>>>> >> >> >> + if (a <= -9223372036854775808.0 || llrint(a) <= amin) >>>>> >> >> >> + return amin; >>>>> >> >> > >>>>> >> >> > Doesn't this allow negative overflows in the max check? I think >>>>> >> >> > you >>>>> >> need >>>>> >> >> > both overflow checks before the min/max checks. Try the next float >>>>> val >>>>> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it >>>>> >> >> returns 0 >>>>> >> >> > instead of amin. >>>>> >> >> >>>>> >> >> They are needed. As you and others can clearly see, I am quite bad >>>>> >> >> with this stuff. >>>>> >> > >>>>> >> > >>>>> >> > Hm, so, getting back to my computer, I wanted to test this, and I >>>>> >> > have >>>>> >> this >>>>> >> > problem: llrint() works correctly for me for the "undefined" cases, >>>>> i.e., >>>>> >> > it already does what you're trying to fix in av_rint64_clip_c. >>>>> >> > >>>>> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808 >>>>> >> > llrint(10223372056756029440.000000) returns 9223372036854775807 >>>>> >> > >>>>> >> > So, how do I reproduce that llrint() overflows? >>>>> >> >>>>> >> The link I gave originally >>>>> >> >>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer >>>>> >> gives an illustration. Maybe the weird behavior happens only on >>>>> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not >>>>> >> representable in long long, and hence signed overflow occurs yielding >>>>> >> INT64_MIN (of course undefined). Here is a minimal test program: >>>>> >> #include <stdio.h> >>>>> >> #include <math.h> >>>>> >> >>>>> >> int main(void) { >>>>> >> printf("%lld\n", llrint(9223372036854775808.0)); >>>>> >> return 0; >>>>> >> } >>>>> >> >>>>> > >>>>> > 9223372036854775807 >>>>> > >>>>> > Seems apple's libc got one thing right :) >>>>> >>>>> I personally am not that charitable, looking more carefully at your >>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference: >>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c. >>>>> As usual, Apple dumps many implementations of llrint and it is unclear >>>>> which is actually being used on OS X at the moment (see e.g other >>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c), >>>>> but I digress. >>>>> >>>>> They essentially all put special case code like the patch above. Thus >>>>> their function is inherently slower than the conformant GNU libm >>>>> implementation. A client may very well want a branch free llrint for >>>>> speed. Apple offers no performance choice here, forcing a fast llrint >>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected >>>>> by this slowdown. >>>> >>>> >>>> I think FFmpeg should consider using Apple's version as a x86 >>>> implementation for av_rint64_clip :) >>> >>> I don't agree with this: it is a far less readable implementation with >>> many more lines of code, and worse yet only handles the llrint aspect >>> and not the clipping. Regardless, belongs to a separate patch/thread. >>> Pushed. Thanks all for reviews. >>> >> >> This change broke building on VS2012, llrint is apprently not available >> there. >> Note that this is a public header, so our compat headers ala >> avutil/libm.h cannot be included there. > > Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g > for GNU_C, use llrint, else use a slower implementation on the lines > of Apple's. > Any cleaner solutions?
Possibly better idea: use floor(f + 0.5) as a hack (c89) on things lacking llrint (via a HAVE_LLRINT check). This won't result in identical output past a sufficiently large power of 2, but is still a safe API. It is also clearer and smaller. Idea inspired by avcodec/mpegaudio_tablegen.h (where this hack may be removed). > >> >> - Hendrik >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel