On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.lepp...@gmail.com> wrote: > On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote: >> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbul...@gmail.com> wrote: >>> Hi, >>> >>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>> wrote: >>> >>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>> wrote: >>>> > Hi, >>>> > >>>> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>> > wrote: >>>> > >>>> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>> >> wrote: >>>> >> > Hi, >>>> >> > >>>> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde < >>>> gajja...@mit.edu> >>>> >> > wrote: >>>> >> > >>>> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje < >>>> rsbul...@gmail.com> >>>> >> >> wrote: >>>> >> >> > Hi Ganesh, >>>> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" < >>>> >> gajjanaga...@gmail.com> >>>> >> >> > wrote: >>>> >> >> >> >>>> >> >> >> The rationale for this function is reflected in the documentation >>>> for >>>> >> >> >> it, and is copied here: >>>> >> >> >> >>>> >> >> >> Clip a double value into the long long amin-amax range. >>>> >> >> >> This function is needed because conversion of floating point to >>>> >> integers >>>> >> >> > when >>>> >> >> >> it does not fit in the integer's representation does not >>>> necessarily >>>> >> >> > saturate >>>> >> >> >> correctly (usually converted to a cvttsd2si on x86) which >>>> saturates >>>> >> >> > numbers >>>> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as >>>> >> >> undefined >>>> >> >> >> behavior, allowing this sort of mathematically bogus conversions. >>>> >> This >>>> >> >> > provides >>>> >> >> >> a safe alternative that is slower obviously but assures safety and >>>> >> >> better >>>> >> >> >> mathematical behavior. >>>> >> >> >> API: >>>> >> >> >> @param a value to clip >>>> >> >> >> @param amin minimum value of the clip range >>>> >> >> >> @param amax maximum value of the clip range >>>> >> >> >> @return clipped value >>>> >> >> >> >>>> >> >> >> Note that a priori if one can guarantee from the calling side that >>>> >> the >>>> >> >> >> double is in range, it is safe to simply do an explicit/implicit >>>> >> cast, >>>> >> >> >> and that will be far faster. However, otherwise this function >>>> should >>>> >> be >>>> >> >> >> used. >>>> >> >> >> >>>> >> >> >> avutil minor version is bumped. >>>> >> >> >> >>>> >> >> >> Reviewed-by: Ronald S. Bultje <rsbul...@gmail.com> >>>> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> >>>> >> >> >> --- >>>> >> >> >> libavutil/common.h | 30 ++++++++++++++++++++++++++++++ >>>> >> >> >> libavutil/version.h | 2 +- >>>> >> >> >> 2 files changed, 31 insertions(+), 1 deletion(-) >>>> >> >> >> >>>> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h >>>> >> >> >> index 6f0f582..f4687ab 100644 >>>> >> >> >> --- a/libavutil/common.h >>>> >> >> >> +++ b/libavutil/common.h >>>> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double >>>> >> >> > av_clipd_c(double a, double amin, double >>>> >> >> >> else return a; >>>> >> >> >> } >>>> >> >> >> >>>> >> >> >> +/** >>>> >> >> >> + * Clip and convert a double value into the long long amin-amax >>>> >> range. >>>> >> >> >> + * This function is needed because conversion of floating point >>>> to >>>> >> >> > integers when >>>> >> >> >> + * it does not fit in the integer's representation does not >>>> >> necessarily >>>> >> >> > saturate >>>> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which >>>> >> saturates >>>> >> >> > numbers >>>> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions >>>> as >>>> >> >> > undefined >>>> >> >> >> + * behavior, allowing this sort of mathematically bogus >>>> conversions. >>>> >> >> > This provides >>>> >> >> >> + * a safe alternative that is slower obviously but assures safety >>>> >> and >>>> >> >> > better >>>> >> >> >> + * mathematical behavior. >>>> >> >> >> + * @param a value to clip >>>> >> >> >> + * @param amin minimum value of the clip range >>>> >> >> >> + * @param amax maximum value of the clip range >>>> >> >> >> + * @return clipped value >>>> >> >> >> + */ >>>> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double >>>> a, >>>> >> >> > int64_t amin, int64_t amax) >>>> >> >> >> +{ >>>> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) && >>>> >> ASSERT_LEVEL >>>> >> >> >>= 2 >>>> >> >> >> + if (amin > amax) abort(); >>>> >> >> >> +#endif >>>> >> >> >> + // INT64_MAX+1,INT64_MIN are exactly representable as IEEE >>>> >> doubles >>>> >> >> >> + if (a >= 9223372036854775808.0 || llrint(a) >= amax) >>>> >> >> >> + return amax; >>>> >> >> >> + if (a <= -9223372036854775808.0 || llrint(a) <= amin) >>>> >> >> >> + return amin; >>>> >> >> > >>>> >> >> > Doesn't this allow negative overflows in the max check? I think you >>>> >> need >>>> >> >> > both overflow checks before the min/max checks. Try the next float >>>> val >>>> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it >>>> >> >> returns 0 >>>> >> >> > instead of amin. >>>> >> >> >>>> >> >> They are needed. As you and others can clearly see, I am quite bad >>>> >> >> with this stuff. >>>> >> > >>>> >> > >>>> >> > Hm, so, getting back to my computer, I wanted to test this, and I have >>>> >> this >>>> >> > problem: llrint() works correctly for me for the "undefined" cases, >>>> i.e., >>>> >> > it already does what you're trying to fix in av_rint64_clip_c. >>>> >> > >>>> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808 >>>> >> > llrint(10223372056756029440.000000) returns 9223372036854775807 >>>> >> > >>>> >> > So, how do I reproduce that llrint() overflows? >>>> >> >>>> >> The link I gave originally >>>> >> >>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer >>>> >> gives an illustration. Maybe the weird behavior happens only on >>>> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not >>>> >> representable in long long, and hence signed overflow occurs yielding >>>> >> INT64_MIN (of course undefined). Here is a minimal test program: >>>> >> #include <stdio.h> >>>> >> #include <math.h> >>>> >> >>>> >> int main(void) { >>>> >> printf("%lld\n", llrint(9223372036854775808.0)); >>>> >> return 0; >>>> >> } >>>> >> >>>> > >>>> > 9223372036854775807 >>>> > >>>> > Seems apple's libc got one thing right :) >>>> >>>> I personally am not that charitable, looking more carefully at your >>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference: >>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c. >>>> As usual, Apple dumps many implementations of llrint and it is unclear >>>> which is actually being used on OS X at the moment (see e.g other >>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c), >>>> but I digress. >>>> >>>> They essentially all put special case code like the patch above. Thus >>>> their function is inherently slower than the conformant GNU libm >>>> implementation. A client may very well want a branch free llrint for >>>> speed. Apple offers no performance choice here, forcing a fast llrint >>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected >>>> by this slowdown. >>> >>> >>> I think FFmpeg should consider using Apple's version as a x86 >>> implementation for av_rint64_clip :) >> >> I don't agree with this: it is a far less readable implementation with >> many more lines of code, and worse yet only handles the llrint aspect >> and not the clipping. Regardless, belongs to a separate patch/thread. >> Pushed. Thanks all for reviews. >> > > This change broke building on VS2012, llrint is apprently not available there. > Note that this is a public header, so our compat headers ala > avutil/libm.h cannot be included there.
Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g for GNU_C, use llrint, else use a slower implementation on the lines of Apple's. Any cleaner solutions? > > - Hendrik > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel