On Sat, Nov 14, 2015 at 4:35 PM, James Almer <jamr...@gmail.com> wrote: > On 11/14/2015 6:30 PM, Hendrik Leppkes wrote: >> On Sat, Nov 14, 2015 at 10:27 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >> wrote: >>> On Sat, Nov 14, 2015 at 4:03 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>> wrote: >>>> On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.lepp...@gmail.com> >>>> wrote: >>>>> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>> wrote: >>>>>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>>> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>>>> wrote: >>>>>>> >>>>>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje >>>>>>>>>> <rsbul...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde < >>>>>>>> gajja...@mit.edu> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje < >>>>>>>> rsbul...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Hi Ganesh, >>>>>>>>>>>>> On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" < >>>>>>>>>> gajjanaga...@gmail.com> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> The rationale for this function is reflected in the documentation >>>>>>>> for >>>>>>>>>>>>>> it, and is copied here: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Clip a double value into the long long amin-amax range. >>>>>>>>>>>>>> This function is needed because conversion of floating point to >>>>>>>>>> integers >>>>>>>>>>>>> when >>>>>>>>>>>>>> it does not fit in the integer's representation does not >>>>>>>> necessarily >>>>>>>>>>>>> saturate >>>>>>>>>>>>>> correctly (usually converted to a cvttsd2si on x86) which >>>>>>>> saturates >>>>>>>>>>>>> numbers >>>>>>>>>>>>>>> INT64_MAX to INT64_MIN. The standard marks such conversions as >>>>>>>>>>>> undefined >>>>>>>>>>>>>> behavior, allowing this sort of mathematically bogus conversions. >>>>>>>>>> This >>>>>>>>>>>>> provides >>>>>>>>>>>>>> a safe alternative that is slower obviously but assures safety >>>>>>>>>>>>>> and >>>>>>>>>>>> better >>>>>>>>>>>>>> mathematical behavior. >>>>>>>>>>>>>> API: >>>>>>>>>>>>>> @param a value to clip >>>>>>>>>>>>>> @param amin minimum value of the clip range >>>>>>>>>>>>>> @param amax maximum value of the clip range >>>>>>>>>>>>>> @return clipped value >>>>>>>>>>>>>> >>>>>>>>>>>>>> Note that a priori if one can guarantee from the calling side >>>>>>>>>>>>>> that >>>>>>>>>> the >>>>>>>>>>>>>> double is in range, it is safe to simply do an explicit/implicit >>>>>>>>>> cast, >>>>>>>>>>>>>> and that will be far faster. However, otherwise this function >>>>>>>> should >>>>>>>>>> be >>>>>>>>>>>>>> used. >>>>>>>>>>>>>> >>>>>>>>>>>>>> avutil minor version is bumped. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Reviewed-by: Ronald S. Bultje <rsbul...@gmail.com> >>>>>>>>>>>>>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> >>>>>>>>>>>>>> --- >>>>>>>>>>>>>> libavutil/common.h | 30 ++++++++++++++++++++++++++++++ >>>>>>>>>>>>>> libavutil/version.h | 2 +- >>>>>>>>>>>>>> 2 files changed, 31 insertions(+), 1 deletion(-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/libavutil/common.h b/libavutil/common.h >>>>>>>>>>>>>> index 6f0f582..f4687ab 100644 >>>>>>>>>>>>>> --- a/libavutil/common.h >>>>>>>>>>>>>> +++ b/libavutil/common.h >>>>>>>>>>>>>> @@ -298,6 +298,33 @@ static av_always_inline av_const double >>>>>>>>>>>>> av_clipd_c(double a, double amin, double >>>>>>>>>>>>>> else return a; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> +/** >>>>>>>>>>>>>> + * Clip and convert a double value into the long long amin-amax >>>>>>>>>> range. >>>>>>>>>>>>>> + * This function is needed because conversion of floating point >>>>>>>> to >>>>>>>>>>>>> integers when >>>>>>>>>>>>>> + * it does not fit in the integer's representation does not >>>>>>>>>> necessarily >>>>>>>>>>>>> saturate >>>>>>>>>>>>>> + * correctly (usually converted to a cvttsd2si on x86) which >>>>>>>>>> saturates >>>>>>>>>>>>> numbers >>>>>>>>>>>>>> + * > INT64_MAX to INT64_MIN. The standard marks such conversions >>>>>>>> as >>>>>>>>>>>>> undefined >>>>>>>>>>>>>> + * behavior, allowing this sort of mathematically bogus >>>>>>>> conversions. >>>>>>>>>>>>> This provides >>>>>>>>>>>>>> + * a safe alternative that is slower obviously but assures >>>>>>>>>>>>>> safety >>>>>>>>>> and >>>>>>>>>>>>> better >>>>>>>>>>>>>> + * mathematical behavior. >>>>>>>>>>>>>> + * @param a value to clip >>>>>>>>>>>>>> + * @param amin minimum value of the clip range >>>>>>>>>>>>>> + * @param amax maximum value of the clip range >>>>>>>>>>>>>> + * @return clipped value >>>>>>>>>>>>>> + */ >>>>>>>>>>>>>> +static av_always_inline av_const int64_t av_rint64_clip_c(double >>>>>>>> a, >>>>>>>>>>>>> int64_t amin, int64_t amax) >>>>>>>>>>>>>> +{ >>>>>>>>>>>>>> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) && >>>>>>>>>> ASSERT_LEVEL >>>>>>>>>>>>>> = 2 >>>>>>>>>>>>>> + if (amin > amax) abort(); >>>>>>>>>>>>>> +#endif >>>>>>>>>>>>>> + // INT64_MAX+1,INT64_MIN are exactly representable as IEEE >>>>>>>>>> doubles >>>>>>>>>>>>>> + if (a >= 9223372036854775808.0 || llrint(a) >= amax) >>>>>>>>>>>>>> + return amax; >>>>>>>>>>>>>> + if (a <= -9223372036854775808.0 || llrint(a) <= amin) >>>>>>>>>>>>>> + return amin; >>>>>>>>>>>>> >>>>>>>>>>>>> Doesn't this allow negative overflows in the max check? I think >>>>>>>>>>>>> you >>>>>>>>>> need >>>>>>>>>>>>> both overflow checks before the min/max checks. Try the next float >>>>>>>> val >>>>>>>>>>>>> smaller than int64_min as input with a small amax, eg 0. I bet it >>>>>>>>>>>> returns 0 >>>>>>>>>>>>> instead of amin. >>>>>>>>>>>> >>>>>>>>>>>> They are needed. As you and others can clearly see, I am quite bad >>>>>>>>>>>> with this stuff. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hm, so, getting back to my computer, I wanted to test this, and I >>>>>>>>>>> have >>>>>>>>>> this >>>>>>>>>>> problem: llrint() works correctly for me for the "undefined" cases, >>>>>>>> i.e., >>>>>>>>>>> it already does what you're trying to fix in av_rint64_clip_c. >>>>>>>>>>> >>>>>>>>>>> llrint(-10223372056756029440.000000) returns -9223372036854775808 >>>>>>>>>>> llrint(10223372056756029440.000000) returns 9223372036854775807 >>>>>>>>>>> >>>>>>>>>>> So, how do I reproduce that llrint() overflows? >>>>>>>>>> >>>>>>>>>> The link I gave originally >>>>>>>>>> >>>>>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer >>>>>>>>>> gives an illustration. Maybe the weird behavior happens only on >>>>>>>>>> 9223372036854775808.0. This happens because INT64_MAX+1 is not >>>>>>>>>> representable in long long, and hence signed overflow occurs yielding >>>>>>>>>> INT64_MIN (of course undefined). Here is a minimal test program: >>>>>>>>>> #include <stdio.h> >>>>>>>>>> #include <math.h> >>>>>>>>>> >>>>>>>>>> int main(void) { >>>>>>>>>> printf("%lld\n", llrint(9223372036854775808.0)); >>>>>>>>>> return 0; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>> >>>>>>>>> 9223372036854775807 >>>>>>>>> >>>>>>>>> Seems apple's libc got one thing right :) >>>>>>>> >>>>>>>> I personally am not that charitable, looking more carefully at your >>>>>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference: >>>>>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c. >>>>>>>> As usual, Apple dumps many implementations of llrint and it is unclear >>>>>>>> which is actually being used on OS X at the moment (see e.g other >>>>>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c), >>>>>>>> but I digress. >>>>>>>> >>>>>>>> They essentially all put special case code like the patch above. Thus >>>>>>>> their function is inherently slower than the conformant GNU libm >>>>>>>> implementation. A client may very well want a branch free llrint for >>>>>>>> speed. Apple offers no performance choice here, forcing a fast llrint >>>>>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected >>>>>>>> by this slowdown. >>>>>>> >>>>>>> >>>>>>> I think FFmpeg should consider using Apple's version as a x86 >>>>>>> implementation for av_rint64_clip :) >>>>>> >>>>>> I don't agree with this: it is a far less readable implementation with >>>>>> many more lines of code, and worse yet only handles the llrint aspect >>>>>> and not the clipping. Regardless, belongs to a separate patch/thread. >>>>>> Pushed. Thanks all for reviews. >>>>>> >>>>> >>>>> This change broke building on VS2012, llrint is apprently not available >>>>> there. >>>>> Note that this is a public header, so our compat headers ala >>>>> avutil/libm.h cannot be included there. >>>> >>>> Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g >>>> for GNU_C, use llrint, else use a slower implementation on the lines >>>> of Apple's. >>>> Any cleaner solutions? >>> >>> Possibly better idea: use floor(f + 0.5) as a hack (c89) on things >>> lacking llrint (via a HAVE_LLRINT check). This won't result in >>> identical output past a sufficiently large power of 2, but is still a >>> safe API. It is also clearer and smaller. Idea inspired by >>> avcodec/mpegaudio_tablegen.h (where this hack may be removed). >>> >> >> This code is in a public header, public headers don't have access to >> config.h, so no HAVE_* checks. >> You could make it non-inline, then you have all the freedom in the world. >> >> - Hendrik > > As others have pointed out on IRC, why is this function needed to begin with? > It's not used anywhere currently.
Not ATM, but see the ffserver_config patch under review with recent comments by Michael. There are many more such instances where this API is needed, IIRC one in cmdutils.c. > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel