On 11/14/2015 6:30 PM, Hendrik Leppkes wrote: > On Sat, Nov 14, 2015 at 10:27 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote: >> On Sat, Nov 14, 2015 at 4:03 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote: >>> On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.lepp...@gmail.com> >>> wrote: >>>> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>> wrote: >>>>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>>> wrote: >>>>>> >>>>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>>>> wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajja...@mit.edu> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbul...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde < >>>>>>> gajja...@mit.edu> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje < >>>>>>> rsbul...@gmail.com> >>>>>>>>>>> wrote: >>>>>>>>>>>> Hi Ganesh, >>>>>>>>>>>> On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" < >>>>>>>>> gajjanaga...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> The rationale for this function is reflected in the documentation >>>>>>> for >>>>>>>>>>>>> it, and is copied here: >>>>>>>>>>>>> >>>>>>>>>>>>> Clip a double value into the long long amin-amax range. >>>>>>>>>>>>> This function is needed because conversion of floating point to >>>>>>>>> integers >>>>>>>>>>>> when >>>>>>>>>>>>> it does not fit in the integer's representation does not >>>>>>> necessarily >>>>>>>>>>>> saturate >>>>>>>>>>>>> correctly (usually converted to a cvttsd2si on x86) which >>>>>>> saturates >>>>>>>>>>>> numbers >>>>>>>>>>>>>> INT64_MAX to INT64_MIN. The standard marks such conversions as >>>>>>>>>>> undefined >>>>>>>>>>>>> behavior, allowing this sort of mathematically bogus conversions. >>>>>>>>> This >>>>>>>>>>>> provides >>>>>>>>>>>>> a safe alternative that is slower obviously but assures safety and >>>>>>>>>>> better >>>>>>>>>>>>> mathematical behavior. >>>>>>>>>>>>> API: >>>>>>>>>>>>> @param a value to clip >>>>>>>>>>>>> @param amin minimum value of the clip range >>>>>>>>>>>>> @param amax maximum value of the clip range >>>>>>>>>>>>> @return clipped value >>>>>>>>>>>>> >>>>>>>>>>>>> Note that a priori if one can guarantee from the calling side that >>>>>>>>> the >>>>>>>>>>>>> double is in range, it is safe to simply do an explicit/implicit >>>>>>>>> cast, >>>>>>>>>>>>> and that will be far faster. However, otherwise this function >>>>>>> should >>>>>>>>> be >>>>>>>>>>>>> used. >>>>>>>>>>>>> >>>>>>>>>>>>> avutil minor version is bumped. >>>>>>>>>>>>> >>>>>>>>>>>>> Reviewed-by: Ronald S. Bultje <rsbul...@gmail.com> >>>>>>>>>>>>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> >>>>>>>>>>>>> --- >>>>>>>>>>>>> libavutil/common.h | 30 ++++++++++++++++++++++++++++++ >>>>>>>>>>>>> libavutil/version.h | 2 +- >>>>>>>>>>>>> 2 files changed, 31 insertions(+), 1 deletion(-) >>>>>>>>>>>>> >>>>>>>>>>>>> diff --git a/libavutil/common.h b/libavutil/common.h >>>>>>>>>>>>> index 6f0f582..f4687ab 100644 >>>>>>>>>>>>> --- a/libavutil/common.h >>>>>>>>>>>>> +++ b/libavutil/common.h >>>>>>>>>>>>> @@ -298,6 +298,33 @@ static av_always_inline av_const double >>>>>>>>>>>> av_clipd_c(double a, double amin, double >>>>>>>>>>>>> else return a; >>>>>>>>>>>>> } >>>>>>>>>>>>> >>>>>>>>>>>>> +/** >>>>>>>>>>>>> + * Clip and convert a double value into the long long amin-amax >>>>>>>>> range. >>>>>>>>>>>>> + * This function is needed because conversion of floating point >>>>>>> to >>>>>>>>>>>> integers when >>>>>>>>>>>>> + * it does not fit in the integer's representation does not >>>>>>>>> necessarily >>>>>>>>>>>> saturate >>>>>>>>>>>>> + * correctly (usually converted to a cvttsd2si on x86) which >>>>>>>>> saturates >>>>>>>>>>>> numbers >>>>>>>>>>>>> + * > INT64_MAX to INT64_MIN. The standard marks such conversions >>>>>>> as >>>>>>>>>>>> undefined >>>>>>>>>>>>> + * behavior, allowing this sort of mathematically bogus >>>>>>> conversions. >>>>>>>>>>>> This provides >>>>>>>>>>>>> + * a safe alternative that is slower obviously but assures safety >>>>>>>>> and >>>>>>>>>>>> better >>>>>>>>>>>>> + * mathematical behavior. >>>>>>>>>>>>> + * @param a value to clip >>>>>>>>>>>>> + * @param amin minimum value of the clip range >>>>>>>>>>>>> + * @param amax maximum value of the clip range >>>>>>>>>>>>> + * @return clipped value >>>>>>>>>>>>> + */ >>>>>>>>>>>>> +static av_always_inline av_const int64_t av_rint64_clip_c(double >>>>>>> a, >>>>>>>>>>>> int64_t amin, int64_t amax) >>>>>>>>>>>>> +{ >>>>>>>>>>>>> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) && >>>>>>>>> ASSERT_LEVEL >>>>>>>>>>>>> = 2 >>>>>>>>>>>>> + if (amin > amax) abort(); >>>>>>>>>>>>> +#endif >>>>>>>>>>>>> + // INT64_MAX+1,INT64_MIN are exactly representable as IEEE >>>>>>>>> doubles >>>>>>>>>>>>> + if (a >= 9223372036854775808.0 || llrint(a) >= amax) >>>>>>>>>>>>> + return amax; >>>>>>>>>>>>> + if (a <= -9223372036854775808.0 || llrint(a) <= amin) >>>>>>>>>>>>> + return amin; >>>>>>>>>>>> >>>>>>>>>>>> Doesn't this allow negative overflows in the max check? I think you >>>>>>>>> need >>>>>>>>>>>> both overflow checks before the min/max checks. Try the next float >>>>>>> val >>>>>>>>>>>> smaller than int64_min as input with a small amax, eg 0. I bet it >>>>>>>>>>> returns 0 >>>>>>>>>>>> instead of amin. >>>>>>>>>>> >>>>>>>>>>> They are needed. As you and others can clearly see, I am quite bad >>>>>>>>>>> with this stuff. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hm, so, getting back to my computer, I wanted to test this, and I >>>>>>>>>> have >>>>>>>>> this >>>>>>>>>> problem: llrint() works correctly for me for the "undefined" cases, >>>>>>> i.e., >>>>>>>>>> it already does what you're trying to fix in av_rint64_clip_c. >>>>>>>>>> >>>>>>>>>> llrint(-10223372056756029440.000000) returns -9223372036854775808 >>>>>>>>>> llrint(10223372056756029440.000000) returns 9223372036854775807 >>>>>>>>>> >>>>>>>>>> So, how do I reproduce that llrint() overflows? >>>>>>>>> >>>>>>>>> The link I gave originally >>>>>>>>> >>>>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer >>>>>>>>> gives an illustration. Maybe the weird behavior happens only on >>>>>>>>> 9223372036854775808.0. This happens because INT64_MAX+1 is not >>>>>>>>> representable in long long, and hence signed overflow occurs yielding >>>>>>>>> INT64_MIN (of course undefined). Here is a minimal test program: >>>>>>>>> #include <stdio.h> >>>>>>>>> #include <math.h> >>>>>>>>> >>>>>>>>> int main(void) { >>>>>>>>> printf("%lld\n", llrint(9223372036854775808.0)); >>>>>>>>> return 0; >>>>>>>>> } >>>>>>>>> >>>>>>>> >>>>>>>> 9223372036854775807 >>>>>>>> >>>>>>>> Seems apple's libc got one thing right :) >>>>>>> >>>>>>> I personally am not that charitable, looking more carefully at your >>>>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference: >>>>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c. >>>>>>> As usual, Apple dumps many implementations of llrint and it is unclear >>>>>>> which is actually being used on OS X at the moment (see e.g other >>>>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c), >>>>>>> but I digress. >>>>>>> >>>>>>> They essentially all put special case code like the patch above. Thus >>>>>>> their function is inherently slower than the conformant GNU libm >>>>>>> implementation. A client may very well want a branch free llrint for >>>>>>> speed. Apple offers no performance choice here, forcing a fast llrint >>>>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected >>>>>>> by this slowdown. >>>>>> >>>>>> >>>>>> I think FFmpeg should consider using Apple's version as a x86 >>>>>> implementation for av_rint64_clip :) >>>>> >>>>> I don't agree with this: it is a far less readable implementation with >>>>> many more lines of code, and worse yet only handles the llrint aspect >>>>> and not the clipping. Regardless, belongs to a separate patch/thread. >>>>> Pushed. Thanks all for reviews. >>>>> >>>> >>>> This change broke building on VS2012, llrint is apprently not available >>>> there. >>>> Note that this is a public header, so our compat headers ala >>>> avutil/libm.h cannot be included there. >>> >>> Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g >>> for GNU_C, use llrint, else use a slower implementation on the lines >>> of Apple's. >>> Any cleaner solutions? >> >> Possibly better idea: use floor(f + 0.5) as a hack (c89) on things >> lacking llrint (via a HAVE_LLRINT check). This won't result in >> identical output past a sufficiently large power of 2, but is still a >> safe API. It is also clearer and smaller. Idea inspired by >> avcodec/mpegaudio_tablegen.h (where this hack may be removed). >> > > This code is in a public header, public headers don't have access to > config.h, so no HAVE_* checks. > You could make it non-inline, then you have all the freedom in the world. > > - Hendrik
As others have pointed out on IRC, why is this function needed to begin with? It's not used anywhere currently. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel