Re: [FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

James Almer Sat, 14 Nov 2015 13:36:02 -0800

On 11/14/2015 6:30 PM, Hendrik Leppkes wrote:
> On Sat, Nov 14, 2015 at 10:27 PM, Ganesh Ajjanagadde <[email protected]> wrote:
>> On Sat, Nov 14, 2015 at 4:03 PM, Ganesh Ajjanagadde <[email protected]> wrote:
>>> On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <[email protected]> 
>>> wrote:
>>>> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <[email protected]> 
>>>> wrote:
>>>>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <[email protected]> 
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <[email protected]>
>>>>>>> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde <
>>>>>>> [email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje <
>>>>>>> [email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi Ganesh,
>>>>>>>>>>>> On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" <
>>>>>>>>> [email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The rationale for this function is reflected in the documentation
>>>>>>> for
>>>>>>>>>>>>> it, and is copied here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Clip a double value into the long long amin-amax range.
>>>>>>>>>>>>> This function is needed because conversion of floating point to
>>>>>>>>> integers
>>>>>>>>>>>> when
>>>>>>>>>>>>> it does not fit in the integer's representation does not
>>>>>>> necessarily
>>>>>>>>>>>> saturate
>>>>>>>>>>>>> correctly (usually converted to a cvttsd2si on x86) which
>>>>>>> saturates
>>>>>>>>>>>> numbers
>>>>>>>>>>>>>> INT64_MAX to INT64_MIN. The standard marks such conversions as
>>>>>>>>>>> undefined
>>>>>>>>>>>>> behavior, allowing this sort of mathematically bogus conversions.
>>>>>>>>> This
>>>>>>>>>>>> provides
>>>>>>>>>>>>> a safe alternative that is slower obviously but assures safety and
>>>>>>>>>>> better
>>>>>>>>>>>>> mathematical behavior.
>>>>>>>>>>>>> API:
>>>>>>>>>>>>> @param a value to clip
>>>>>>>>>>>>> @param amin minimum value of the clip range
>>>>>>>>>>>>> @param amax maximum value of the clip range
>>>>>>>>>>>>> @return clipped value
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note that a priori if one can guarantee from the calling side that
>>>>>>>>> the
>>>>>>>>>>>>> double is in range, it is safe to simply do an explicit/implicit
>>>>>>>>> cast,
>>>>>>>>>>>>> and that will be far faster. However, otherwise this function
>>>>>>> should
>>>>>>>>> be
>>>>>>>>>>>>> used.
>>>>>>>>>>>>>
>>>>>>>>>>>>> avutil minor version is bumped.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Reviewed-by: Ronald S. Bultje <[email protected]>
>>>>>>>>>>>>> Signed-off-by: Ganesh Ajjanagadde <[email protected]>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  libavutil/common.h  | 30 ++++++++++++++++++++++++++++++
>>>>>>>>>>>>>  libavutil/version.h |  2 +-
>>>>>>>>>>>>>  2 files changed, 31 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/libavutil/common.h b/libavutil/common.h
>>>>>>>>>>>>> index 6f0f582..f4687ab 100644
>>>>>>>>>>>>> --- a/libavutil/common.h
>>>>>>>>>>>>> +++ b/libavutil/common.h
>>>>>>>>>>>>> @@ -298,6 +298,33 @@ static av_always_inline av_const double
>>>>>>>>>>>> av_clipd_c(double a, double amin, double
>>>>>>>>>>>>>      else               return a;
>>>>>>>>>>>>>  }
>>>>>>>>>>>>>
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * Clip and convert a double value into the long long amin-amax
>>>>>>>>> range.
>>>>>>>>>>>>> + * This function is needed because conversion of floating point
>>>>>>> to
>>>>>>>>>>>> integers when
>>>>>>>>>>>>> + * it does not fit in the integer's representation does not
>>>>>>>>> necessarily
>>>>>>>>>>>> saturate
>>>>>>>>>>>>> + * correctly (usually converted to a cvttsd2si on x86) which
>>>>>>>>> saturates
>>>>>>>>>>>> numbers
>>>>>>>>>>>>> + * > INT64_MAX to INT64_MIN. The standard marks such conversions
>>>>>>> as
>>>>>>>>>>>> undefined
>>>>>>>>>>>>> + * behavior, allowing this sort of mathematically bogus
>>>>>>> conversions.
>>>>>>>>>>>> This provides
>>>>>>>>>>>>> + * a safe alternative that is slower obviously but assures safety
>>>>>>>>> and
>>>>>>>>>>>> better
>>>>>>>>>>>>> + * mathematical behavior.
>>>>>>>>>>>>> + * @param a value to clip
>>>>>>>>>>>>> + * @param amin minimum value of the clip range
>>>>>>>>>>>>> + * @param amax maximum value of the clip range
>>>>>>>>>>>>> + * @return clipped value
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +static av_always_inline av_const int64_t av_rint64_clip_c(double
>>>>>>> a,
>>>>>>>>>>>> int64_t amin, int64_t amax)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) &&
>>>>>>>>> ASSERT_LEVEL
>>>>>>>>>>>>> = 2
>>>>>>>>>>>>> +    if (amin > amax) abort();
>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>> +    // INT64_MAX+1,INT64_MIN are exactly representable as IEEE
>>>>>>>>> doubles
>>>>>>>>>>>>> +    if (a >=  9223372036854775808.0 || llrint(a) >= amax)
>>>>>>>>>>>>> +        return amax;
>>>>>>>>>>>>> +    if (a <= -9223372036854775808.0 || llrint(a) <= amin)
>>>>>>>>>>>>> +        return amin;
>>>>>>>>>>>>
>>>>>>>>>>>> Doesn't this allow negative overflows in the max check? I think you
>>>>>>>>> need
>>>>>>>>>>>> both overflow checks before the min/max checks. Try the next float
>>>>>>> val
>>>>>>>>>>>> smaller than int64_min as input with a small amax, eg 0. I bet it
>>>>>>>>>>> returns 0
>>>>>>>>>>>> instead of amin.
>>>>>>>>>>>
>>>>>>>>>>> They are needed. As you and others can clearly see, I am quite bad
>>>>>>>>>>> with this stuff.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hm, so, getting back to my computer, I wanted to test this, and I 
>>>>>>>>>> have
>>>>>>>>> this
>>>>>>>>>> problem: llrint() works correctly for me for the "undefined" cases,
>>>>>>> i.e.,
>>>>>>>>>> it already does what you're trying to fix in av_rint64_clip_c.
>>>>>>>>>>
>>>>>>>>>> llrint(-10223372056756029440.000000) returns -9223372036854775808
>>>>>>>>>> llrint(10223372056756029440.000000) returns 9223372036854775807
>>>>>>>>>>
>>>>>>>>>> So, how do I reproduce that llrint() overflows?
>>>>>>>>>
>>>>>>>>> The link I gave originally
>>>>>>>>>
>>>>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer
>>>>>>>>> gives an illustration. Maybe the weird behavior happens only on
>>>>>>>>> 9223372036854775808.0. This happens because INT64_MAX+1 is not
>>>>>>>>> representable in long long, and hence signed overflow occurs yielding
>>>>>>>>> INT64_MIN (of course undefined). Here is a minimal test program:
>>>>>>>>> #include <stdio.h>
>>>>>>>>> #include <math.h>
>>>>>>>>>
>>>>>>>>> int main(void) {
>>>>>>>>>     printf("%lld\n", llrint(9223372036854775808.0));
>>>>>>>>>     return 0;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>
>>>>>>>> 9223372036854775807
>>>>>>>>
>>>>>>>> Seems apple's libc got one thing right :)
>>>>>>>
>>>>>>> I personally am not that charitable, looking more carefully at your
>>>>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference:
>>>>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c.
>>>>>>> As usual, Apple dumps many implementations of llrint and it is unclear
>>>>>>> which is actually being used on OS X at the moment (see e.g other
>>>>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c),
>>>>>>> but I digress.
>>>>>>>
>>>>>>> They essentially all put special case code like the patch above. Thus
>>>>>>> their function is inherently slower than the conformant GNU libm
>>>>>>> implementation. A client may very well want a branch free llrint for
>>>>>>> speed. Apple offers no performance choice here, forcing a fast llrint
>>>>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected
>>>>>>> by this slowdown.
>>>>>>
>>>>>>
>>>>>> I think FFmpeg should consider using Apple's version as a x86
>>>>>> implementation for av_rint64_clip :)
>>>>>
>>>>> I don't agree with this: it is a far less readable implementation with
>>>>> many more lines of code, and worse yet only handles the llrint aspect
>>>>> and not the clipping. Regardless, belongs to a separate patch/thread.
>>>>> Pushed. Thanks all for reviews.
>>>>>
>>>>
>>>> This change broke building on VS2012, llrint is apprently not available 
>>>> there.
>>>> Note that this is a public header, so our compat headers ala
>>>> avutil/libm.h cannot be included there.
>>>
>>> Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g
>>> for GNU_C, use llrint, else use a slower implementation on the lines
>>> of Apple's.
>>> Any cleaner solutions?
>>
>> Possibly better idea: use floor(f + 0.5) as a hack (c89) on things
>> lacking llrint (via a HAVE_LLRINT check). This won't result in
>> identical output past a sufficiently large power of 2, but is still a
>> safe API. It is also clearer and smaller. Idea inspired by
>> avcodec/mpegaudio_tablegen.h (where this hack may be removed).
>>
> 
> This code is in a public header, public headers don't have access to
> config.h, so no HAVE_* checks.
> You could make it non-inline, then you have all the freedom in the world.
> 
> - Hendrik


As others have pointed out on IRC, why is this function needed to begin with?
It's not used anywhere currently.
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

Reply via email to