On Tuesday, March 12, 2024 at 02:24:47 PM PDT, Marton Balint <c...@passwd.hu> wrote: > On Tue, 12 Mar 2024, Allan Cady via ffmpeg-devel wrote: >> On Monday, March 11, 2024 at 12:11:45 PM PDT, Marton Balint <c...@passwd.hu> >> wrote: >>> On Mon, 11 Mar 2024, Andreas Rheinhardt wrote: >>> Allan Cady via ffmpeg-devel: >>>> From: "Allan Cady" <allanc...@yahoo.com> >>>> >>>> I propose changing the format to "%.6f", which will >>>> give microsecond precision for all timestamps, regardless of >>>> offset. Trailing zeros can be trimmed from the fraction, without >>>> losing precision. If the length of the fixed-precision formatted >>>> timestamp exceeds the length of the allocated buffer >>>> (AV_TS_MAX_STRING_SIZE, currently 32, less one for the >>>> terminating null), then we can fall back to scientific notation, though >>>> this seems almost certain to never occur, because 32 characters would >>>> allow a maximum timestamp value of (32 - 1 - 6 - 1) = 24 characters. >>>> By my calculation, 10^24 seconds is about six orders of magnitude >>>> greater than the age of the universe. >>>> >>>> The fix proposed here follows the following logic: >>>> >>>> 1) Try formatting the number of seconds using "%.6f". This will >>>> always result in a string with six decimal digits in the fraction, >>>> possibly including trailing zeros. (e.g. "897234.73200"). >>>> >>>> 2) Check if that string would overflow the buffer. If it would, then >>>> format it using scientific notation ("%.8g"). >>>> >>>> 3) If the original fixed-point format fits, then trim any trailing >>>> zeros and decimal point, and return that result. >>>> >>>> Making this change broke two fate tests, filter-metadata-scdet, >>>> and filter-metadata-silencedetect. To correct this, I've modified >>>> tests/ref/fate/filter-metadata-scdet and >>>> tests/ref/fate/filter-metadata-silencedetect to match the >>>> new output. >>>> >>>> Signed-off-by: Allan Cady <allanc...@yahoo.com> >>>> --- >>>> libavutil/timestamp.h | 53 +++++++++++++++++++- >>>> tests/ref/fate/filter-metadata-scdet | 12 ++--- >>>> tests/ref/fate/filter-metadata-silencedetect | 2 +- >>>> 3 files changed, 58 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/libavutil/timestamp.h b/libavutil/timestamp.h >>>> index 2b37781eba..2f04f9bb2b 100644 >>>> --- a/libavutil/timestamp.h >>>> +++ b/libavutil/timestamp.h >>>> @@ -25,6 +25,7 @@ >>>> #define AVUTIL_TIMESTAMP_H >>>> >>>> #include "avutil.h" >>>> +#include <stdbool.h> >>>> >>>> #if defined(__cplusplus) && !defined(__STDC_FORMAT_MACROS) && >>>>!defined(PRId64) >>>> #error missing -D__STDC_FORMAT_MACROS / #define __STDC_FORMAT_MACROS >>>> @@ -53,6 +54,32 @@ static inline char *av_ts_make_string(char *buf, >>>> int64_t ts) >>>> */ >>>> #define av_ts2str(ts) >>>>av_ts_make_string((char[AV_TS_MAX_STRING_SIZE]){0}, ts) >>>> >>>> +/** >>>> + * Strip trailing zeros and decimal point from a string. Performed >>>> + * in-place on input buffer. For local use only by av_ts_make_time_string. >>>> + * >>>> + * e.g.: >>>> + * "752.378000" -> "752.378" >>>> + * "4.0" -> "4" >>>> + * "97300" -> "97300" >>>> + */ >>>> +static inline void av_ts_strip_trailing_zeros_and_decimal_point(char >>>> *str) { >>>> + if (strchr(str, '.')) >>>> + { >>>> + int i; >>>> + for (i = strlen(str) - 1; i >= 0 && str[i] == '0'; i--) { >>>> + str[i] = '\0'; >>>> + } >>>> + >>>> + // Remove decimal point if it's the last character >>>> + if (i >= 0 && str[i] == '.') { >>>> + str[i] = '\0'; >>>> + } >>>> + >>>> + // String was modified in place; no need for return value. >>>> + } >>>> +} >>>> + >>>> /** >>>> * Fill the provided buffer with a string containing a timestamp time >>>> * representation. >>>> @@ -65,8 +92,30 @@ static inline char *av_ts_make_string(char *buf, >>>> int64_t ts) >>>> static inline char *av_ts_make_time_string(char *buf, int64_t ts, >>>> const AVRational *tb) >>>> { >>>> - if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, >>>> "NOPTS"); >>>> - else snprintf(buf, AV_TS_MAX_STRING_SIZE, >>>> "%.6g", av_q2d(*tb) * ts); >>>> + if (ts == AV_NOPTS_VALUE) >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS"); >>>> + } >>>> + else >>>> + { >>>> + const int max_fraction_digits = 6; >>>> + >>>> + // Convert 64-bit timestamp to double, using rational timebase >>>> + double seconds = av_q2d(*tb) * ts; >>>> + >>>> + int length = snprintf(NULL, 0, "%.*f", max_fraction_digits, >>>> seconds); >>>> + if (length <= AV_TS_MAX_STRING_SIZE - 1) >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.*f", >>>> max_fraction_digits, seconds); >>>> + av_ts_strip_trailing_zeros_and_decimal_point(buf); >>>> + } >>>> + else >>>> + { >>>> + snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.8g", seconds); >>>> + } >>>> + >>>> + } >>>> + >>>> return buf; >>>> } >>>> >>> >>> 1. What makes you believe that all users want the new format and that it >>> does not cause undesired behaviour for some (maybe a lot) of them? The >>> number of characters written by the earlier code stayed pretty constant >>> even when the times became big (in this case, it just printed 8 chars if >>> ts>=0), yet your code will really make use of the whole buffer. >>> Granted, we could tell our users that they have no right to complain >>> about this, given that we always had a "right" to use the full buffer, >>> but I consider this a violation of the principle of least surprise. Why >>> don't you just change silencedetect or add another function? >>> >>> I suggested to change av_ts_make_time_string() because this >>> problem affect all detection filters, not only silencedetect. So fixing >>> it in silencedetect only is wrong. >>> >>> The API did not make any promises about the format, and as long as it >>> provides less chars than AV_TS_MAX_STRING_SIZE, and the string is >>> parseable string representation of a floating point number, it is not a >>> break. >>> >>> Sure, it changes behaviour, but that is not unheard of if there is a good >>> reason, and in this case, I believe there is. And I think it is more >>> likely that some code is broken right now because the function >>> loses precision or returns scientific notation for relatively small >>> timestamps. Actually, it _was_ a suprise for me, so IMHO the element of >>> least suprise is that precision will not fade away for reasonably small >>> timestamps. >>> >>> 2. For very small timestamps (< 10^-4), the new code will print a lot of >>> useless leading zeros (after the decimal point). In fact, it can be so >>> many that the new code has less precision than the old code, despite >>> using the fill buffer. >>> 2. This is way too much code for an inline function. >>> 3. Anyway, your placement of {} on their own lines does not match the >>> project coding style. >>> >>> I am OK with any implementation which keeps at least millisecond >>> precision for timestamps < 10^10. You are right, it should be as simple as >>> possible. >>> >> >> Milliseconds would be fine with me. >> >> Marton, do you have any other comments on my implementation? I have >> from Andreas that I should remove the inlines, and move the curly >> braces to match coding style. I also plan on removing the >> #include <stdbool.h>, which I added at some point but is no longer >> needed. And I would be happy to change from %.6f to %.3f.
> TBH I'd rather keep the precision as is. If you want to convert the > function to non-inlined, then you will have to move the implementation > from the header to an existing or new C file and unconditionally compile > it to avutil... Maybe we should give it another go keeping it inlineable > by simplifying it a little. It's been years (decades) since I had to think about inlines -- and I don't remember there being any fixed criteria for them. Is it a language rule that subroutines in header files have to be inlined? Or just a policy? I really don't know the language. Anyway, I'm all for simple. I was fairly uncomfortable with that much complexity, especially in a header file. I don't really like having that extra sub to strip trailing zeros, but I couldn't come up with a better way to do it. >> >> If that sounds good, I'll submit another patch sometime tomorrow. >> >> The only other thing I had thought of is to wonder if I should add some >> additional testing for the new formatting. I did a fair amount of testing >> on my own, but it would probably be good to have at least some of that >> in FATE. I had thought about maybe generating an audio file with just >> a tone and several silence intervals. > > One thing to notice is that you will not need to use the scientific > representation at all, because maximum value this function prints is the > product of an INT32 and an INT64, and that is 96 bit, which is at most 29 > chars. Adding the optional sign and the decimal point, that is still only > 31. So we can be sure that by using %.6f, the first character of > the decimal point is going to be present in the output. I had done some similar calculation and came to a similar conclusion. Which is great, > because that means we only have to > - do a single snprintf("%.6f") > - calculate last char position by subtracting 1 from the minimum of > AV_TS_MAX_STRING_SIZE-1 and the result of the snprintf() call. > - decrement string length while last char is '0' to remove trailing 0s > - decrement string length while last char is non-digit to remove decimal > point (which can be a multiple chars for some locales). > - update last+1 char to \0. > Ot is it still too complex to keep it inline? I'll give your suggestion a spin tomorrow. Thanks. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".