Re: [FFmpeg-devel] [PATCH] avformat/dv: fix timestamps of audio packets in case of dropped corrupt audio frames

Dave Rice Sun, 01 Nov 2020 13:52:59 -0800


> On Nov 1, 2020, at 3:58 PM, Marton Balint <c...@passwd.hu> wrote:
> 
> 
> 
> On Sun, 1 Nov 2020, Michael Niedermayer wrote:
> 
>> On Sat, Oct 31, 2020 at 05:56:24PM +0100, Marton Balint wrote:
>>> Fixes out of sync timestamps in ticket #8762.
>>> 
>>> Signed-off-by: Marton Balint <c...@passwd.hu>
>>> ---
>>> libavformat/dv.c       | 16 ++--------------
>>> tests/ref/seek/lavf-dv | 18 +++++++++---------
>>> 2 files changed, 11 insertions(+), 23 deletions(-)
>>> 
>>> diff --git a/libavformat/dv.c b/libavformat/dv.c
>>> index 3e0d12c0e3..26a78139f5 100644
>>> --- a/libavformat/dv.c
>>> +++ b/libavformat/dv.c
>>> @@ -49,7 +49,6 @@ struct DVDemuxContext {
>>>     uint8_t           audio_buf[4][8192];
>>>     int               ach;
>>>     int               frames;
>>> -    uint64_t          abytes;
>>> };
>>> 
>>> static inline uint16_t dv_audio_12to16(uint16_t sample)
>>> @@ -258,7 +257,7 @@ static int dv_extract_audio_info(DVDemuxContext *c, const uint8_t *frame)
>>>             c->ast[i] = avformat_new_stream(c->fctx, NULL);
>>>             if (!c->ast[i])
>>>                 break;
>>> -            avpriv_set_pts_info(c->ast[i], 64, 1, 30000);
>>> +            avpriv_set_pts_info(c->ast[i], 64, c->sys->time_base.num, c->sys->time_base.den);
>>>             c->ast[i]->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
>>>             c->ast[i]->codecpar->codec_id   = AV_CODEC_ID_PCM_S16LE;
>>> 
>>> @@ -387,8 +386,7 @@ int avpriv_dv_produce_packet(DVDemuxContext *c, AVPacket *pkt,
>>>     for (i = 0; i < c->ach; i++) {
>>>         c->audio_pkt[i].pos  = pos;
>>>         c->audio_pkt[i].size = size;
>>> -        c->audio_pkt[i].pts  = c->abytes * 30000 * 8 /
>>> -                               c->ast[i]->codecpar->bit_rate;
>>> +        c->audio_pkt[i].pts  = (c->sys->height == 720) ? (c->frames & ~1) : c->frames;
>>>         ppcm[i] = c->audio_buf[i];
>>>     }
>>>     if (c->ach)
>>> @@ -401,10 +399,7 @@ int avpriv_dv_produce_packet(DVDemuxContext *c, AVPacket *pkt,
>>>             c->audio_pkt[2].size = c->audio_pkt[3].size = 0;
>>>         } else {
>>>             c->audio_pkt[0].size = c->audio_pkt[1].size = 0;
>>> -            c->abytes           += size;
>>>         }
>>> -    } else {
>>> -        c->abytes += size;
>>>     }
>>> 
>>>     /* Now it's time to return video packet */
>> 
>> Please correct me if I am wrong, but
>> in cases where no audio is missing or damaged, this would also ignore how
>> much audio is in each packet. So you could have, let's say, a timestamp
>> difference of exactly 1 second between 2 packets while there is actually
>> not exactly 1 second worth of audio samples between them.
> 
> This is true: by using the frame counter (and the video time base) for audio,
> we inherently lose some audio packet timestamp precision. However, I don't
> consider this a problem; audio timestamps do not have to be sample accurate,
> and for most formats they are not. Also, it is not practical to keep track of
> how many samples are in the packets; for example, when you seek, you
> obviously can't read all the audio data before the seek point to get a
> precise, sample-accurate timestamp.


Good point.

> What matters is that based on what I understand about the DV format (but 
> maybe Dave can confirm or deny this) the divergence between the audio 
> timestamp and the video timestamp in a DV frame must be less than 1/3 frame 
> duration even for unlocked mode:
> 
> http://www.adamwilt.com/DV-FAQ-tech.html#LockedAudio

The divergence could be a little larger than 1/3 frame in unlocked mode.
IEC 61834-2 defines the allowable range of minimum to maximum samples per frame
and the maximum allowable divergence of accumulated samples per frame.

Mode       | Min-Max   | Allowance of accumulated difference
NTSC 48000 | 1580-1620 | 20
NTSC 44100 | 1452-1489 | 19
NTSC 32000 | 1053-1080 | 14
PAL  48000 | 1896-1944 | 24
PAL  44100 | 1742-1786 | 22
PAL  32000 | 1264-1296 | 16

The divergence between the audio timestamp and video timestamp is conditional 
on the mode, so that would be

Mode       | Max divergence as fraction of frame duration
NTSC 48000 | 0.3742511235
NTSC 44100 | 0.3869807536
NTSC 32000 | 0.3929636797
PAL  48000 | 0.3125
PAL  44100 | 0.3117913832
PAL  32000 | 0.3125

0.3929636797 (about 39% of a frame duration, for NTSC at 32000 Hz) is the max
divergence, at least according to the spec’s limit of the allowable accumulated
difference.
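
For anyone who wants to re-derive these figures: they appear to be reproducible
as allowance * frame rate / nominal samples per frame, where nominal samples
per frame is sample rate / frame rate. The C sketch below assumes that formula;
it is inferred from the numbers in the table, not quoted from the spec, and the
helper name dv_max_divergence is made up for illustration (it is not part of
libavformat).

```c
#include <assert.h>

/* Hypothetical helper, not part of libavformat.
 * Returns the maximum audio/video divergence as a fraction of one frame
 * duration, under the ASSUMED formula (inferred from the table above):
 *   allowance * fps / (sample_rate / fps)
 * The frame rate is passed as the rational fps_num / fps_den,
 * e.g. 30000/1001 for NTSC and 25/1 for PAL. */
static double dv_max_divergence(int allowance, int sample_rate,
                                int fps_num, int fps_den)
{
    assert(sample_rate > 0 && fps_num > 0 && fps_den > 0);

    double fps = (double)fps_num / fps_den;
    /* Nominal (average) number of audio samples in one video frame. */
    double samples_per_frame = sample_rate / fps;

    return allowance * fps / samples_per_frame;
}
```

Under that assumption, dv_max_divergence(14, 32000, 30000, 1001) gives roughly
0.393 for NTSC at 32000 Hz, and dv_max_divergence(24, 48000, 25, 1) gives
0.3125 for PAL at 48000 Hz.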

> I believe this patch also fixes the timestamps when the audio clock is
> unlocked and has a completely independent clock source. (E.g. runs at a fixed
> 48009 Hz, see the "Unlocked audio: real life" section in the linked page)

I suppose it’s possible for a DV stream to use unlocked audio and ignore the
defined limits of allowable accumulated difference ("real life" or worse) and
fill max samples for each frame, which could give an effective sample rate of
48,600 Hz (1944 samples x 25 frames per second) for PAL at 48 kHz.
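
As a quick check of that arithmetic, here is a minimal C sketch that converts a
constant samples-per-frame count into an effective sample rate. The helper name
dv_effective_sample_rate is hypothetical, and it assumes every frame carries
the same number of samples, which is the worst case described above rather
than typical behavior.

```c
#include <assert.h>

/* Hypothetical helper, not part of libavformat.
 * Effective audio sample rate in Hz if EVERY frame carries exactly
 * samples_per_frame samples, at fps_num / fps_den frames per second. */
static double dv_effective_sample_rate(int samples_per_frame,
                                       int fps_num, int fps_den)
{
    assert(samples_per_frame > 0 && fps_num > 0 && fps_den > 0);

    return (double)samples_per_frame * fps_num / fps_den;
}
```

With the PAL 48000 limits from the table, 1896 and 1944 samples per frame at
25 frames per second correspond to 47400 Hz and 48600 Hz respectively.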

Kind Regards,
Dave Rice

