After another day of searching for solutions, I finally managed to come up with one. Remember I said that "-use_wallclock_as_timestamps 1" fixes the sync issues but causes the video to stutter? So I attempted to fix the stuttering with the following filter:

setts='max(floor(PTS/X)*X,if(N,PREV_OUTPTS+X))'

Note the X: it has to be substituted with a constant depending on the use case (recording/streaming), and/or the stream the filter is being applied to.

Explanation: this filter expects input timestamps to be generated from the wallclock time but not necessarily spread out evenly. To fix the stuttering, it snaps each timestamp down to a multiple of X, where X is the duration of one frame in timestamp units (for 25 FPS and millisecond timestamps this is 0.04*1000=40). It does so by dividing the timestamp by X, rounding the resulting frame number down to an integer, and multiplying it back by X. It also ensures that the timestamps are always increasing - if the adjusted value is less than the previous output timestamp plus one frame, then this sum is used as the output instead.
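To make the behaviour of the expression easier to follow, here is a minimal Python sketch that models it outside of FFmpeg. The function name snap_timestamps and the sample timestamps are made up for illustration. The sketch assumes integer timestamps and that "if(N,...)" evaluates to 0 for the first packet, which is how I understand the expression evaluator.

```python
def snap_timestamps(pts_list, x):
    """Rough model of setts='max(floor(PTS/X)*X,if(N,PREV_OUTPTS+X))'.

    pts_list: integer input timestamps in arrival order (hypothetical data).
    x: duration of one frame in timestamp units (e.g. 40 for 25 FPS in ms).
    """
    out = []
    for n, pts in enumerate(pts_list):
        # floor(PTS/X)*X: snap down to the nearest multiple of X
        snapped = (pts // x) * x
        # if(N, PREV_OUTPTS+X): assumed to yield 0 for the first packet (N == 0)
        lower_bound = out[-1] + x if n else 0
        out.append(max(snapped, lower_bound))
    return out


if __name__ == "__main__":
    jittery = [1000, 1043, 1078, 1125, 1131, 1204]  # uneven wallclock stamps
    print(snap_timestamps(jittery, 40))
    # prints [1000, 1040, 1080, 1120, 1160, 1200]
```

In this example the third and fifth frames would have snapped onto the same value as their predecessors, so the "previous output plus X" branch takes over and keeps the spacing at exactly one frame.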

There is also a catch here - the camera can sometimes alter the frame rate, which causes the formula to produce wrong results, since X assumes a fixed frame duration. Rectifying this is possible by assuming a constant frame rate for the input stream ("-r 25").

Also note that for smooth playback, the filter has to be applied to both audio and video. When recording segments without re-encoding audio, X should be set to 40 for both streams (assuming 25 FPS). Streaming, however, is another matter. When streaming over RTMP via "-f flv" (I use fifo+flv), X has to be set to 1 for the video stream (I think "floor" can be dropped in this case since all values seem to be integers, but I'm not entirely sure). Audio is another beast entirely: X should be 320 because the original sampling frequency is 8000 Hz mono, meaning 8000/25=320 samples per video frame. Since the audio is re-encoded for streaming, the expression also has to be applied with "asetpts" via "-af" instead of "setts" via "-bsf:a", so that it operates on the source data.
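The three values of X above all come from the same arithmetic: the number of timestamp units per second divided by the frame rate. Here is a small Python sketch of that calculation. The helper name step_per_frame is made up, and the tick rates passed to it (1000 for mkv, 25 for flv video, 8000 for the source audio) are simply what I observed, not something taken from the FFmpeg documentation.

```python
from fractions import Fraction


def step_per_frame(ticks_per_second, fps):
    """Return X: how many timestamp units elapse during one video frame.

    ticks_per_second: timestamp resolution of the stream (assumed/observed).
    fps: video frame rate (an int or a Fraction).
    """
    step = Fraction(ticks_per_second) / Fraction(fps)
    if step.denominator != 1:
        # The expression works on whole multiples of X, so a fractional
        # step would make the snapped timestamps drift.
        raise ValueError(f"X is not an integer: {step}")
    return int(step)


if __name__ == "__main__":
    print(step_per_frame(1000, 25))  # mkv segments, millisecond timestamps
    print(step_per_frame(8000, 25))  # 8000 Hz audio, timestamps in samples
    print(step_per_frame(25, 25))    # flv video, timestamps counted in frames
```

With these inputs the helper gives 40, 320 and 1, matching the constants used in the command line. A frame rate that does not divide the tick rate evenly (e.g. 30 FPS with millisecond timestamps) raises an error, because no integer X would fit.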

The final command-line is as follows (both recording and streaming):

ffmpeg -nostdin -flags low_delay -fflags +nobuffer+discardcorrupt \
-rtsp_transport tcp -timeout 3000000 -use_wallclock_as_timestamps 1 \
-r 25 -i rtsp://login:passw...@ip.ad.dre.ss:554/url \
-map 0:v -c:v copy -bsf:v setts='max(floor(PTS/40)*40,if(N,PREV_OUTPTS+40))' \
-map 0:a -c:a copy -bsf:a setts='max(floor(PTS/40)*40,if(N,PREV_OUTPTS+40))' \
-f segment -strftime 1 -reset_timestamps 1 -segment_atclocktime 1 -segment_time 600 "%Y-%m-%dT%H-%M-%S.mkv" \
-map 0:v -c:v copy -bsf:v setts='max(floor(PTS),if(N,PREV_OUTPTS+1))' \
-map 0:a -c:a aac -ar 48000 -ac 2 -b:a 128k -af asetpts='max(floor(PTS/320)*320,if(N,PREV_OUTPTS+320))' \
-f fifo -fifo_format flv -drop_pkts_on_overflow 1 -attempt_recovery 1 -recover_any_error 1 -format_opts flvflags=no_duration_filesize rtmp://<STREAM_URL>

NB: the current version of FFmpeg in the FreeBSD ports collection (4.4.2) needs these two patches for the proposed solution to work:

https://github.com/FFmpeg/FFmpeg/commit/301d275301d72387732ccdc526babaf984ddafe5
https://github.com/FFmpeg/FFmpeg/commit/b0b3fce3c33352a87267b6ffa51da31d5162daff

The first patch fixes the expression parser erroring out, and the second one fixes PREV_OUTPTS always being equal to NOPTS. Also, with that version, "-timeout" in the command above has to be replaced with "-stimeout".

I'm still not sure if this solution is the proper one. So far, it's been running for many hours, and the resulting video is smooth as butter, and without any gradually increasing audio/video lag. But it looks extremely overcomplicated, not to mention it took me several days of researching and analyzing the video files to implement. Also, I don't know where the timestamp drift actually occurs - most signs point to the camera, but there's also the fact that some sort of conversion takes place depending on the output (e.g. segment/mkv measures timestamps in 1/1000ths of a second, but flv measures them in frames), and it might be possible that there's a bug somewhere in there.

For simplicity though, let's assume there's no bug, and the fault occurs at the source. We know that the audio is always on time, so why not use the timestamps of the audio packets for the video too? E.g. for each incoming video frame, assign it the timestamp of the latest audio packet received (not the wallclock time). The problem is that "setts" filters cannot interact with each other, so it's not possible to use them for this purpose.
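Just to illustrate the idea (this is not something FFmpeg can do with "setts", as noted above), here is a minimal Python sketch of that retiming. Everything in it is hypothetical: the function name, the ("a"/"v", pts) packet representation, and the assumption that audio and video timestamps share one timebase.

```python
def retime_video_from_audio(packets):
    """Give each video packet the timestamp of the latest audio packet seen.

    packets: iterable of (kind, pts) tuples in arrival order, where kind is
    "a" for audio or "v" for video (a made-up representation).
    """
    last_audio_pts = None
    out = []
    for kind, pts in packets:
        if kind == "a":
            last_audio_pts = pts
            out.append((kind, pts))
        elif last_audio_pts is None:
            # No audio received yet: keep the video timestamp unchanged
            out.append((kind, pts))
        else:
            out.append((kind, last_audio_pts))
    return out


if __name__ == "__main__":
    arrivals = [("a", 0), ("v", 7), ("a", 40), ("v", 52), ("v", 55), ("a", 80)]
    print(retime_video_from_audio(arrivals))
```

Note that in this example two consecutive video frames end up with the same timestamp (40), so even this approach would still need a guard similar to the "PREV_OUTPTS+X" part of the filter to keep the video timestamps increasing.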

Well, even though I've managed to somehow deal with this problem, I'm still no expert. So further comments are still welcome. Until then, I hope the information provided in this thread will be useful to anybody who encounters a similar issue.

Thank you very much.

---
Kind regards,
Vladimir
_______________________________________________
ffmpeg-user mailing list
ffmpeg-user@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-user
