>2012/2/8 Gabriel Gazzán <gabcorreo at gmail.com>:

>> I think frame level accuracy is usually "not enough" precision, as also
>
>it is good enough for me


It really depends on your use case.

The ITU did a study and found that the threshold of detectability of lip sync 
errors is about +45 ms to -125 ms (audio early to audio late) and that the 
threshold of acceptability is about +90 ms to -185 ms. Apparently it is 
generally more tolerable for the audio to be slightly delayed than for it to be 
slightly early. 


If you only synchronize on video frame boundaries, then the worst case scenario 
would be +20 ms or -20 ms for 25 fps video (one frame lasts 40 ms, so that is 
+/- 1/2 video frame). The ITU research supports Dan's comment that most people 
can't even detect that much error. That may be acceptable if your project will 
be consumed directly. But if the output of your project is destined for further 
processing (like being transcoded by another system, or being sent through a 
broadcast chain), the downstream systems may add to the A/V sync error. If the 
error stacks up, it could exceed the threshold of detectability or possibly 
even the threshold of acceptability.
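
To put rough numbers on the stacking argument, here is a minimal sketch in 
Python. It assumes the threshold figures quoted above, and it assumes (purely 
for illustration) that every stage in the chain contributes a full half-frame 
error in the same direction, which is a pessimistic worst case and not a 
measurement of any real system. The function names are made up for this 
example, and treating a value exactly on a threshold as still inside it is my 
own choice.

```python
# Thresholds quoted in this thread, in milliseconds.
# Positive = audio early, negative = audio late.
DETECTABLE_MS = (-125.0, 45.0)
ACCEPTABLE_MS = (-185.0, 90.0)


def half_frame_ms(fps):
    """Worst-case error when sync snaps to the nearest video frame boundary."""
    return 1000.0 / fps / 2.0


def stacked_error_ms(fps, stages, audio_early=True):
    """Hypothetical worst case: each stage adds half a frame, same direction."""
    sign = 1.0 if audio_early else -1.0
    return sign * stages * half_frame_ms(fps)


def classify(error_ms):
    """Compare a signed A/V offset against the two threshold ranges."""
    low, high = DETECTABLE_MS
    if low <= error_ms <= high:
        return "undetectable"
    low, high = ACCEPTABLE_MS
    if low <= error_ms <= high:
        return "detectable but acceptable"
    return "unacceptable"


if __name__ == "__main__":
    for stages in (1, 3, 5):
        early = stacked_error_ms(25, stages, audio_early=True)
        late = stacked_error_ms(25, stages, audio_early=False)
        print(stages, early, classify(early), late, classify(late))
```

With 25 fps video, a single stage stays inside the detectability range in 
either direction. Under the same-direction assumption, three stages with the 
audio early (+60 ms) already cross the detectability threshold, while three 
stages with the audio late (-60 ms) do not, which matches the asymmetry the 
ITU found.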

So the amount of A/V error that is appropriate for your project depends on what 
you plan on doing with it.

~BM


