On Wed, Jan 08, 2020 at 16:32:39 +0900, Ted Lee wrote: > Dear FFmpeg developers, > I'm glad to have a chance to contribute to FFmpeg.
Welcome. > *Summary:* > > - Test signal: 790_pomnyun_taebaek2.mp3 > - MP3 file with the same left channel and right channel except for > the first 7 seconds and the last 10 seconds. > (download: > > <https://www.dropbox.com/s/sivz18sixvhoo66/790_pomnyun_taebaek2.mp3?dl=0> > https://www.dropbox.com/s/sivz18sixvhoo66/790_pomnyun_taebaek2.mp3?dl=0 > ) > - When decoding the test signal using FFmpeg as below, a difference > occurs in the same part of L and R. > - $ffmpeg -i 790_pomnyun_taebaek2.mp3 -acodec pcm_s16le > 790_pomnyun_taebaek2.wav > > > Expected result (Adobe Audition or Core Audio): > > Decoded PCM > [image: image.png] > (L-R) > [image: image.png] > (You can reproduce it using Adobe Audition or CoreAudio of MacOS) > > Actual result (FFmpeg): > > Decoded PCM > [image: image.png] > > (L-R) > [image: image.png] To put it in other words: You expect the monoaural section of this file to be decoded with identical left and right channels. ffmpeg does not achieve this. In fact, there seem to be ~50 dB too much "noise". [See mean/max_volume results below, before and after subtraction.] Original ffmpeg: barsnick@sunshine:/usr/new/tools/video/ffmpeg/ffmpeg-snapshot-2020-01-05 > ../ffmpeg-build-2020-01-05/ffmpeg_g -ss 10 -t 10:00 -i ~/tmp/790_pomnyun_taebaek2.mp3 -af volumedetect,"pan=mono|c0=c0-c1",volumedetect -f null - [...] [Parsed_volumedetect_0 @ 0xa5bb380] n_samples: 26460000 [Parsed_volumedetect_0 @ 0xa5bb380] mean_volume: -25.1 dB [Parsed_volumedetect_0 @ 0xa5bb380] max_volume: 0.0 dB [Parsed_volumedetect_0 @ 0xa5bb380] histogram_0db: 594 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_1db: 671 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_2db: 1033 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_3db: 1794 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_4db: 3067 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_5db: 4940 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_6db: 7871 [Parsed_volumedetect_0 @ 0xa5bb380] histogram_7db: 12062 [Parsed_volumedetect_2 @ 0xa5c3b40] n_samples: 13230000 [Parsed_volumedetect_2 @ 0xa5c3b40] mean_volume: -41.2 dB [Parsed_volumedetect_2 @ 0xa5c3b40] max_volume: -15.5 dB [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_15db: 29 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_16db: 154 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_17db: 159 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_18db: 308 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_19db: 627 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_20db: 1130 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_21db: 1932 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_22db: 3433 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_23db: 5099 [Parsed_volumedetect_2 @ 0xa5c3b40] histogram_24db: 8002 I can at least confirm that other decoders achieve a better equality. Using mpg321 with libmad-0.15.1b (to convert to wav and analyze with ffmpeg), I get: [Parsed_volumedetect_0 @ 0xbe50d00] n_samples: 26460000 [Parsed_volumedetect_0 @ 0xbe50d00] mean_volume: -26.9 dB [Parsed_volumedetect_0 @ 0xbe50d00] max_volume: -0.1 dB [Parsed_volumedetect_0 @ 0xbe50d00] histogram_0db: 186 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_1db: 310 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_2db: 556 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_3db: 866 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_4db: 1354 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_5db: 2158 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_6db: 3680 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_7db: 6260 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_8db: 8990 [Parsed_volumedetect_0 @ 0xbe50d00] histogram_9db: 13236 [Parsed_volumedetect_2 @ 0xbe4c180] n_samples: 13230000 [Parsed_volumedetect_2 @ 0xbe4c180] mean_volume: -85.5 dB [Parsed_volumedetect_2 @ 0xbe4c180] max_volume: -62.7 dB [Parsed_volumedetect_2 @ 0xbe4c180] histogram_62db: 1 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_63db: 9 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_64db: 13 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_65db: 26 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_66db: 37 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_67db: 35 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_68db: 68 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_69db: 54 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_70db: 73 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_71db: 86 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_72db: 118 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_73db: 340 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_74db: 9490 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_75db: 0 [Parsed_volumedetect_2 @ 0xbe4c180] histogram_76db: 105618 With your patch, ffmpeg achieves this (very similar for both mp3 and mp3float), which is even closer to identity: [Parsed_volumedetect_0 @ 0xa6e3380] n_samples: 26460000 [Parsed_volumedetect_0 @ 0xa6e3380] mean_volume: -26.9 dB [Parsed_volumedetect_0 @ 0xa6e3380] max_volume: -0.1 dB [Parsed_volumedetect_0 @ 0xa6e3380] histogram_0db: 186 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_1db: 310 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_2db: 556 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_3db: 866 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_4db: 1354 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_5db: 2158 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_6db: 3680 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_7db: 6258 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_8db: 8990 [Parsed_volumedetect_0 @ 0xa6e3380] histogram_9db: 13236 [Parsed_volumedetect_2 @ 0xa6ebb40] n_samples: 13230000 [Parsed_volumedetect_2 @ 0xa6ebb40] mean_volume: -91.0 dB [Parsed_volumedetect_2 @ 0xa6ebb40] max_volume: -59.4 dB [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_59db: 5 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_60db: 3 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_61db: 1 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_62db: 19 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_63db: 29 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_64db: 34 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_65db: 65 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_66db: 91 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_67db: 53 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_68db: 136 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_69db: 74 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_70db: 107 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_71db: 120 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_72db: 139 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_73db: 148 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_74db: 230 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_75db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_76db: 283 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_77db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_78db: 336 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_79db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_80db: 491 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_81db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_82db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_83db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_84db: 988 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_85db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_86db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_87db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_88db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_89db: 0 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_90db: 2788 [Parsed_volumedetect_2 @ 0xa6ebb40] histogram_91db: 13223860 I can't judge whether what you are doing is correct, but I can confirm the observations. > >From 754545f6c675aede07abd912992f9ed4a17d4ddc Mon Sep 17 00:00:00 2001 > From: Ted <t...@gaudiolab.com> > Date: Wed, 8 Jan 2020 11:53:56 +0900 > Subject: [PATCH] avfilter/mpegaudiodec_template: Fix wrong decision of > illegal > intensity stereo position for MPEG-2 LSF The patch is corrupted by newlines, inserted by your mailer. I managed to fix that manually, but to ease review, please attach a patch created by "git format-patch" to your e-mail (as an attachment), or use "git send-email". > Signed-off-by: Ted <t...@gaudiolab.com> You may want to provide a full real name, as this is what is registered forever in the repositories. > --- > libavcodec/mpegaudiodec_template.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) Since you are changing the decoder, you must also make sure that fate testsuite does not detect changed results. Or if it does and such changes are expected, the fate tests need to be adapted. (If you can't do this, others may assist you, at least with checking. It's a bit more of a hassle.) It might also be worth adding such an MPEG-2 LSF sample to fate, if those in fate don't cover this yet. (The MP§ decoder source file does have a big TODO "test lsf / mpeg25 extensively".) Cheers, Moritz _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".