On Tue, Jul 03, 2018 at 10:25:36PM +0200, Karsten Otto wrote: > TL;DR: I will drop patch 3/3, may rather spend some time investigating why > "ff ee 47 9d“ passes the mp3 header parser. Also, the aa file "index" cannot > be used for frame or chapter detection, unfortunately. > > More details inline below. > > > Am 03.07.2018 um 02:32 schrieb Michael Niedermayer <mich...@niedermayer.cc>: > > > > Signierter PGP-Teil > > On Mon, Jul 02, 2018 at 07:21:43PM +0200, Karsten Otto wrote: > >> > >>> Am 02.07.2018 um 10:59 schrieb Michael Niedermayer > >>> <mich...@niedermayer.cc>: > >>> > >>> Signierter PGP-Teil > >>> On Thu, Jun 21, 2018 at 06:58:26PM +0200, Karsten Otto wrote: > >>>> MP3 frames may not be aligned to aa chunk boundaries. After seeking, > >>>> scan for the next valid frame header. Then truncate the packet, and > >>>> also adjust timestamp information accordingly. > >>>> --- > >>>> libavformat/aadec.c | 33 ++++++++++++++++++++++++++++----- > >>>> 1 file changed, 28 insertions(+), 5 deletions(-) > >>> > >>> Please see AVSTREAM_PARSE_TIMESTAMPS > >>> > >>> This codec specific code in demuxers should not be needed > >>> > >> I tried that before, and you are right that it takes care of timestamp > >> adjustments. > >> > >> However, after a seek the parsed packet still contains a partial frame > >> before the > >> next full one. I had expected libavformat/mpegaudio_parser.c to detect this > >> situation and discard the fragment, but unfortunately it does not. Instead > >> it passes > >> it unchanged to the codec, which plays it as a pop or even a very ugly > >> BLEEEP - > >> painful while wearing headphones! > > > > I think you mis-diagnose this at least somewhat > > your code searches for a specific mp3 header, the parser and decoder would > > accept a wider range of mp3 variants. > > But both can choose points that are not mp3 frame starts. (if that is the > > problem you are seeing, iam not completely sure it is) > > > It took a closer look at what happens when I hear a BLEEP: The packet begins > with a partial frame, starting with the byte sequence "ff ee 47 9d“. > Unfortunately, > the mp3 parser indeed accepts this as a valid mp3 header, causing the noise. > By looking for the more restricted header, my patch finds the real next frame > at > offset 78. > > BTW: Should this sequence actually pass? AFAIK 01 is not a valid MPEG audio > version ID? > > > Also is the more restricted header you search for always used or could > > this be failing with some files ? > > > Good question. So far, all mp3 aa files I tested with matched the format > (MPEG 2 > Layer III at 32 kbps and 22kHz). I doubt there are other variants, but can’t > be sure. > > > Either way, looking at the demuxer a bit deeper, theres a TOC list in the > > main header which points to chunks. The one file i found has 12 such chunks > > the first represents the whole file i suspect, the next largest the audio > > data, another one the metadata. > > I guess the remaining 2 large ones could be a cover image and an index. > Correct, seems like all aa files have the TOC, but its entries can be in a > different > order in each file. I guess thats why the original aadec.c implementation just > looks for the largest chunk to play. > > > I didnt really look at it, but theres a table in there with pairs of 32bit > > values. the first in the file i have goes from 0 to 3 the second starts > > multiple times from 0 and seems monotonly increasing and staying within > > the filesize. > > The sample i have does not store mp3 but it looks like this is a index > > maybe offsets for packets in each of the 3 chapters. > > > > Please look at the data, if it can be used. It would be much better than > > scaning the file linearly and searching for some byte sequence to find > > packet starts. > > > Short answer: Sorry, it is not possible to derive frame offsets nor chapter > offsets from the index. >
> Long answer: > All offsets in the index are the same, and matching the "codec_second_size" > = crypto chunk size, roughly one second of audio: > - 1045 for format 2 (SIPR 8kbps) > - 2000 for format 3 (SIPR 16kbps) > - 3982 for format 4 (MP3 32kbps) > This is different from the respective frame size, which is 19, 20, and 104/105 The first 2 are exact multiples of the frame size. I dont have a sample for the mp3 case so i cannot check but are you sure the actually writen numbers dont match up multiples of mp3 frames ? If not the question that remains is how does the official code seek in this? It seems a bit odd that they would get the design of the index wrong but then implement demuxer side searching for a mp3 header not noticing that this is not working fully reliable. [This would imply that while they implemented the demuxer they would have been aware that their index doesnt work, why would they not fix it at this point ...] Its hard to imagine they would not notice this bug with the index at all, not that this is impossible or anything [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB The real ebay dictionary, page 2 "100% positive feedback" - "All either got their money back or didnt complain" "Best seller ever, very honest" - "Seller refunded buyer after failed scam"
signature.asc
Description: PGP signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel