On Fri, Sep 06, 2019 at 05:00:14PM -0700, Niki Bowe wrote: > As it turns out the last patch still lets a lot of jpegs get misidentified. > > This new version avoids the problem by checking for jpeg magic at the start. > > I also added a FATE test, and attached the jpeg for the test. > > > > On Tue, Aug 27, 2019 at 6:30 PM Niki Bowe <nb...@google.com> wrote: > > > > > > > On Sun, Aug 25, 2019 at 11:39 AM Michael Niedermayer > > <mich...@niedermayer.cc> wrote: > > > >> On Fri, Aug 23, 2019 at 04:03:10PM -0700, Niki Bowe wrote: > >> > On Thu, Aug 22, 2019 at 2:30 AM Paul B Mahol <one...@gmail.com> wrote: > >> > > >> > > On Thu, Aug 22, 2019 at 11:19 AM Carl Eugen Hoyos <ceffm...@gmail.com > >> > > >> > > wrote: > >> > > > >> > > > Am Mi., 21. Aug. 2019 um 23:05 Uhr schrieb Niki Bowe > >> > > > <nbowe-at-google....@ffmpeg.org>: > >> > > > > > >> > > > > On Mon, Aug 19, 2019 at 7:22 PM Carl Eugen Hoyos < > >> ceffm...@gmail.com> > >> > > > wrote: > >> > > > > > >> > > > > > > >> > > > > > This score would mean that mjpeg can never be detected. > >> > > > > > I suspect you have to reduce one of the demuxers to "- 1". > >> > > > > > > >> > > > > > > >> > > > > Thanks Carl. > >> > > > > Attached patch to reduce mpeg probe by -1, which also fixes the > >> issue. > >> > > > > >> > > > Sorry, I misread the original report, it looked to me as if mJpeg > >> was > >> > > > the culprit. > >> > > > > >> > > > Imo, the mpeg probing should be fixed (return a smaller value) for > >> your > >> > > > sample > >> > > > by detecting that it is not mpeg, not by returning a smaller value > >> for > >> > > > all samples. > >> > > > > >> > > > >> > > 1000% agree. > >> > > > >> > > > >> > It didn't return a smaller value for all samples, only the "invalid VDR > >> > files and short PES streams" case. > >> > Most mpegps files still return 26 immediately, because they have pack > >> > headers. > >> > > >> > However, here is another patch where I try to limit it to only changing > >> > score for these jpegs. > >> > I noticed that these jpegs had a lot of 0x00000100 sequences, which > >> matches > >> > mpeg picture header start code. I added another heuristic which matches > >> > these jpegs, but not any mpegps files I could find. > >> > > >> > Alternatively I could make reduce score if it doesn't start with a start > >> > code? At the moment its happy to search until it finds start codes. > >> > > >> > > >> > Is everyone really sure the best approach is to modify mpegps_probe for > >> > this? > >> > The mpegps_probe function returns 25 in many instances where it may not > >> be > >> > mpegps. It does only minimal structural checking, and allows invalid > >> data > >> > to still classify as mpegps. > >> > jpeg probing returns 25 in some cases where it is almost certainly a > >> jpeg > >> > (Has to go through multiple tags to get to SOS, many of which early out > >> if > >> > they find invalid data). > >> > Note that 25 is still treated as "low confidence" for jpeg. It logs > >> "Format > >> > jpeg_pipe detected only with low score of 25, misdetection possible!" > >> for > >> > these jpegs. > >> > So I still think a score of 25 is too low for these jpegs, and that a > >> > better fix would be to return 26 for jpeg_pipe and mjpeg if it makes it > >> > past multiple tags to SOS. > >> > >> jpegs can be in other container formats its not jpeg in that case but the > >> other container format > >> > > about this patch > >> it breaks this: > >> > >> ./ffplay tickets//3327/issue3327-libc-2.17.so > >> > >> https://trac.ffmpeg.org/raw-attachment/ticket/3327/issue3327-libc-2.17.so > >> > >> which is detected as mpeg after the patch. > >> really nothing should be detected as mpeg after this that was not before > >> the idea IIUC is make something that was detected as mpeg to be not > >> anymore > >> > >> > > Thanks Michael. Good find. > > Attached patch which only applies the extra sanity checks to the "Invalid > > VDR files and short PES streams" path. > > > > > > > >> Thanks > >> > >> [...] > >> > >> > >> -- > >> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > >> > >> Awnsering whenever a program halts or runs forever is > >> On a turing machine, in general impossible (turings halting problem). > >> On any real computer, always possible as a real computer has a finite > >> number > >> of states N, and will either halt in less than N cycles or never halt. > >> _______________________________________________ > >> ffmpeg-devel mailing list > >> ffmpeg-devel@ffmpeg.org > >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > >> > >> To unsubscribe, visit link above, or email > >> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". > > > > > > > > -- > > > > Nikolas Bowe | SWE | nb...@google.com | 408-565-5137 > > > > > -- > > Nikolas Bowe | SWE | nb...@google.com | 408-565-5137
> libavformat/mpeg.c | 5 +++++ > tests/fate/probe.mak | 3 +++ > 2 files changed, 8 insertions(+) > e74611d3e5d3eb5170b8145edacfed2f298f86fd > 0001-Fix-JPEGs-being-misidentified-as-mpeg.patch > From 49fdc6549a0b8413f80aa16a14473e956d0ea08b Mon Sep 17 00:00:00 2001 > From: Nikolas Bowe <nb...@google.com> > Date: Wed, 4 Sep 2019 18:23:16 -0700 > Subject: [PATCH] Fix JPEGs being misidentified as mpeg. > > mpeg probing is very forgiving, willing to skip over arbitrary junk between > "start codes". Its not unusual to find things which look like start codes in > other formats. Normally that isn't a problem as other formats will get higher > probe scores, but jpeg probe scores will only be 25 unless it reaches the End > Of Image tag. > For this reason, if it starts with JPEG magic, treat it as not an mpeg. > --- > libavformat/mpeg.c | 5 +++++ > tests/fate/probe.mak | 3 +++ > 2 files changed, 8 insertions(+) > > diff --git a/libavformat/mpeg.c b/libavformat/mpeg.c > index 3205f209e6..d8657c397e 100644 > --- a/libavformat/mpeg.c > +++ b/libavformat/mpeg.c > @@ -96,6 +96,11 @@ static int mpegps_probe(const AVProbeData *p) > else if ((code & 0xf0) == VIDEO_ID && !pes) invalid++; > else if ((code & 0xe0) == AUDIO_ID && !pes) invalid++; > else if (code == PRIVATE_STREAM_1 && !pes) invalid++; > + } else if (i == 3) { > + if ((code & 0xFFFFFF00) == 0xFFD8FF00) { > + // Starts with JPEG magic. > + return 0; > + } > } > } this will break detection of 1 out of 16M randomly cut mpegs also add 1 byte and all this breaks down. also if you find word files that are detected as mpeg, will you add a check for word files in here ? I dont think this is the correct approach. can you provide links to the misdetected files ? Thanks [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws. -- Plato
signature.asc
Description: PGP signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".