On Fri, Sep 06, 2019 at 05:00:14PM -0700, Niki Bowe wrote:
> As it turns out the last patch still lets a lot of jpegs get misidentified.
> 
> This new version avoids the problem by checking for jpeg magic at the start.
> 
> I also added a FATE test, and attached the jpeg for the test.
> 
> 
> 
> On Tue, Aug 27, 2019 at 6:30 PM Niki Bowe <nb...@google.com> wrote:
> 
> >
> >
> > On Sun, Aug 25, 2019 at 11:39 AM Michael Niedermayer
> > <mich...@niedermayer.cc> wrote:
> >
> >> On Fri, Aug 23, 2019 at 04:03:10PM -0700, Niki Bowe wrote:
> >> > On Thu, Aug 22, 2019 at 2:30 AM Paul B Mahol <one...@gmail.com> wrote:
> >> >
> >> > > On Thu, Aug 22, 2019 at 11:19 AM Carl Eugen Hoyos <ceffm...@gmail.com
> >> >
> >> > > wrote:
> >> > >
> >> > > > Am Mi., 21. Aug. 2019 um 23:05 Uhr schrieb Niki Bowe
> >> > > > <nbowe-at-google....@ffmpeg.org>:
> >> > > > >
> >> > > > > On Mon, Aug 19, 2019 at 7:22 PM Carl Eugen Hoyos <
> >> ceffm...@gmail.com>
> >> > > > wrote:
> >> > > > >
> >> > > > > >
> >> > > > > > This score would mean that mjpeg can never be detected.
> >> > > > > > I suspect you have to reduce one of the demuxers to "- 1".
> >> > > > > >
> >> > > > > >
> >> > > > > Thanks Carl.
> >> > > > > Attached patch to reduce mpeg probe by -1, which also fixes the
> >> issue.
> >> > > >
> >> > > > Sorry, I misread the original report, it looked to me as if mJpeg
> >> was
> >> > > > the culprit.
> >> > > >
> >> > > > Imo, the mpeg probing should be fixed (return a smaller value) for
> >> your
> >> > > > sample
> >> > > > by detecting that it is not mpeg, not by returning a smaller value
> >> for
> >> > > > all samples.
> >> > > >
> >> > >
> >> > > 1000% agree.
> >> > >
> >> > >
> >> > It didn't return a smaller value for all samples, only the "invalid VDR
> >> > files and short PES streams" case.
> >> > Most mpegps files still return 26 immediately, because they have pack
> >> > headers.
> >> >
> >> > However, here is another patch where I try to limit it to only changing
> >> > score for these jpegs.
> >> > I noticed that these jpegs had a lot of 0x00000100 sequences, which
> >> matches
> >> > mpeg picture header start code. I added another heuristic which matches
> >> > these jpegs, but not any mpegps files I could find.
> >> >
> >> > Alternatively I could make reduce score if it doesn't start with a start
> >> > code? At the moment its happy to search until it finds start codes.
> >> >
> >> >
> >> > Is everyone really sure the best approach is to modify mpegps_probe for
> >> > this?
> >> > The mpegps_probe function returns 25 in many instances where it may not
> >> be
> >> > mpegps. It does only minimal structural checking, and allows invalid
> >> data
> >> > to still classify as mpegps.
> >> > jpeg probing returns 25 in some cases where it is almost certainly a
> >> jpeg
> >> > (Has to go through multiple tags to get to SOS, many of which early out
> >> if
> >> > they find invalid data).
> >> > Note that 25 is still treated as "low confidence" for jpeg. It logs
> >> "Format
> >> > jpeg_pipe detected only with low score of 25, misdetection possible!"
> >> for
> >> > these jpegs.
> >> > So I still think a score of 25 is too low for these jpegs, and that a
> >> > better fix would be to return 26 for jpeg_pipe and mjpeg if it makes it
> >> > past multiple tags to SOS.
> >>
> >> jpegs can be in other container formats its not jpeg in that case but the
> >> other container format
> >>
> > about this patch
> >> it breaks this:
> >>
> >> ./ffplay tickets//3327/issue3327-libc-2.17.so
> >>
> >> https://trac.ffmpeg.org/raw-attachment/ticket/3327/issue3327-libc-2.17.so
> >>
> >> which is detected as mpeg after the patch.
> >> really nothing should be detected as mpeg after this that was not before
> >> the idea IIUC is make something that was detected as mpeg to be not
> >> anymore
> >>
> >>
> > Thanks Michael. Good find.
> > Attached patch which only applies the extra sanity checks to the "Invalid
> > VDR files and short PES streams" path.
> >
> >
> >
> >> Thanks
> >>
> >> [...]
> >>
> >>
> >> --
> >> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >>
> >> Awnsering whenever a program halts or runs forever is
> >> On a turing machine, in general impossible (turings halting problem).
> >> On any real computer, always possible as a real computer has a finite
> >> number
> >> of states N, and will either halt in less than N cycles or never halt.
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel@ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
> >
> >
> >
> > --
> >
> > Nikolas Bowe |  SWE |  nb...@google.com |  408-565-5137
> >
> 
> 
> -- 
> 
> Nikolas Bowe |  SWE |  nb...@google.com |  408-565-5137


>  libavformat/mpeg.c   |    5 +++++
>  tests/fate/probe.mak |    3 +++
>  2 files changed, 8 insertions(+)
> e74611d3e5d3eb5170b8145edacfed2f298f86fd  
> 0001-Fix-JPEGs-being-misidentified-as-mpeg.patch
> From 49fdc6549a0b8413f80aa16a14473e956d0ea08b Mon Sep 17 00:00:00 2001
> From: Nikolas Bowe <nb...@google.com>
> Date: Wed, 4 Sep 2019 18:23:16 -0700
> Subject: [PATCH] Fix JPEGs being misidentified as mpeg.
> 
> mpeg probing is very forgiving, willing to skip over arbitrary junk between 
> "start codes". Its not unusual to find things which look like start codes in 
> other formats. Normally that isn't a problem as other formats will get higher 
> probe scores, but jpeg probe scores will only be 25 unless it reaches the End 
> Of Image tag.
> For this reason, if it starts with JPEG magic, treat it as not an mpeg.
> ---
>  libavformat/mpeg.c   | 5 +++++
>  tests/fate/probe.mak | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/libavformat/mpeg.c b/libavformat/mpeg.c
> index 3205f209e6..d8657c397e 100644
> --- a/libavformat/mpeg.c
> +++ b/libavformat/mpeg.c
> @@ -96,6 +96,11 @@ static int mpegps_probe(const AVProbeData *p)
>              else if ((code & 0xf0) == VIDEO_ID && !pes) invalid++;
>              else if ((code & 0xe0) == AUDIO_ID && !pes) invalid++;
>              else if (code == PRIVATE_STREAM_1  && !pes) invalid++;
> +        } else if (i == 3) {
> +            if ((code & 0xFFFFFF00) == 0xFFD8FF00) {
> +                // Starts with JPEG magic.
> +                return 0;
> +            }
>          }
>      }

this will break detection of 1 out of 16M randomly cut mpegs
also add 1 byte and all this breaks down.

also if you find word files that are detected as mpeg, will you add a
check for word files in here ?

I dont think this is the correct approach.

can you provide links to the misdetected files ?

Thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato

Attachment: signature.asc
Description: PGP signature

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to