Derek Buitenhuis <derek.buitenhuis <at> gmail.com> writes:

> On 2/25/2016 4:30 PM, Carl Eugen Hoyos wrote:
> >> In terms of how the score for a MIME type match compares with
> >> those of the individual content matching probe functions, I'd
> >> say it makes sense. The stronger probing functions have a
> >> score which reflects their reliability.
> >
> > But even _EXTENSION + 1 is correct in practically all cases
> > (the exception are mpeg streams that start with the needed four
> > bytes) and should not be beaten by mime_type.
>
> URLs with no extension at all, or extensions with query params
> after them are very normal. This argument is silly.

You misunderstand: Most "imageauto" demuxers return a score of 51.
In all cases when 51 is returned, the detection is nearly certain.
Your patch disables the detection for all these cases (although
it is nearly certain) in favour of the mime type.
(That is how I understand the patch, please correct me if setting
mime_type actually has another effect than returning 75 when the
mime type matches.)
I don't think the solution is to increase the score for these
probe functions: mpeg streams may start with any bytes, and a
match on 32 bits should not return the maximum (or nearly
maximum) score.

> >> Improves probing, especially over http when there is a
> >> Content-Type header
> >
> > Please give an example of a failing stream.
> > If you cannot share, please provide console output and
> > hexdump of the relevant bytes.
>
> ... your argument is "it's not 100% broken so let's not
> improve it"? Really?

My argument is: Let's please improve the jpeg auto-detection
(the only one that shows issues) instead of disabling it.

[...]

> > The alternative is to only use mime type for jpeg: I am
> > assuming this is the only problematic case, note that we
> > have to make sure it's not mjpeg, that's why the probe
> > function is so long.
>
> Um, what? Why the heck would you only use mime-type for
> JPEG? It is definitely not the only problematic case, and
> this sounds utterly silly.

(You should be slightly more careful with your arguments:
I originally thought that there might be one or two probe
functions that should sometimes return a score instead of
0 because the decoders may accept invalid files, but that
is not the case except for the case without mime_type.)

There are 14 "imageauto" demuxers with image type auto-detection.
Three (png, dds, webp) return a score of 99; I don't know whether
they would be affected by the patch in question, but none of the
three decoders works if the respective probe function fails.
The qdraw decoder fails for samples that fail auto-detection.
Of the nine image format types with probe functions that return 51,
the following fail decoding if the probe function returns 0:
bmp, dpx, exr, j2k, sgi, sunrast, tiff
The pictor decoder fails for images that are not auto-detected.
That leaves only the jpegls auto-detection, which may not return a
score for images that can be decoded: Since the patch you provided
does not add a mime_type for jpegls, I don't think it can be used
as an argument here.

So let's improve the jpeg auto-detection; I know it isn't perfect yet.

> >> If a MIME type is specified, that does have significant
> >> weight in terms of probability, given that it was set
> >> somewhere either by a client or server program doing a
> >> content check or explicitly by some user.
> >
> > Wouldn't the server simply run "file" doing less checks
> > than we already do now? (If it doesn't trust the extension.)
>
> No. This is especially not true on modern CDNs (such as AWS or
> GCS), where files are stored as 'blob's and mimetype can be
> set properly during the push to storage. It's not 1999 anymore.

I am sorry if I misunderstand this (who does "the push"?):
Are you arguing that user-set mime_types are (except for jpeg) more
plausible than an auto-detection that never returns 0 for images
that can be decoded?

Carl Eugen
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
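
To illustrate the score comparison argued about in this message (51 for
most "imageauto" probe functions, 75 for a mime_type match), here is a
minimal Python sketch. It is a toy model only, not FFmpeg's code and not
the patch under review: pick_format, SCORE_EXTENSION and SCORE_MIME are
hypothetical names, the value 50 is inferred from "_EXTENSION + 1" being
51, and the mislabelled BMP file is an invented example.

```python
# Toy model of probe-score selection. NOT FFmpeg's implementation;
# all names here are hypothetical and exist only for illustration.
SCORE_EXTENSION = 50  # assumed: the thread calls 51 "_EXTENSION + 1"
SCORE_MIME = 75       # score the thread attributes to a mime_type match


def pick_format(content_scores, mime_matches, use_mime=True):
    """Return (name, score) of the highest-scoring candidate format.

    content_scores: dict mapping format name -> content probe score
    mime_matches:   set of format names whose MIME type matches the
                    type declared for the input
    use_mime:       if True, a MIME match raises a score to SCORE_MIME
    """
    best_name, best_score = None, 0
    for name, score in content_scores.items():
        if use_mime and name in mime_matches:
            score = max(score, SCORE_MIME)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Invented case: the bytes look like BMP (probe returns 51), but the
# input was declared as image/jpeg.
content = {"bmp": SCORE_EXTENSION + 1, "jpeg": 0}
print(pick_format(content, {"jpeg"}, use_mime=False))  # ('bmp', 51)
print(pick_format(content, {"jpeg"}, use_mime=True))   # ('jpeg', 75)
```

Under these assumptions a declared mime type outranks a content probe
that returned 51, which is the behaviour objected to above; whether the
real patch behaves this way is exactly the question asked in the message.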