Possibly in response to a recent PDF Days talk (?)[0], Micky Lindlar asked on Twitter whether anyone had seen JPEG XL files in the wild[1].
I added jxl detection to Tika and re-detected all the files that had been previously identified as "application/octet-stream". I found ~462 likely jxl files. I have not yet looked for them embedded in other files.

I've tgz'd the files (20M) and made them available here:
https://corpora.tika.apache.org/base/share/CC-MAIN-2021-31-jxls.tgz

For those interested in JXL, Jon Sneyers also pointed to this resource:
https://github.com/libjxl/conformance

Cheers,

Tim

[0] https://twitter.com/CHLThor/status/1443585512426520584?s=20 and https://www.pdfa.org/presentation/a-work-in-progress-pdf-r-revisions-and-new-highly-compressed-image-format/
[1] https://twitter.com/MickyLindlar/status/1443585512258695169?s=20
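P.S. For anyone who wants to run a similar check on their own files, here is a minimal sketch of signature-based jxl detection. It is not the rule that went into Tika, only an illustration that assumes the two leading signatures JPEG XL files use: FF 0A for a bare codestream, and a 12-byte signature box for the container format. The function names are made up for this example. Since the bare codestream signature is only two bytes long, a match on it is a hint rather than proof, so hits are best treated as "likely" jxl until a decoder confirms them.

```python
from typing import Optional

# Assumed signatures (not taken from the Tika change itself):
# a bare codestream starts with FF 0A; a container file starts with a
# 12-byte signature box: 00 00 00 0C 'J' 'X' 'L' ' ' 0D 0A 87 0A.
JXL_CODESTREAM_MAGIC = b"\xff\x0a"
JXL_CONTAINER_MAGIC = b"\x00\x00\x00\x0cJXL \x0d\x0a\x87\x0a"


def sniff_jxl(header: bytes) -> Optional[str]:
    """Return "container", "codestream", or None for the leading bytes."""
    # Test the longer, more specific signature first.
    if header.startswith(JXL_CONTAINER_MAGIC):
        return "container"
    if header.startswith(JXL_CODESTREAM_MAGIC):
        return "codestream"
    return None


def sniff_jxl_file(path: str) -> Optional[str]:
    """Read just enough of the file to test both signatures."""
    with open(path, "rb") as f:
        return sniff_jxl(f.read(len(JXL_CONTAINER_MAGIC)))
```

Calling sniff_jxl_file() on each file previously typed as "application/octet-stream" gives a rough first pass; it does not look inside other files for embedded jxl.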