Just one of those days, weeks, months, years.... Sorry... and thank you, Ken!
AIFF: we're now getting more EOFs than we were. This might be a Java issue, but I don't think there's anything to do at the Tika level. I don't remember any changes in the AudioParser in 1.27. I'll dig into it, but I'm concerned...famous last words...

o.a.t.exception.TikaException
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:287)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    at o.a.t.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
    at o.a.t.parser.DigestingParser.parse(DigestingParser.java:84)
    at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
    at o.a.t.parser.RecursiveParserWrapper$EmbeddedParserDecorator.parse(RecursiveParserWrapper.java:376)
    at o.a.t.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at o.a.t.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104)
    at o.a.t.parser.pkg.RarParser.parse(RarParser.java:95)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    at o.a.t.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
    at o.a.t.parser.DigestingParser.parse(DigestingParser.java:84)
    at o.a.t.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:239)
    at o.a.t.batch.FileResourceConsumer.parse(FileResourceConsumer.java:406)
    at o.a.t.batch.fs.RecursiveParserWrapperFSConsumer.processFileResource(RecursiveParserWrapperFSConsumer.java:105)
    at o.a.t.batch.FileResourceConsumer._processFileResource(FileResourceConsumer.java:181)
    at o.a.t.batch.FileResourceConsumer.call(FileResourceConsumer.java:115)
    at o.a.t.batch.FileResourceConsumer.call(FileResourceConsumer.java:50)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java
    at com.sun.media.sound.AiffFileReader.getCOMM(AiffFileReader.java:267)
    at com.sun.media.sound.AiffFileReader.getAudioFileFormat(AiffFileReader.java:76)
    at javax.sound.sampled.AudioSystem.getAudioFileFormat(AudioSystem.java:1004)
    at o.a.t.parser.audio.AudioParser.parse(AudioParser.java:73)
    at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
    ... 28 more

On Wed, Jun 30, 2021 at 4:37 PM Ken Krugler <kkrugler_li...@transpac.com> wrote:
> Hi Tim,
>
> Don’t leave us hanging… :)
>
> — Ken
>
> On Jun 30, 2021, at 12:47 PM, Tim Allison <talli...@apache.org> wrote:
>
> There's an apparent change in mime detection: application/msword ->
> application/pkcs7-signature and a few other file formats are now
> apparently being detected as pkcs7-signature...
>
> This is an artifact of tika-eval and not a problem. The issue is that
> we used to parse files wrapped in pkcs7 sigs twice, and tika-eval
> failed to match up different numbers of attachments.
>
> There may be a genuine new issue with
>
>
> On Wed, Jun 30, 2021 at 3:06 PM Tim Allison <talli...@apache.org> wrote:
>
> Reports are here:
> https://corpora.tika.apache.org/base/reports/tika-1.27-pre-rc1-reports.tgz
>
> I've since fixed the MP4 issue.
>
> I'm prepping 1.27-rc1 now.
>
> On Mon, Jun 28, 2021 at 3:56 PM Tim Allison <talli...@apache.org> wrote:
>
> Updated dependencies that I could. Kicking off regression tests now.
> Onwards to 1.27!
>
> Cheers,
>
> Tim
>
> On Mon, Jun 28, 2021 at 1:11 PM Nicholas DiPiazza
> <nicholas.dipia...@gmail.com> wrote:
>
> +1 on 1.27 release.
>
> On Mon, Jun 28, 2021, 10:57 AM Tim Allison <talli...@apache.org> wrote:
>
> All,
> The recent release of PDFBox fixed 2 DoS CVEs. Let's update our
> dependencies and go for a 1.27 release soon? Any blockers? Any
> strong prefs to go for a 2.0.0 or 2.0.0-BETA2 first?
>
> Cheers,
>
> Tim
>
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink, Pinot, Solr, Elasticsearch