Just one of those days, weeks, months, years.... Sorry... and thank you,
Ken!

AIFF: we're now getting more EOFExceptions than we were.  This might be a
Java issue, but I don't think there's anything to do at the Tika level.  I
don't remember any changes in the AudioParser in 1.27.  I'll dig into it,
but I'm concerned...famous last words...

o.a.t.exception.TikaException
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:287)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
at o.a.t.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
at o.a.t.parser.DigestingParser.parse(DigestingParser.java:84)
at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
at o.a.t.parser.RecursiveParserWrapper$EmbeddedParserDecorator.parse(RecursiveParserWrapper.java:376)
at o.a.t.parser.DelegatingParser.parse(DelegatingParser.java:72)
at o.a.t.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104)
at o.a.t.parser.pkg.RarParser.parse(RarParser.java:95)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
at o.a.t.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at o.a.t.parser.ParserDecorator.parse(ParserDecorator.java:188)
at o.a.t.parser.DigestingParser.parse(DigestingParser.java:84)
at o.a.t.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:239)
at o.a.t.batch.FileResourceConsumer.parse(FileResourceConsumer.java:406)
at o.a.t.batch.fs.RecursiveParserWrapperFSConsumer.processFileResource(RecursiveParserWrapperFSConsumer.java:105)
at o.a.t.batch.FileResourceConsumer._processFileResource(FileResourceConsumer.java:181)
at o.a.t.batch.FileResourceConsumer.call(FileResourceConsumer.java:115)
at o.a.t.batch.FileResourceConsumer.call(FileResourceConsumer.java:50)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java
at com.sun.media.sound.AiffFileReader.getCOMM(AiffFileReader.java:267)
at com.sun.media.sound.AiffFileReader.getAudioFileFormat(AiffFileReader.java:76)
at javax.sound.sampled.AudioSystem.getAudioFileFormat(AudioSystem.java:1004)
at o.a.t.parser.audio.AudioParser.parse(AudioParser.java:73)
at o.a.t.parser.CompositeParser.parse(CompositeParser.java:281)
... 28 more
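
A minimal sketch for poking at this outside of Tika, assuming the EOF comes
from a file that is cut off inside the COMM chunk (not confirmed for the
regression files).  The class name, the helper methods and the hand-built
header bytes are all made up for illustration; only
AudioSystem.getAudioFileFormat is taken from the trace above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class TruncatedAiffProbe {

    // Hypothetical helper: an AIFF header that stops partway through COMM.
    static byte[] truncatedAiff() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeBytes("FORM");
        out.writeInt(46);      // declared FORM size (arbitrary; data ends early anyway)
        out.writeBytes("AIFF");
        out.writeBytes("COMM");
        out.writeInt(18);      // standard COMM chunk length
        out.writeShort(1);     // channel count
        // frame count, sample size and sample rate are deliberately missing
        out.flush();
        return bos.toByteArray();
    }

    // Classifies what the JDK's audio file readers do with the given bytes.
    static String probe(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            AudioSystem.getAudioFileFormat(in);
            return "parsed";
        } catch (EOFException e) {
            return "eof";
        } catch (UnsupportedAudioFileException e) {
            return "unsupported";
        } catch (IOException e) {
            return "io";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(probe(truncatedAiff()));
    }
}
```

Running this on the JDKs used for the old and new regression runs would
show whether the difference is in the JDK rather than in Tika; the result
may differ between JDK versions.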


On Wed, Jun 30, 2021 at 4:37 PM Ken Krugler <kkrugler_li...@transpac.com>
wrote:

> Hi Tim,
>
> Don’t leave us hanging… :)
>
> — Ken
>
> On Jun 30, 2021, at 12:47 PM, Tim Allison <talli...@apache.org> wrote:
>
> There's an apparent change in mime detection: application/msword ->
> application/pkcs7-signature and a few other file formats are now
> apparently being detected as pkcs7-signature...
>
> This is an artifact of tika-eval and not a problem.  The issue is that
> we used to parse files wrapped in pkcs7 sigs twice, and tika-eval
> failed to match up different numbers of attachments.
>
> There may be a genuine new issue with
>
>
> On Wed, Jun 30, 2021 at 3:06 PM Tim Allison <talli...@apache.org> wrote:
>
>
> Reports are here:
> https://corpora.tika.apache.org/base/reports/tika-1.27-pre-rc1-reports.tgz
>
> I've since fixed the MP4 issue.
>
> I'm prepping 1.27-rc1 now.
>
> On Mon, Jun 28, 2021 at 3:56 PM Tim Allison <talli...@apache.org> wrote:
>
>
> Updated dependencies that I could.  Kicking off regression tests now.
> Onwards to 1.27!
>
> Cheers,
>
>         Tim
>
> On Mon, Jun 28, 2021 at 1:11 PM Nicholas DiPiazza
> <nicholas.dipia...@gmail.com> wrote:
>
>
> +1 on 1.27 release.
>
> On Mon, Jun 28, 2021, 10:57 AM Tim Allison <talli...@apache.org> wrote:
>
>
> All,
>  The recent release of PDFBox fixed 2 DoS CVEs.  Let's update our
> dependencies and go for a 1.27 release soon?  Any blockers?  Any
> strong prefs to go for a 2.0.0 or 2.0.0-BETA2 first?
>
>  Cheers,
>
>              Tim
>
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink, Pinot, Solr, Elasticsearch
>
>
>
>
