[ 
https://issues.apache.org/jira/browse/TIKA-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290481#comment-17290481
 ] 

Tim Allison commented on TIKA-3290:
-----------------------------------

We're hoping to release 1.26 fairly soon.  My feeling is that 1.26 doesn't have 
many changes over 1.25, but you can check with {{git diff}} for the specific 
code changes.  I share your desire not to "break any other scenarios."  We 
can't make guarantees...but we do what we can.

We run Tika against at least 1 million files, sometimes 4 million, and look for 
diffs/increased exceptions etc. before we make releases.  That does _not_ mean 
that we won't miss something like the above.

I'd recommend building an internal regression corpus from your own documents, 
extracting it with both 1.25 and 1.26, and running tika-eval to compare the 
two sets of output to see if there are any problems on your documents.
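
As a rough first pass before a full tika-eval run, a plain directory comparison 
can show which files changed at all between the two versions. The sketch below 
is not tika-eval and does not call Tika; it assumes the extracted output from 
each version has already been written to its own directory with matching 
relative paths. The function name compare_outputs and the directory names in 
the usage note are hypothetical.

```python
import filecmp
from pathlib import Path


def _relative_files(root):
    """Return the set of file paths under root, relative to root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_outputs(old_dir, new_dir):
    """Compare two trees of extracted output, matching files by relative path.

    Returns a dict of sorted relative paths:
      'changed'  - present in both trees with different bytes
      'only_old' - present only under old_dir
      'only_new' - present only under new_dir
    """
    old_root, new_root = Path(old_dir), Path(new_dir)
    old_files = _relative_files(old_root)
    new_files = _relative_files(new_root)

    # shallow=False forces a byte-by-byte comparison instead of trusting
    # file size and modification time.
    changed = sorted(
        rel
        for rel in old_files & new_files
        if not filecmp.cmp(old_root / rel, new_root / rel, shallow=False)
    )
    return {
        "changed": changed,
        "only_old": sorted(old_files - new_files),
        "only_new": sorted(new_files - old_files),
    }
```

For example, compare_outputs("out-1.25", "out-1.26") would list the documents 
worth inspecting more closely. A byte-level difference only means the output 
changed, not that it got worse, so the flagged files still need review.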

I'm probably going to kick off our regression testing tomorrow or Friday.  If 
you need any other fixes in 1.26, please let us know.

> Extension reading it as eml instead of txt
> ------------------------------------------
>
>                 Key: TIKA-3290
>                 URL: https://issues.apache.org/jira/browse/TIKA-3290
>             Project: Tika
>          Issue Type: Bug
>          Components: core, mime
>    Affects Versions: 1.25
>            Reporter: Vamsi Molli
>            Priority: Major
>              Labels: tika-parsers
>             Fix For: 1.24.1
>
>         Attachments: image-2021-02-22-10-13-08-447.png, 
> image-2021-02-23-12-39-00-778.png, test_sample_message.txt
>
>
> The attached file is being detected as eml instead of txt. With version 
> 1.24.1 it was detected as txt, but after the upgrade to 1.25 it is 
> detected as eml. As a result, parsing fails with a mail corrupted error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
