[ 
https://issues.apache.org/jira/browse/TIKA-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923367#comment-17923367
 ] 

Tim Allison commented on TIKA-4375:
-----------------------------------

Yes, the BMP thing is weird... {{New BMP version not implemented yet.}} This zip 
file has most of the BMPs that caused problems: 
{{commoncrawl3/JK/JKMFT7XDUF7VRB6WH4D6ECD6DE6MX32T}}. It is trivially 
reproducible.

I'll take a look.

I'm not as concerned about the JSON, because we have a hard time detecting JSON 
without a filename hint. The encoding difference (which I acknowledge is wrong) 
comes from the updated encoding detector. I don't like it, but I'm not sure 
there's much we can do.

> Regression tests for 2.9.3 release
> ----------------------------------
>
>                 Key: TIKA-4375
>                 URL: https://issues.apache.org/jira/browse/TIKA-4375
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: 43R5U3BXJUDJXDZ25OAE33ZU47362WLV.zip, 
> LTWA2JGVJGJ5RVKHTUX6SDS4NTL5UJVQ-p139.pdf, RYT4H6OCPKZPFG3YK5PGLETS6Q3SBUDV, 
> reports-tika-2.9.3-rc1.tgz, tika-2.9.2-v-tika-2.9.3-reports.tgz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
