[ 
https://issues.apache.org/jira/browse/TIKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888321#comment-17888321
 ] 

Tilman Hausherr commented on TIKA-4280:
---------------------------------------

One weird thing: commoncrawl3/2P/2PSMEFJEYU7EPAZXQQDD6OL2WOQLBJRY is a 
compressed file. In "A" it appears as "application/json; charset=ISO-8859-1", 
in "B" as "text/csv; charset=ISO-8859-1; delimiter=colon". The file itself 
starts with "PK", so shouldn't this be easy to detect?
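
To illustrate the "PK" point, here is a minimal sketch of a magic-byte check in plain Java. It is not Tika's detector code; the class name ZipMagicCheck and its methods are made up for this example. It tests for the common ZIP local file header signature (the bytes 'P', 'K', 0x03, 0x04), which is stricter than "starts with PK", so whether the file above passes depends on its third and fourth bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Tika.
public class ZipMagicCheck {

    // True if the buffer starts with the ZIP local file header
    // signature: 'P', 'K', 0x03, 0x04.
    static boolean looksLikeZip(byte[] head) {
        return head.length >= 4
                && head[0] == 'P' && head[1] == 'K'
                && head[2] == 3 && head[3] == 4;
    }

    // Reads at most the first four bytes of the file and checks them.
    static boolean looksLikeZip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return looksLikeZip(in.readNBytes(4));
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a path to a local file.
        for (String arg : args) {
            System.out.println(arg + ": " + looksLikeZip(Path.of(arg)));
        }
    }
}
```

A check like this only looks at the leading bytes, so it would not by itself explain why "A" and "B" fall through to text-based detection; it only shows that the signature is cheap to test for.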

> Tasks for the 3.0.0 release
> ---------------------------
>
>                 Key: TIKA-4280
>                 URL: https://issues.apache.org/jira/browse/TIKA-4280
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> I'm too lazy to open separate tickets. Please do so if desired.
> Some items:
> * Before releasing the real 3.0.0 we need to remove any "-M" dependencies
> * Decide about the ffmpeg issue and the hdf5 issue
> * Run the regression tests vs 2.9.x
> * Convert tika-grpc to use the dependency plugin instead of the shade plugin
> * Turn javadocs back on. I got errors during the deploy process because 
> javadoc needed the auto-generated code ("cannot find symbol 
> DeleteFetcherRequest"). We need to enable javadocs for the rest of the 
> project.
> * TIKA-4290 Tilman question
> Other things? Thank you [~tilman] for the first two!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)