[ https://issues.apache.org/jira/browse/TIKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066245#comment-18066245 ]
Tilman Hausherr edited comment on TIKA-4563 at 3/17/26 6:01 AM:
----------------------------------------------------------------
About the new exceptions:
|FILE_PATH|FILE_NAME_A|FILE_NAME_B|CONTAINER_LENGTH|MIME_TYPE_A|MIME_TYPE_B|
|commoncrawl3/HL/HLWYVCE7GCMA7ROCTWHB4U7POX3IQOTU|HLWYVCE7GCMA7ROCTWHB4U7POX3IQOTU|HLWYVCE7GCMA7ROCTWHB4U7POX3IQOTU|727229|application/vnd.openxmlformats-officedocument.wordprocessingml.document|application/vnd.openxmlformats-officedocument.wordprocessingml.document|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/thumbnail.jpeg|/thumbnail.jpeg|16059258|audio/mpeg|image/jpeg|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/image3.png|/image3.png|16059258|audio/mpeg|image/png|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/image11.png|/image11.png|16059258|audio/mpeg|image/png|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/image12.png|/image12.png|16059258|audio/mpeg|image/png|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/image13.png|/image13.png|16059258|audio/mpeg|image/png|
|commoncrawl3_refetched/QC/QCBMMQI7XKECYIXBTSCCIWGUKRN5AYY3|/image15.png|/image15.png|16059258|audio/mpeg|image/png|
|bug_trackers/MOZILLA/1187239-1240516/MOZILLA-1192770-0.gz|/MOZILLA-1192770-0|/testcase-expat.xml|465|text/plain; charset=UTF-16LE|application/xml|
The first two and the last one were already in the previous test. The others
are new, but those images really are broken. The last one really is broken too.
The first one has "Could not locate compiled schema resource
org/apache/poi/schemas/ooxml/system/ooxml/ctgvmlgroupshape6aabtype.xsb". I
wonder whether this is a dependency problem or a bug in POI. However, when I
looked in my IDE, that path was present, along with many other xsb files.
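One way to narrow this down might be to check at runtime whether the resource is visible to the class loader actually in use, since the IDE classpath and the test-run classpath can differ. The following is only a minimal sketch: the class name XsbResourceCheck and its locate helper are hypothetical, and it relies only on the standard ClassLoader.getResource call.

```java
import java.net.URL;

public class XsbResourceCheck {
    // Resource path copied from the exception message quoted above.
    static final String XSB =
        "org/apache/poi/schemas/ooxml/system/ooxml/ctgvmlgroupshape6aabtype.xsb";

    /** Returns the URL of the resource as seen by the given loader, or null if it is absent. */
    static URL locate(ClassLoader loader, String resourcePath) {
        return loader.getResource(resourcePath);
    }

    public static void main(String[] args) {
        // Prefer the context class loader; fall back to this class's own loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = XsbResourceCheck.class.getClassLoader();
        }
        URL url = locate(loader, XSB);
        if (url == null) {
            System.out.println("Not found on classpath: " + XSB);
        } else {
            // The URL shows which jar supplies the resource.
            System.out.println("Found: " + url);
        }
    }
}
```

If this is run with the same classpath as the failing test and prints "Not found", that would point towards a dependency problem. If the resource is found, a bug in POI becomes the more likely explanation.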
> Prep for 3.3.0 release
> ----------------------
>
> Key: TIKA-4563
> URL: https://issues.apache.org/jira/browse/TIKA-4563
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Attachments: kio5_perldoc.mo, tika-3.3.0-20260110.tgz,
> tika-3.3.0-reports.tgz, tika-3.3.0.tgz, tika-3.3.0c.tgz
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)