mannixli created TIKA-3604:
--
Summary: Upgrade pdfbox3
Key: TIKA-3604
URL: https://issues.apache.org/jira/browse/TIKA-3604
Project: Tika
Issue Type: Improvement
Components: parser
Affec
mannixli created TIKA-4398:
--
Summary: When extracting a docx file with Tika 3.1.0, the package
parser was detected instead of the OOXML parser
Key: TIKA-4398
URL: https://issues.apache.org/jira/browse/TIKA-4398
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17942070#comment-17942070
]
mannixli commented on TIKA-4398:
No, same params tested on 3.1.0 and 3.0.0
> When extracting
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mannixli updated TIKA-4398:
---
Attachment: image-2025-04-16-20-46-07-228.png
> When extracting a docx file with Tika 3.1.0, the package parse
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946283#comment-17946283
]
mannixli commented on TIKA-4398:
Main code, pom's tika.version=3.1.0, see parsers in log `
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mannixli updated TIKA-4398:
---
Attachment: image-2025-04-22-11-37-15-401.png
> When extracting a docx file with Tika 3.1.0, the package parse
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946321#comment-17946321
]
mannixli commented on TIKA-4398:
Please check the test code.
List> excludeParsers = Arra
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mannixli updated TIKA-4398:
---
Attachment: image-2025-04-22-11-26-09-936.png
> When extracting a docx file with Tika 3.1.0, the package parse
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mannixli updated TIKA-4398:
---
Attachment: image-2025-04-22-11-27-33-655.png
> When extracting a docx file with Tika 3.1.0, the package parse
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946411#comment-17946411
]
mannixli commented on TIKA-4398:
I used your code; the output is:
Extracted? yes
X-TIKA:Parse
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946700#comment-17946700
]
mannixli commented on TIKA-4398:
I get it: I use org.apache.poi 5.2.5, but the new Tika is
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17945208#comment-17945208
]
mannixli commented on TIKA-4398:
[^01.docx]
^still does not work, please check^
> When extrac
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mannixli updated TIKA-4398:
---
Attachment: 01.docx
> When extracting a docx file with Tika 3.1.0, the package parser was detected
> instead
[
https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17945209#comment-17945209
]
mannixli commented on TIKA-4398:
[^01.docx]
> When extracting a docx file with Tika 3.1.0