[ https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946411#comment-17946411 ]

mannixli commented on TIKA-4398:
--------------------------------

I used your code, output is:

Extracted? yes
X-TIKA:Parsed-By: [org.apache.tika.parser.DefaultParser, org.apache.tika.parser.pkg.PackageParser]
X-TIKA:Parsed-By-Full-Set: [org.apache.tika.parser.DefaultParser, org.apache.tika.parser.pkg.PackageParser, org.apache.tika.parser.xml.DcXMLParser, org.apache.tika.parser.image.ImageParser]
X-TIKA:detectedEncoding: ISO-8859-1
X-TIKA:encodingDetector: UniversalEncodingDetector
Content-Type: application/zip

:(

> When extracting a docx file with Tika 3.1.0, the package parser was detected
> instead of the OOXML parser
> --------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-4398
>                 URL: https://issues.apache.org/jira/browse/TIKA-4398
>             Project: Tika
>          Issue Type: Bug
>          Components: tika-core
>   Affects Versions: 3.1.0
>         Environment: java17
>            Reporter: mannixli
>            Priority: Major
>         Attachments: 01.docx, image-2025-04-16-20-46-07-228.png,
> image-2025-04-22-11-26-09-936.png, image-2025-04-22-11-27-33-655.png,
> image-2025-04-22-11-37-15-401.png
>
>
> 3.0.0 detected ooxml parser

--
This message was sent by Atlassian Jira
(v8.20.10#820010)