[ https://issues.apache.org/jira/browse/TIKA-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946283#comment-17946283 ]
mannixli commented on TIKA-4398:
--------------------------------

main code, poms tika.version=3.1.0, see parsers in log `meta X-TIKA:Parsed-xx`

!image-2025-04-22-11-37-15-401.png!

!image-2025-04-22-11-26-09-936.png!

!image-2025-04-22-11-27-33-655.png!

2025-04-22 11:33:01.359 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name [Content_Types].xml, type null
2025-04-22 11:33:01.360 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name _rels/.rels, type null
2025-04-22 11:33:01.360 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name word/document.xml, type null
2025-04-22 11:33:01.361 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name word/_rels/document.xml.rels, type null
2025-04-22 11:33:01.361 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name word/styles.xml, type null
2025-04-22 11:33:01.362 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name word/settings.xml, type null
2025-04-22 11:33:01.362 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name word/numbering.xml, type null
2025-04-22 11:33:01.363 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name docProps/core.xml, type null
2025-04-22 11:33:01.363 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.aspect.LimitImageExtractor - embedded name docProps/app.xml, type null
2025-04-22 11:33:01.363 [parser-1] INFO [trace_id=100865326431632558047] c.b.n.a.s.impl.ParserServiceImpl - text_parse_over_success , meta:
X-TIKA:Parsed-By=org.apache.tika.parser.DefaultParser
X-TIKA:Parsed-By=org.apache.tika.parser.pkg.PackageParser
X-TIKA:Parsed-By-Full-Set=org.apache.tika.parser.DefaultParser
X-TIKA:Parsed-By-Full-Set=org.apache.tika.parser.pkg.PackageParser
resourceName=aaa.docx
X-TIKA:detectedEncoding=ISO-8859-1
X-TIKA:encodingDetector=UniversalEncodingDetector
Content-Type=application/vnd.openxmlformats-officedocument.wordprocessingml.document

> When extracting a docx file with Tika 3.1.0, the package parser was detected
> instead of the OOXML parser
> --------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-4398
>                 URL: https://issues.apache.org/jira/browse/TIKA-4398
>             Project: Tika
>          Issue Type: Bug
>          Components: tika-core
>    Affects Versions: 3.1.0
>         Environment: java17
>            Reporter: mannixli
>            Priority: Major
>         Attachments: 01.docx, image-2025-04-16-20-46-07-228.png,
> image-2025-04-22-11-26-09-936.png, image-2025-04-22-11-27-33-655.png,
> image-2025-04-22-11-37-15-401.png
>
>
> 3.0.0 detected ooxml parser

--
This message was sent by Atlassian Jira
(v8.20.10#820010)