https://bz.apache.org/bugzilla/show_bug.cgi?id=62092
Bug ID: 62092
Summary: Text not extracted from grouped text shapes in HSLF
Product: POI
Version: 3.17-FINAL
Hardware: PC
Status: NEW
Severity: normal
Priority: P2
Component: HSLF
Assignee: dev@poi.apache.org
Reporter: talli...@mitre.org
Target Milestone: ---

On TIKA-2569, a user reported that we aren't extracting text from grouped text shapes in HSLF; the same content extracts correctly from pptx. I added a workaround at the Tika level for now.

Test file:
https://github.com/apache/tika/blob/master/tika-parsers/src/test/resources/test-documents/testPPT_groups.ppt

Unit test at the Tika level:
https://github.com/apache/tika/blob/master/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java#L300

When the user calls getTextParagraphs() on a slide, that should include the text from grouped text shapes, right? If not, and the current behavior is intended so that the user has to walk through HSLFGroupShapes themselves, we can close this out.
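
For reference, the "walk through the group shapes" approach amounts to a recursive descent over the shape tree. The sketch below is only an illustration of that idea in plain Java: Shape, TextShape, GroupShape and collectText are hypothetical stand-ins so the example runs without POI on the classpath, and it is not the Tika workaround itself. With POI, the analogous calls would presumably be getShapes() on the slide and on each HSLFGroupShape, and getText() on each HSLFTextShape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GroupTextWalk {

    /** Hypothetical stand-in for a shape on a slide (not a POI type). */
    interface Shape {}

    /** Hypothetical stand-in for a shape that carries text. */
    static final class TextShape implements Shape {
        final String text;
        TextShape(String text) { this.text = text; }
    }

    /** Hypothetical stand-in for a group; groups may contain other groups. */
    static final class GroupShape implements Shape {
        final List<Shape> children;
        GroupShape(Shape... children) { this.children = Arrays.asList(children); }
    }

    /** Appends the text of every text shape to out, descending into groups. */
    static void collectText(List<? extends Shape> shapes, List<String> out) {
        for (Shape shape : shapes) {
            if (shape instanceof GroupShape) {
                // Recurse so that text nested at any depth is picked up.
                collectText(((GroupShape) shape).children, out);
            } else if (shape instanceof TextShape) {
                out.add(((TextShape) shape).text);
            }
        }
    }

    /** Builds a small made-up slide: one top-level text box plus nested groups. */
    static List<String> demo() {
        List<Shape> slide = Arrays.asList(
                new TextShape("Title"),
                new GroupShape(
                        new TextShape("A"),
                        new GroupShape(new TextShape("B"))));
        List<String> out = new ArrayList<>();
        collectText(slide, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Title, A, B]
    }
}
```

A walk that only looks at the top-level shapes would return just "Title" for this made-up slide, which is the kind of gap described in the report.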