https://bz.apache.org/bugzilla/show_bug.cgi?id=62092

            Bug ID: 62092
           Summary: Text not extracted from grouped text shapes in HSLF
           Product: POI
           Version: 3.17-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSLF
          Assignee: dev@poi.apache.org
          Reporter: talli...@mitre.org
  Target Milestone: ---

On TIKA-2569, a user reported that we aren't extracting text from grouped
text shapes in HSLF; the same content is extracted correctly from pptx. I
added a workaround at the Tika level for now.

Test file:
https://github.com/apache/tika/blob/master/tika-parsers/src/test/resources/test-documents/testPPT_groups.ppt

Unit test at the Tika level:
https://github.com/apache/tika/blob/master/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java#L300

When the user calls getTextParagraphs() on a slide, the result should include
the text from grouped text shapes, right?

If not, i.e. the current behavior is intended and the user is expected to walk
through the HSLFGroupShapes themselves, we can close this out.
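
To make the question concrete, here is a rough sketch of the kind of walk a
caller would need if grouped shapes are not included automatically. It is
self-contained and deliberately does not use POI: Shape, TextShape and
GroupShape are hypothetical stand-ins (loosely playing the roles of a slide
shape, a text shape and HSLFGroupShape), and collectText is an illustrative
helper, not an existing API or the Tika workaround.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GroupedTextWalk {
    // Hypothetical stand-in for a shape on a slide; NOT a POI class.
    interface Shape {}

    // Hypothetical stand-in for a shape that carries text.
    static final class TextShape implements Shape {
        final String text;
        TextShape(String text) { this.text = text; }
    }

    // Hypothetical stand-in for a group; groups may contain other groups.
    static final class GroupShape implements Shape {
        final List<Shape> children;
        GroupShape(Shape... children) { this.children = Arrays.asList(children); }
    }

    // Collects text from all text shapes, descending into groups,
    // in document order.
    static List<String> collectText(List<? extends Shape> shapes) {
        List<String> out = new ArrayList<>();
        collect(shapes, out);
        return out;
    }

    private static void collect(List<? extends Shape> shapes, List<String> out) {
        for (Shape shape : shapes) {
            if (shape instanceof GroupShape) {
                // Recurse so text nested at any depth is not skipped.
                collect(((GroupShape) shape).children, out);
            } else if (shape instanceof TextShape) {
                out.add(((TextShape) shape).text);
            }
        }
    }

    public static void main(String[] args) {
        List<Shape> slide = Arrays.<Shape>asList(
            new TextShape("Title"),
            new GroupShape(
                new TextShape("Grouped A"),
                new GroupShape(new TextShape("Nested B"))));
        System.out.println(collectText(slide)); // prints [Title, Grouped A, Nested B]
    }
}
```

A loop that only looks at the top-level shapes would return just "Title"
here, which matches the symptom reported on TIKA-2569.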
