[ https://issues.apache.org/jira/browse/TIKA-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850930#comment-16850930 ]
Luis Filipe Nassif edited comment on TIKA-2883 at 5/29/19 3:12 PM:
-------------------------------------------------------------------

Thanks for taking a look, [~talli...@apache.org]!

Found this: [https://github.com/joniles/rtfparserkit]

But it fails on some RTFs that contain } chars not opened before. I've patched it to ignore those chars, and it worked with all the Outlook RTFs I have. I don't know if that lib supports embedded files...


was (Author: lfcnassif):
    Thanks for taking a look, [~talli...@apache.org]!

Found this [https://github.com/joniles/rtfparserkit]

But it fails with some RTFs with } chars not opened before...


> Text not extracted from RTF files
> ---------------------------------
>
>                 Key: TIKA-2883
>                 URL: https://issues.apache.org/jira/browse/TIKA-2883
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.20, 1.19.1, 1.21
>            Reporter: Luis Filipe Nassif
>            Assignee: Tim Allison
>            Priority: Major
>         Attachments: Message (5).rtf
>
>
> I have a number of RTF files (extracted from PST email bodies) whose text is
> currently not extracted. Sample file attached. [~talli...@apache.org], do you
> have any idea what is going on?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
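
For illustration only, the workaround described in the comment (ignoring } chars that were never opened) could look roughly like the sketch below. This is not the actual patch applied to rtfparserkit, and it is not Tika code: the class name RtfBraceFilter and the method dropUnmatchedClosingBraces are hypothetical. It assumes the RTF is already available as a String, and it does not account for raw binary payloads (\bin), which may legitimately contain brace bytes.

```java
public final class RtfBraceFilter {

    private RtfBraceFilter() {
    }

    /**
     * Returns a copy of the RTF text in which every '}' that has no
     * matching '{' is dropped. Hypothetical helper, not a library API.
     */
    public static String dropUnmatchedClosingBraces(String rtf) {
        StringBuilder out = new StringBuilder(rtf.length());
        int depth = 0; // number of groups currently open
        for (int i = 0; i < rtf.length(); i++) {
            char c = rtf.charAt(i);
            if (c == '\\' && i + 1 < rtf.length()) {
                // A backslash starts a control word or control symbol
                // (including the escapes \{, \} and \\): copy the backslash
                // and the next character untouched so escaped braces are
                // never counted as group delimiters.
                out.append(c).append(rtf.charAt(++i));
            } else if (c == '{') {
                depth++;
                out.append(c);
            } else if (c == '}') {
                if (depth > 0) {
                    depth--;
                    out.append(c);
                }
                // depth == 0: stray closing brace, skip it
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two stray '}' characters follow the only real group.
        String broken = "{\\rtf1 Hello}} world}";
        System.out.println(dropUnmatchedClosingBraces(broken));
    }
}
```

Filtering the input before parsing keeps the parser itself unchanged; the patch mentioned above was made inside the library instead, so its details may differ.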