[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844976#comment-17844976 ]
Tim Allison commented on TIKA-4250:
-----------------------------------

libpff issue opened: https://github.com/libyal/libpff/issues/128

Note that I found non-deterministic behavior even without debug on -- sometimes I got 7 extracted files, sometimes 8. I noted that in the issue.

> Add a libpst-based parser
> -------------------------
>
>                 Key: TIKA-4250
>                 URL: https://issues.apache.org/jira/browse/TIKA-4250
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: 8.eml, 8.msg
>
>
> We currently use the com.pff Java-based PST parser for PST files. It would be
> useful to add a wrapper for libpst as an optional parser.
> One of the benefits of libpst is that it creates .eml or .msg files from the
> PST records. This is critical for those who want the original bytes from
> embedded files. Obv, PST doesn't store eml or msg, but some users want the
> "original" emails even if they are constructed from PST records.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)