[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843604#comment-17843604 ]
Luís Filipe Nassif commented on TIKA-4250:
------------------------------------------

Updating results with LibPff-20231205:

|For 258 pst/ost files (93GB)| | | |
| |LibPst-0.6.76|LibPff-20231205|Java-libpst-0.9.5*|
|Emails|195698|201792|208373|
|Contacts|19738|19949|24342|
|Attachments|242394|286669|275481|
|Feeds|0|47916|47913|
|Appointments|0|12664|15885|
|Meetings|0|5285|0|
|Activity|0|3457|3457|
|Documents|0|2202|0|
|Tasks|0|578|562|
|Notes|0|391|0|
|Vcalendar|8642|0|0|
|Vjournal|2352|0|0|
|Total|468824|580903|576013|

> Add a libpst-based parser
> -------------------------
>
>                 Key: TIKA-4250
>                 URL: https://issues.apache.org/jira/browse/TIKA-4250
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> We currently use the com.pff Java-based PST parser for PST files. It would be
> useful to add a wrapper for libpst as an optional parser.
> One of the benefits of libpst is that it creates .eml or .msg files from the
> PST records. This is critical for those who want the original bytes from
> embedded files. Obviously, PST doesn't store eml or msg, but some users want the
> "original" emails even if they are constructed from PST records.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)