[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922890#comment-13922890 ]
Luis Filipe Nassif commented on TIKA-623:
-----------------------------------------

Good job. I think a possible improvement would be to generate an HTML document for each email, containing its metadata and content, and call the embeddedExtractor to process the generated HTML, instead of printing all emails directly to xhtmlContentHandler. That way, in addition to attachments, the emails themselves could also be extracted from PST files, if that is the goal of the application. What do you think?

> Add support for Outlook PST
> ---------------------------
>
>                 Key: TIKA-623
>                 URL: https://issues.apache.org/jira/browse/TIKA-623
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Tran Nam Quang
>            Assignee: Hong-Thai Nguyen
>             Fix For: 1.6
>
>         Attachments: OutlookPSTParser.java
>
>
> Hello everyone,
> As you might know, Outlook stores its mails and other stuff in a single PST
> file. There's a relatively new Java library called java-libpst for reading
> Outlook PST files. It is licensed under the LGPL and available over here:
> http://code.google.com/p/java-libpst/
> I have tested the library on Outlook 2000 and Outlook 2003, with good
> results. It would be great if the library could be integrated into Tika.
> Best regards
> Tran Nam Quang

--
This message was sent by Atlassian JIRA
(v6.2#6252)