[ 
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922890#comment-13922890
 ] 

Luis Filipe Nassif commented on TIKA-623:
-----------------------------------------

Good job. One possible improvement would be to generate an HTML document for 
each email, containing its metadata and content, and to call the 
embeddedExtractor to process the generated HTML instead of printing all 
emails directly to xhtmlContentHandler. That way, in addition to attachments, 
individual emails could also be extracted from PST files, if that is the goal 
of the application. What do you think?
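
To make the suggestion concrete, here is a rough sketch of the idea, not the 
attached parser's code. It uses only the JDK: EmbeddedSink is a hypothetical 
stand-in for whatever embedded extractor the parser would call, and the class 
name, method names and the use of <meta> tags for metadata are all 
assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical stand-in for the embedded extractor; NOT the real Tika API. */
interface EmbeddedSink {
    void handleEmbedded(String name, InputStream htmlStream);
}

/** Illustrative helper: one standalone HTML document per email. */
class PstEmailHtmlSketch {

    /** Escapes the characters that are significant in HTML text and attribute values. */
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the HTML: metadata as <meta> tags (an assumption), body as preformatted text. */
    static String toHtml(Map<String, String> metadata, String body) {
        StringBuilder html = new StringBuilder("<html><head>");
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            html.append("<meta name=\"").append(escape(e.getKey()))
                .append("\" content=\"").append(escape(e.getValue())).append("\"/>");
        }
        html.append("</head><body><pre>").append(escape(body))
            .append("</pre></body></html>");
        return html.toString();
    }

    /** Hands the generated HTML to the sink as a stream instead of writing it inline. */
    static void emitEmail(String name, Map<String, String> metadata,
                          String body, EmbeddedSink sink) {
        byte[] bytes = toHtml(metadata, body).getBytes(StandardCharsets.UTF_8);
        sink.handleEmbedded(name, new ByteArrayInputStream(bytes));
    }
}
```

With something like this, each email reaches the application as its own 
embedded document, the same way an attachment does, so the application can 
choose to extract or skip it.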

> Add support for Outlook PST
> ---------------------------
>
>                 Key: TIKA-623
>                 URL: https://issues.apache.org/jira/browse/TIKA-623
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Tran Nam Quang
>            Assignee: Hong-Thai Nguyen
>             Fix For: 1.6
>
>         Attachments: OutlookPSTParser.java
>
>
> Hello everyone,
> As you might know, Outlook stores its mails and other stuff in a single PST 
> file. There's a relatively new Java library called java-libpst for reading 
> Outlook PST files. It is licensed under the LGPL and available over here: 
> http://code.google.com/p/java-libpst/
> I have tested the library on Outlook 2000 and Outlook 2003, with good 
> results. It would be great if the library could be integrated into Tika.
> Best regards
> Tran Nam Quang



--
This message was sent by Atlassian JIRA
(v6.2#6252)