[ 
https://issues.apache.org/jira/browse/TIKA-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930392#comment-17930392
 ] 

Tim Allison commented on TIKA-4381:
-----------------------------------

I've attached to this issue the hacky parser I wrote to extract info from 
MS-OXPROPS and generate the table of props that the ExtendedMetadataExtractor 
is now using. 

I first copied and pasted the text from the PDF spec into a text file. I then 
had to fix at least one typo by hand and add some spaces to work around rare 
page-break issues.
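
For anyone who wants the general idea without opening the attachment, here is a 
rough sketch of that kind of parsing. It is not the attached Parser.java. The 
class name OxPropsSketch, the list of field labels, and the rule that a line 
like "2.1 SomeName" is a section heading are all assumptions about how the 
pasted spec text is laid out, so adjust them to whatever the text file 
actually contains.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not the Parser.java attached to TIKA-4381.
final class OxPropsSketch {

    // Assumption: each property entry opens with this field label.
    private static final String ENTRY_START = "Canonical name";

    // Assumption: these are the field labels that appear as "Label: value".
    private static final List<String> LABELS = List.of(
            "Canonical name", "Description", "Property set",
            "Property long ID (LID)", "Property ID", "Data type",
            "Area", "Defining reference", "Alternate names");

    // Turns the pasted text into one label-to-value map per property entry.
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> entries = new ArrayList<>();
        Map<String, String> current = null;
        String lastLabel = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Assumption: "2.1 SomeName" is a section heading that ends the entry.
            if (line.matches("\\d+(\\.\\d+)+\\s+\\S+")) {
                current = null;
                lastLabel = null;
                continue;
            }
            String label = matchLabel(line);
            if (label != null) {
                if (label.equals(ENTRY_START)) {
                    current = new LinkedHashMap<>();
                    entries.add(current);
                }
                if (current == null) {
                    continue; // labelled field outside any entry; skip it
                }
                current.put(label, line.substring(label.length() + 1).trim());
                lastLabel = label;
            } else if (current != null && lastLabel != null) {
                // No label: treat the line as a wrapped continuation of the last field.
                current.merge(lastLabel, line, (a, b) -> (a + " " + b).trim());
            }
        }
        return entries;
    }

    // Returns the label that the line starts with, or null if there is none.
    private static String matchLabel(String line) {
        for (String label : LABELS) {
            if (line.startsWith(label + ":")) {
                return label;
            }
        }
        return null;
    }
}
```

Calling OxPropsSketch.parse(Files.readAllLines(path)) on the cleaned-up text 
file would give one map per property. The manual fixes mentioned above still 
matter here: a label that lost its space or colon at a page break would be 
folded into the previous field as a continuation line.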

> Improve extraction of metadata from Appointment/Task msgs
> ---------------------------------------------------------
>
>                 Key: TIKA-4381
>                 URL: https://issues.apache.org/jira/browse/TIKA-4381
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: Parser.java
>
>
> Our metadata extraction on msgs is mostly focused on "NOTE"/regular emails. 
> We could improve extraction from appointments, tasks and other msg types.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
