[ 
https://issues.apache.org/jira/browse/TIKA-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930387#comment-17930387
 ] 

Tim Allison commented on TIKA-4381:
-----------------------------------

OK. The trick was to actually read the specs and to rely on some great work 
already done in Apache POI. Specifically, the work on parsing the nameidchunks 
was critical. That data has to be used to map the local, in-file storage ids 
to the actual ids and long ids in MS-OXPROPS.
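
As a rough illustration of that two-step lookup (this is not the actual Tika or 
POI code), here is a minimal sketch. The class, its methods and the local ids 
in main are hypothetical; only the property set GUIDs, the long ids and the 
canonical names are taken from MS-OXPROPS. In a real file the first map would 
be filled from the parsed nameid data rather than by hand.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolve a file-local named-property id to a canonical name.
class NamedPropertySketch {
    // Property set GUIDs as listed in MS-OXPROPS.
    static final String PSETID_APPOINTMENT = "00062002-0000-0000-C000-000000000046";
    static final String PSETID_COMMON = "00062008-0000-0000-C000-000000000046";

    // Global table: (property set GUID, long id) -> canonical name.
    // Only three entries are shown; the real table in MS-OXPROPS is much larger.
    private static final Map<String, String> CANONICAL_NAMES = new HashMap<>();
    static {
        CANONICAL_NAMES.put(key(PSETID_APPOINTMENT, 0x8205), "PidLidBusyStatus");
        CANONICAL_NAMES.put(key(PSETID_APPOINTMENT, 0x8213), "PidLidAppointmentDuration");
        CANONICAL_NAMES.put(key(PSETID_COMMON, 0x8501), "PidLidReminderDelta");
    }

    // Per-file table: local storage id -> (property set GUID, long id).
    // Assumption: this is what the nameid data in the msg file gives us.
    private final Map<Integer, String> localIdToKey = new HashMap<>();

    static String key(String guid, int lid) {
        return guid.toLowerCase(Locale.ROOT) + ":" + Integer.toHexString(lid);
    }

    // Record one mapping read from the file's nameid data.
    void register(int localId, String guid, int lid) {
        localIdToKey.put(localId, key(guid, lid));
    }

    // Local id -> "mapi:raw:<canonical name>", or empty if either lookup fails.
    Optional<String> metadataKey(int localId) {
        String k = localIdToKey.get(localId);
        String name = (k == null) ? null : CANONICAL_NAMES.get(k);
        return Optional.ofNullable(name).map(n -> "mapi:raw:" + n);
    }

    public static void main(String[] args) {
        NamedPropertySketch props = new NamedPropertySketch();
        // The local ids below are made up; they differ from file to file.
        props.register(0x8004, PSETID_APPOINTMENT, 0x8205);
        props.register(0x8011, PSETID_COMMON, 0x8501);
        System.out.println(props.metadataKey(0x8004).orElse("unmapped"));
        System.out.println(props.metadataKey(0x8011).orElse("unmapped"));
        System.out.println(props.metadataKey(0x8099).orElse("unmapped"));
    }
}
```

The point of keeping two tables is that the local ids are only meaningful 
inside one file, while the GUID plus long id pair is stable across files.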

This is what we're now pulling from an appointment msg file in our unit tests:
{noformat}
mapi:raw:PidLidAgingDontAgeMe : false
mapi:raw:PidLidAppointmentAuxiliaryFlags : 0
mapi:raw:PidLidAppointmentColor : 0
mapi:raw:PidLidAppointmentCounterProposal : false
mapi:raw:PidLidAppointmentDuration : 30
mapi:raw:PidLidAppointmentEndWhole : 2017-02-28T19:00:00Z
mapi:raw:PidLidAppointmentNotAllowPropose : false
mapi:raw:PidLidAppointmentProposalNumber : 0
mapi:raw:PidLidAppointmentProposedDuration : 0
mapi:raw:PidLidAppointmentSequence : 0
mapi:raw:PidLidAppointmentStartWhole : 2017-02-28T18:30:00Z
mapi:raw:PidLidAppointmentStateFlags : 0
mapi:raw:PidLidAppointmentSubType : false
mapi:raw:PidLidAutoFillLocation : false
mapi:raw:PidLidBusyStatus : 2
mapi:raw:PidLidClipEnd : 2017-02-28T19:00:00Z
mapi:raw:PidLidClipStart : 2017-02-28T18:30:00Z
mapi:raw:PidLidCommonEnd : 2017-02-28T19:00:00Z
mapi:raw:PidLidCommonStart : 2017-02-28T18:30:00Z
mapi:raw:PidLidConferencingType : 0
mapi:raw:PidLidCurrentVersion : 166965
mapi:raw:PidLidFInvited : false
mapi:raw:PidLidIntendedBusyStatus : -1
mapi:raw:PidLidPrivate : false
mapi:raw:PidLidRecurrenceType : 0
mapi:raw:PidLidRecurring : false
mapi:raw:PidLidReminderDelta : 15
mapi:raw:PidLidReminderSet : false
mapi:raw:PidLidReminderSignalTime : 4501-01-01T00:00:00Z
mapi:raw:PidLidReminderTime : 2017-02-28T18:30:00Z
mapi:raw:PidLidResponseStatus : 0
mapi:raw:PidLidSideEffects : 369
mapi:raw:PidLidTaskMode : 0
mapi:raw:PidLidValidFlagStringProof : 2017-02-28T18:42:23Z
{noformat}

> Improve extraction of metadata from Appointment/Task msgs
> ---------------------------------------------------------
>
>                 Key: TIKA-4381
>                 URL: https://issues.apache.org/jira/browse/TIKA-4381
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: Parser.java
>
>
> Our metadata extraction on msgs is mostly focused on "NOTE"/regular emails. 
> We could improve extraction from appointments, tasks and other msg 
> types.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
