[
https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694154#comment-17694154
]
ASF GitHub Bot commented on TIKA-3979:
--------------------------------------
apismensky commented on PR #985:
URL: https://github.com/apache/tika/pull/985#issuecomment-1446966128
Confirming with my file:
Before fix: 26844 ms
After fix: 692 ms
Yay yay!
@nddipiazza thanks for fixing!
> OneNoteParser - Improve performance for deserialization
> -------------------------------------------------------
>
> Key: TIKA-3979
> URL: https://issues.apache.org/jira/browse/TIKA-3979
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 2.7.0
> Reporter: David Xie
> Priority: Major
> Attachments: image-2023-02-20-14-42-10-590.png,
> image-2023-02-25-12-01-40-311.png
>
>
> We noticed some performance issues specific to parsing OneNote files. Our CPU
> profiler reports that the parser spends a lot of time deserializing byte
> arrays (image included below).
> !image-2023-02-20-14-42-10-590.png!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)