[
https://issues.apache.org/jira/browse/PDFBOX-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18049941#comment-18049941
]
Tilman Hausherr edited comment on PDFBOX-6140 at 1/5/26 9:13 PM:
-----------------------------------------------------------------
I looked at the list of fixed issues; none are related to parsing or text
extraction. Xmpbox is not used for PDF extraction in Tika.
I did run regression tests with xmp on our test corpus, but with my own
software, and I didn't keep track of individual files. Instead, I sorted the
error messages and looked for unexplainable ones, which is why there were so
many issues.
> Run regression tests for 3.0.7
> ------------------------------
>
> Key: PDFBOX-6140
> URL: https://issues.apache.org/jira/browse/PDFBOX-6140
> Project: PDFBox
> Issue Type: Task
> Affects Versions: 3.0.7 PDFBox
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Priority: Major
> Fix For: 3.0.7 PDFBox
>
> Attachments: reports_pdfbox_3.0.6_vs_3.0.7_1.tar.xz
>
>