It turns out that the Colorburn / Colordodge change was done 5 years ago:

https://svn.apache.org/viewvc?view=revision&revision=1823593

The other two definitely have not been done. I also found the pdf20-utf8-test.pdf file from January 2022 in my directory. So many things to do...

Tilman

On 09.11.2023 19:17, Tilman Hausherr wrote:
Hi Peter,

That's a lot... I'll create issues for some of these topics. The negative dash phase issue was fixed in the latest release:
https://issues.apache.org/jira/browse/PDFBOX-5636
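
For readers who have not looked at the dash phase topic: as far as I understand the PDF 2.0 wording, a negative dash phase is incremented by twice the sum of the dash array lengths until it is no longer negative. The following is a minimal sketch of that rule only. The class and method names are hypothetical and this is not the code from the PDFBox fix.

```java
/** Hypothetical helper for illustration; not taken from PDFBox. */
final class DashPhase {
    private DashPhase() {
    }

    /**
     * Returns an equivalent non-negative dash phase.
     * Assumption: a negative phase is raised in steps of twice the sum
     * of the dash array, which is my reading of the PDF 2.0 rule.
     */
    static float normalize(float[] dashArray, float phase) {
        if (phase >= 0) {
            return phase; // nothing to do
        }
        float sum = 0;
        for (float d : dashArray) {
            sum += d;
        }
        float period = 2 * sum;
        if (period <= 0) {
            return 0; // degenerate dash array; assumption: fall back to 0
        }
        // Smallest number of whole periods that makes the phase non-negative.
        double steps = Math.ceil(-phase / period);
        return phase + (float) (steps * period);
    }
}
```

Twice the sum is used rather than the sum because a dash array with an odd number of entries only repeats its on/off pattern after two passes. For example, `DashPhase.normalize(new float[] {3, 2}, -1)` gives a phase of 9.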

Things that might be possible:
- UTF-8
- Encryption
- Colorburn / Colordodge

The fill & stroke problem is a problem with Java itself, because it doesn't offer a combined fill & stroke.

OutputIntents - we don't support these at all, so we'd have to start supporting them.

Tilman

On 09.11.2023 02:28, Peter Wyatt wrote:
I would think supporting the following PDF 2.0 features is highly relevant, given that other implementations are already generating PDF 2.0 files today (see https://pdfa.org/supporting-pdf20/):

* UTF-8 support for user-visible strings, such as bookmarks (outlines), OCG layer names, certain annot fields, etc. Missing support results in the ugly display or extraction of "mojibake" for the UTF-8 BoMs. See https://pdfa.org/understanding-utf-8-in-pdf-2-0. Note that this does NOT impact content streams or text extraction (unless you also combine with Logical Structure)!

* the latest encryption (AES-GCM 256 bit), dig-sig, hash algorithms, and Unicode password support which is FAR more up-to-date and secure than all legacy crypto. 3rd parties are already generating such files which will otherwise be unreadable/unprocessable by PDFBox. I don't know what crypto library PDFBox uses, but all the new algorithms are standard modern crypto algorithms. Unicode passwords also follow the latest ICU (not knowing what you use internally) but be careful so legacy files continue to work.

* if high-quality semantic content extraction is important, then updating for PDF 2.0 standard structure types, although this is likely to be a larger dev effort.
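
To illustrate the UTF-8 point above, here is a minimal sketch of how a reader could pick the decoding of a PDF text string from its leading bytes. The class name is hypothetical and this is not PDFBox code. ISO-8859-1 is used only as a stand-in for PDFDocEncoding, which is an assumption made to keep the example short; a real implementation needs the proper PDFDocEncoding table.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not taken from PDFBox. */
final class TextStringDecoder {
    private TextStringDecoder() {
    }

    static String decode(byte[] b) {
        // UTF-8 byte order mark EF BB BF: PDF 2.0 UTF-8 text string.
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return new String(b, 3, b.length - 3, StandardCharsets.UTF_8);
        }
        // UTF-16BE byte order mark FE FF: the long-standing Unicode form.
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return new String(b, 2, b.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption: ISO-8859-1 as a rough stand-in for PDFDocEncoding.
        return new String(b, StandardCharsets.ISO_8859_1);
    }
}
```

Without the first branch, the three BOM bytes fall through to the single-byte decoding and show up as "ï»¿" in front of a garbled string, which is the mojibake described above.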


If you want to address rendering issues, then:

* PLEASE PLEASE PLEASE fix your incorrect rendering of "fill and stroke" in the presence of transparency! See https://github.com/pdf-association/pdf-differences/blob/main/Atomic-Fill%2BStroke/README.md

* making sure you are using the correct blend mode formulae for ColorBurn and ColorDodge (from 2009 - I have not checked your implementation): https://github.com/pdf-association/pdf-differences/tree/main/ColorBurn-ColorDodge

* ensuring negative dash phase is correct (previously unstated as what to do) - see https://github.com/pdf-association/pdf-differences/tree/main/Negative-DashPhase

* page-based OutputIntents. Already in use, especially in print-centric workflows where page merging and imposition across PDFs is now far easier to do. Implementation update to select a page-based OutputIntent ahead of the document-level OutputIntent should be relatively easy.

* the use of transparency and blend mode of an annotation's appearance stream when rendered onto a page (not _within_ the annot).
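
For the ColorBurn / ColorDodge item, the per-channel formulae can be sketched as below. This reflects my understanding of the PDF 2.0 text, with cb as the backdrop and cs as the source component, both in the range 0 to 1. The class name is hypothetical and the code is not taken from any implementation, so please check it against the specification and the test files linked above.

```java
/** Hypothetical sketch of two separable blend functions; not PDFBox code. */
final class SeparableBlend {
    private SeparableBlend() {
    }

    /** ColorDodge: brightens the backdrop to reflect the source. */
    static float colorDodge(float cb, float cs) {
        if (cb == 0f) {
            return 0f; // a black backdrop stays black, even when cs is 1
        }
        if (cb >= 1f - cs) {
            return 1f; // also avoids dividing by zero when cs is 1
        }
        return cb / (1f - cs);
    }

    /** ColorBurn: darkens the backdrop to reflect the source. */
    static float colorBurn(float cb, float cs) {
        if (cb == 1f) {
            return 1f; // a white backdrop stays white, even when cs is 0
        }
        if (1f - cb >= cs) {
            return 0f; // also avoids dividing by zero when cs is 0
        }
        return 1f - (1f - cb) / cs;
    }
}
```

The corner cases are the ones worth testing first: `colorDodge(0, 1)` should give 0 and `colorBurn(1, 0)` should give 1.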


Obviously there are other PDF 2.0 features but these would be my go-to short list for starting to address the most obvious visible differences. See also https://pdfa.org/how-to-get-started-with-pdf-2-0/ since reporting a simple PDF version is unlikely to withstand the test of time...


Of course I am also biased 😊 - and I'm not a Java expert!



