It turns out that the Colorburn / Colordodge change was done 5 years ago:
https://svn.apache.org/viewvc?view=revision&revision=1823593
The other two definitely were not. I also found the pdf20-utf8-test.pdf
file from January 2022 in my directory. So many things to do...
Tilman
On 09.11.2023 19:17, Tilman Hausherr wrote:
Hi Peter,
That's a lot... I'll create issues for some of these topics. The
negative dash phase thing has been fixed in the latest release:
https://issues.apache.org/jira/browse/PDFBOX-5636
Things that might be possible:
- UTF-8
- Encryption
- Colorburn / Colordodge
The fill & stroke problem is a problem with Java itself, because it
doesn't offer a combined fill & stroke.
OutputIntents - we don't support these at all, so we'd have to start
supporting them.
Tilman
On 09.11.2023 02:28, Peter Wyatt wrote:
I would think supporting the following PDF 2.0 features is highly
relevant, given that other implementations are already generating PDF
2.0 files today (see https://pdfa.org/supporting-pdf20/):
* UTF-8 support for user-visible strings, such as bookmarks
(outlines), OCG layer names, certain annot fields, etc. Missing
support results in the ugly display or extraction of "mojibake" for
the UTF-8 BoMs. See https://pdfa.org/understanding-utf-8-in-pdf-2-0.
Note that this does NOT impact content streams or text extraction
(unless you also combine with Logical Structure)!
* the latest encryption (AES-GCM 256 bit), dig-sig, hash algorithms,
and Unicode password support which is FAR more up-to-date and secure
than all legacy crypto. 3rd parties are already generating such files
which will otherwise be unreadable/unprocessable by PDFBox. I don't
know what crypto library PDFBox uses, but all the new algorithms are
standard modern crypto algorithms. Unicode passwords also follow the
latest ICU (not knowing what you use internally) but be careful so
legacy files continue to work.
* if high-quality semantic content extraction is important, then
updating for PDF 2.0 standard structure types, although this is
likely to be a larger dev effort.
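To make the UTF-8 point above concrete, here is a minimal sketch of how a
text string could be decoded by looking at its byte order mark. The class
and method names are hypothetical and this is not PDFBox's actual code.
It also assumes ISO-8859-1 as a rough stand-in for PDFDocEncoding, which
is only an approximation because the two differ for some byte values.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class TextStringSketch {

    // Decodes the raw bytes of a PDF text string by inspecting its BOM.
    static String decodeTextString(byte[] raw) {
        // EF BB BF marks a UTF-8 text string (new in PDF 2.0).
        if (raw.length >= 3
                && (raw[0] & 0xFF) == 0xEF
                && (raw[1] & 0xFF) == 0xBB
                && (raw[2] & 0xFF) == 0xBF) {
            return new String(raw, 3, raw.length - 3, StandardCharsets.UTF_8);
        }
        // FE FF marks a UTF-16BE text string.
        if (raw.length >= 2
                && (raw[0] & 0xFF) == 0xFE
                && (raw[1] & 0xFF) == 0xFF) {
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        // ASSUMPTION: ISO-8859-1 stands in for PDFDocEncoding here; a real
        // implementation needs the proper PDFDocEncoding table.
        return new String(raw, StandardCharsets.ISO_8859_1);
    }
}
```

Without the first branch, the three BOM bytes fall through to the legacy
decoder and show up as mojibake at the start of the string.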
If you want to address rendering issues, then:
* PLEASE PLEASE PLEASE fix your incorrect rendering of "fill and
stroke" in the presence of transparency! See
https://github.com/pdf-association/pdf-differences/blob/main/Atomic-Fill%2BStroke/README.md
* making sure you are using the correct blend mode formulae for ColorBurn
and ColorDodge (from 2009 - I have not checked your implementation):
https://github.com/pdf-association/pdf-differences/tree/main/ColorBurn-ColorDodge
* ensuring negative dash phase is correct (previously unstated as
what to do) - see
https://github.com/pdf-association/pdf-differences/tree/main/Negative-DashPhase
* page-based OutputIntents. Already in use, especially in
print-centric workflows where page merging and imposition across PDFs
is now far easier to do. Implementation update to select a page-based
OutputIntent ahead of the document-level OutputIntent should be
relatively easy.
* the use of transparency and blend mode of an annotation's appearance
stream when rendered onto a page (not _within_ the annot).
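For the ColorBurn / ColorDodge item above, here is a small sketch of the
per-component blend functions as I understand the corrected formulae
(cb is the backdrop component, cs the source component, both in 0..1).
The class name is hypothetical and this is not taken from any particular
implementation, so it should be checked against the linked repository
and ISO 32000-2.

```java
// Hypothetical helper showing the two blend functions on one colour component.
final class BlendSketch {

    // ColorDodge: brightens the backdrop to reflect the source.
    static double colorDodge(double cb, double cs) {
        if (cb == 0.0) {
            return 0.0;            // a black backdrop stays black
        }
        if (cb >= 1.0 - cs) {
            return 1.0;            // also covers cs == 1, so no division by zero
        }
        return cb / (1.0 - cs);
    }

    // ColorBurn: darkens the backdrop to reflect the source.
    static double colorBurn(double cb, double cs) {
        if (cb == 1.0) {
            return 1.0;            // a white backdrop stays white
        }
        if (1.0 - cb >= cs) {
            return 0.0;            // also covers cs == 0, so no division by zero
        }
        return 1.0 - (1.0 - cb) / cs;
    }
}
```

The explicit guards are the point of interest: they decide the corner
cases (cb = 0 for ColorDodge, cb = 1 for ColorBurn) before any division
happens.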
Obviously there are other PDF 2.0 features but these would be my
go-to short list for starting to address the most obvious visible
differences.
See also https://pdfa.org/how-to-get-started-with-pdf-2-0/ since
reporting a simple PDF version is unlikely to withstand the test of
time...
Of course I am also biased 😊 - and I'm not a Java expert!