Hi,
Please share that file (upload it to a file-sharing site that doesn't
require registration).
Tilman
On 16.02.2026 at 16:32, Yuxiao Zeng wrote:
Hi team, thank you for your great work on PDFBox!
I want to report an issue with PDF parsing/rendering.
In production, we have encountered a PDF file that PDFBox does not render
properly: the output appears cut off in the middle. Acrobat and pdf.js, on
the other hand, render it without any problem.
I troubleshot the issue. PDFBox reports a warning at a specific offset,
which is in the middle of a string operand of a TJ operator. Interestingly,
the string contains the byte sequence "\\)\n>" (hex: 5C 29 0A 3E) around
that offset. I found that PDFBox has special handling
<https://github.com/apache/pdfbox/blob/2.0.28/pdfbox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java#L480>
for this byte sequence, which seems to explain our issue perfectly.
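To check whether other files are affected, one could scan decoded content stream bytes for the same four-byte sequence. The following is only a minimal sketch: the class name `SequenceScanner` and the method `findAll` are hypothetical, it does not use any PDFBox API, and it assumes the bytes have already been decompressed (content streams are usually Flate-encoded, so scanning the raw file would miss most occurrences).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SequenceScanner {

    // The sequence from the report: backslash, ')', LF, '>' (5C 29 0A 3E).
    public static final byte[] PATTERN = {0x5C, 0x29, 0x0A, 0x3E};

    /** Returns every offset in data at which pattern starts. */
    public static List<Integer> findAll(byte[] data, byte[] pattern) {
        List<Integer> offsets = new ArrayList<>();
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer; // mismatch, try the next start offset
                }
            }
            offsets.add(i);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hypothetical sample: a TJ array whose string contains the sequence.
        byte[] sample = "[(ab\\)\n>cd)] TJ".getBytes(StandardCharsets.ISO_8859_1);
        for (int offset : findAll(sample, PATTERN)) {
            System.out.println("sequence found at offset " + offset);
        }
    }
}
```

A hit does not by itself mean the file will render incorrectly; it only marks places worth comparing against the offset in the parser warning.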
Looking at the comment
<https://github.com/apache/pdfbox/blob/2.0.28/pdfbox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java#L365>,
I understand that it is trying to work around a PDF producer bug.
However, it now causes a rendering error for properly generated PDF files.
Is there something that we can do to get our PDFs rendered correctly?
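For reference, the literal string rules in the PDF specification treat "\)" as an escaped right parenthesis, so a string containing 5C 29 0A 3E simply continues past the ">". The sketch below illustrates those rules only; it is not PDFBox code, and the class name `LiteralStringSketch` and the method `readLiteralString` are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class LiteralStringSketch {

    /** Decodes the literal string that starts with '(' at data[start]. */
    public static byte[] readLiteralString(byte[] data, int start) {
        if (data[start] != '(') {
            throw new IllegalArgumentException("expected '('");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int depth = 1; // unescaped parentheses must balance
        int i = start + 1;
        while (i < data.length) {
            int ch = data[i++] & 0xFF;
            if (ch == '\\') {
                if (i >= data.length) {
                    break; // dangling backslash at end of data
                }
                int next = data[i++] & 0xFF;
                switch (next) {
                    case 'n': out.write('\n'); break;
                    case 'r': out.write('\r'); break;
                    case 't': out.write('\t'); break;
                    case 'b': out.write('\b'); break;
                    case 'f': out.write('\f'); break;
                    case '(': case ')': case '\\':
                        out.write(next); // escaped delimiter, depth unchanged
                        break;
                    case '\r': // backslash + end-of-line is a line continuation
                        if (i < data.length && data[i] == '\n') i++;
                        break;
                    case '\n':
                        break;
                    default:
                        if (next >= '0' && next <= '7') { // octal, 1 to 3 digits
                            int value = next - '0';
                            for (int k = 0; k < 2 && i < data.length
                                    && data[i] >= '0' && data[i] <= '7'; k++) {
                                value = value * 8 + (data[i++] - '0');
                            }
                            out.write(value & 0xFF);
                        } else {
                            out.write(next); // unknown escape: drop the backslash
                        }
                }
            } else if (ch == '(') {
                depth++;
                out.write(ch);
            } else if (ch == ')') {
                if (--depth == 0) {
                    return out.toByteArray();
                }
                out.write(ch);
            } else if (ch == '\r') { // unescaped CR or CRLF reads as a single LF
                if (i < data.length && data[i] == '\n') i++;
                out.write('\n');
            } else {
                out.write(ch);
            }
        }
        throw new IllegalArgumentException("unterminated literal string");
    }

    public static void main(String[] args) {
        // Hypothetical sample containing 5C 29 0A 3E inside the string.
        byte[] stream = "(ab\\)\n>cd) Tj".getBytes(StandardCharsets.ISO_8859_1);
        String decoded = new String(readLiteralString(stream, 0), StandardCharsets.ISO_8859_1);
        System.out.println(decoded.replace("\n", "\\n")); // prints ab)\n>cd
    }
}
```

Under these rules the sample string ends at the final ")" before "Tj", not at the escaped one, which matches how the file is described as rendering in Acrobat and pdf.js.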
--
Yuxiao Zeng
*Staff Engineering Manager*
*Healthcare Information Technologist*
Flatiron Health K.K.
https://flatiron.co.jp