Hi, thanks for the help! I was finally able to try the 3.0.1-SNAPSHOT version an hour ago. The main issue, the garbage text in the PDF table's header line, is solved; that part now looks OK.
There are some other problems, all related to the flatten() method.

One is that images added to a PDF form are converted to a PDPushButton
component on the AcroForm, so they are subject to flattening. Version
2.0.29 correctly removes them from the form and adds the image to the
content stream. Version 3.0.0, and also 3.0.1-SNAPSHOT, produce a PDF
with the images missing.

I added a filter so that these fields are excluded from the flatten call:

    private static List<PDField> filterPDPushButtonFields(PDAcroForm form) {
        return form.getFields().stream()
                .filter(checkBeingPdPushButton())
                .collect(Collectors.toList());
    }

    form.flatten(filterPDPushButtonFields(form), false);

With this, the image looks OK. However, when I re-tested with
3.0.1-SNAPSHOT, the text came out distorted, with letters badly
positioned:

[image: distorted-text.png]

I am using the following to load the template PDF:

    InputStream resourceAsStream = ...
    PDDocument a1doc = Loader.loadPDF(
            new RandomAccessReadBuffer(resourceAsStream),
            () -> new ScratchFile(MemoryUsageSetting.setupTempFileOnly()));

The original code used temp-file-only buffering for loading the PDF, and
I tried to imitate that. When the errors popped up, I also tried using
RandomAccessReadBuffer, but it had no effect on the problems, so I
switched back to the temp file.

    PDAcroForm form = a1doc.getDocumentCatalog().getAcroForm(null);

I am intentionally passing null here to prevent the fixups from running,
so that the behavior changes as little as possible compared to 2.0.29.

On Sat, Sep 16, 2023 at 7:20 PM Tilman Hausherr <thaush...@t-online.de> wrote:

> Hi,
>
> Andreas fixed it, please try a snapshot
>
> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.1-SNAPSHOT/
>
> Tilman
>
> On 16.09.2023 05:05, Tilman Hausherr wrote:
> > This sounds similar to
> > https://issues.apache.org/jira/browse/PDFBOX-5666
> >
> > Tilman
> >
> > On 15.09.2023 23:40, Pados Attila wrote:
> >> Our application uses version 2.0.29 without any problem. I am trying
> >> to upgrade to 3.0.0.
> >>
> >> The application follows the pattern of loading a template PDF,
> >> which was edited to have some fields, with title and default value.
> >> The program later updates the fields' values and then calls flatten()
> >> on the AcroForm, so the fields are transformed into readable,
> >> positioned text.
> >>
> >> After the change to 3.0.0, calling the flatten() method turns the
> >> fields' title/caption values, which were created in the template,
> >> into garbage.
> >>
> >> One example is when a table/grid element's header line contains
> >> text that includes ( ) characters.
> >> After the parentheses, the text becomes garbage.
> >>
> >> Images added to the template as part of the form were left out of
> >> the output.
> >> I could work around this by filtering the AcroForm fields and
> >> skipping the PDPushButton type fields.
> >>
> >> I could check with the PDFDebugger that the main font used in the
> >> template is lost in the output.
> >> Any help would be appreciated. I am basically stuck with this task,
> >> and there is probably a PDFBox bug in the background.
> >>
> >> I am an expert in Java, but not in PDFBox; the original working code
> >> was written by someone else on the team.
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> > For additional commands, e-mail: users-h...@pdfbox.apache.org

-- 
Attila Pados
Java developer
+36204432457
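
For reference, the filtering workaround described above can be sketched
in a self-contained way. This is only an illustration of the idea (keep
every field except push buttons, then hand that list to flatten), not
the actual application code. Field, TextField and PushButton are
hypothetical stand-ins for PDFBox's PDField hierarchy so that the sketch
runs without PDFBox on the classpath, and the field names are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FlattenFilterSketch {

    // Hypothetical stand-ins for PDField and its subclasses (not PDFBox types).
    static class Field {
        final String name;
        Field(String name) { this.name = name; }
    }
    static class TextField extends Field {
        TextField(String name) { super(name); }
    }
    static class PushButton extends Field {
        PushButton(String name) { super(name); }
    }

    // True for every field that is NOT a push button,
    // i.e. the fields that should still be flattened.
    static Predicate<Field> isNotPushButton() {
        return field -> !(field instanceof PushButton);
    }

    // Returns the fields to pass on to flatten, leaving push buttons out.
    static List<Field> fieldsToFlatten(List<Field> allFields) {
        return allFields.stream()
                .filter(isNotPushButton())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Field> fields = Arrays.asList(
                new TextField("customerName"),
                new PushButton("logoImage"),
                new TextField("total"));
        // Print the names of the fields that would be flattened.
        for (Field f : fieldsToFlatten(fields)) {
            System.out.println(f.name);
        }
    }
}
```

With real PDFBox types, the predicate would presumably test
"field instanceof PDPushButton" and the resulting list would go to
form.flatten(list, false), as in the snippet earlier in this mail.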