[ https://issues.apache.org/jira/browse/PDFBOX-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17954020#comment-17954020 ]
Tilman Hausherr commented on PDFBOX-6009:
-----------------------------------------

While testing with a document sent to me (9783779975731) I noticed another problem, orphan annotations in the structure tree but only when splitting the whole document, not when splitting the complete page. The cause of this is similar to the problem with the pageDictMap, and so is the solution.

> Splitter does not include structure tree in documents past the first split
> --------------------------------------------------------------------------
>
>                 Key: PDFBOX-6009
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6009
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.0.34, 3.0.5 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: StructureTree
>             Fix For: 2.0.35, 3.0.6 PDFBox, 4.0.0
>
>         Attachments: pdfbox-split-missing-tags_mail 15.5.2025-p1.pdf,
> pdfbox-split-missing-tags_mail 15.5.2025-p2.pdf,
> pdfbox-split-missing-tags_mail 15.5.2025-p3.pdf,
> pdfbox-split-missing-tags_mail 15.5.2025.pdf
>
>
> As submitted by Alastair Porter in the users mailing list
> java -jar pdfbox/app/target/pdfbox-app-4.0.0-SNAPSHOT.jar split -i input.pdf
> -outputPrefix output-split
> Only first page has the appropriate structure tree (/K is missing)
> === from the post in the mailing list ===
> In the first file, I correctly see the /K element. What's more, this element
> has correctly been pruned and doesn't include any items from the input
> document which point to pages that are not in this split.
> In subsequent split files, I see no /K element in the StructTreeRoot at all.
> I attached a PDF which I've been using for simple testing, which exhibits
> this behaviour.
> I had a bit of a look through the existing code, and I see that in
> Splitter.java, in cloneStructureTree
> {code:java}
> COSBase k1 = srcStructureTreeRoot.getK();
> COSBase k2 = new KCloner(dstPageTree).createClone(k1,
>     dstStructureTreeRoot.getCOSObject(), null);
> dstStructureTreeRoot.setK(k2);
> {code}
> k2 is always null after the first split, it seems like it may not be created
> correctly.
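
The expected behaviour described in the report (each split keeps a /K tree pruned to the items that point to pages in that split) can be illustrated with a small toy model. This is only a sketch in plain Java: it does not use PDFBox, it is not how KCloner is implemented, and the names Elem, prune and StructTreePruneSketch, as well as the use of integer page indices, are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only: these types are hypothetical and are NOT PDFBox classes.
public class StructTreePruneSketch
{
    /** Simplified structure element: a leaf tied to one page, or a container with kids. */
    static final class Elem
    {
        final String type;  // e.g. "Document", "Sect", "P"
        final int page;     // page index for leaves, -1 for pure containers
        final List<Elem> kids = new ArrayList<>();

        Elem(String type, int page)
        {
            this.type = type;
            this.page = page;
        }

        Elem add(Elem kid)
        {
            kids.add(kid);
            return this;
        }
    }

    /**
     * Returns a pruned copy of src that keeps only leaves on the given pages,
     * or null if nothing under src belongs to those pages.
     */
    static Elem prune(Elem src, Set<Integer> keptPages)
    {
        if (src.kids.isEmpty())
        {
            // Leaf: keep it only if its page is part of this split.
            return keptPages.contains(src.page) ? new Elem(src.type, src.page) : null;
        }
        Elem copy = new Elem(src.type, src.page);
        for (Elem kid : src.kids)
        {
            Elem prunedKid = prune(kid, keptPages);
            if (prunedKid != null)
            {
                copy.kids.add(prunedKid);
            }
        }
        // A container with no surviving kids is dropped as well.
        return copy.kids.isEmpty() ? null : copy;
    }

    public static void main(String[] args)
    {
        // Three-page toy document: one paragraph on page 0, a section spanning pages 1 and 2.
        Elem root = new Elem("Document", -1)
                .add(new Elem("P", 0))
                .add(new Elem("Sect", -1).add(new Elem("P", 1)).add(new Elem("P", 2)));
        for (int page = 0; page < 3; page++)
        {
            Elem k = prune(root, Set.of(page));
            System.out.println("split " + page + ": K " + (k == null ? "missing" : "present"));
        }
    }
}
```

In this model every single-page split ends up with a non-null pruned tree, which corresponds to what the report expects for the second and later output files. The pruned copy is built from the source tree and the kept pages alone, so nothing carries over from one split to the next.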