[ https://issues.apache.org/jira/browse/TIKA-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928414#comment-17928414 ]
Tim Allison commented on TIKA-4386:
-----------------------------------

Thank you for opening this and sharing a triggering file. I don't know if I'll have much time to work on this. Perhaps you or another dev might?

In the first case, it looks like the numbering that isn't being picked up relies on a paragraph style in {{document.xml}} (e.g. {{ListNumber}}) and a reference to that paragraph style in {{numbering.xml}}. In the second, Tika is relying on the {{numId}} in the paragraph, but not iterating it because of a failure in {{incrementLevel}}?

{noformat}
<w:abstractNum w15:restartNumberingAfterBreak="0" w:abstractNumId="1">
  <w:nsid w:val="FFFFFF88"/>
  <w:multiLevelType w:val="singleLevel"/>
  <w:tmpl w:val="94064EA8"/>
  <w:lvl w:ilvl="0">
    <w:start w:val="1"/>
    <w:numFmt w:val="decimal"/>
    <w:pStyle w:val="ListNumber"/>
    <w:lvlText w:val="%1."/>
    <w:lvlJc w:val="left"/>
    <w:pPr>
      <w:tabs>
        <w:tab w:pos="360" w:val="num"/>
      </w:tabs>
      <w:ind w:hanging="360" w:left="360"/>
    </w:pPr>
  </w:lvl>
</w:abstractNum>
{noformat}

> Issues with numbered lists in Word .docx files
> ----------------------------------------------
>
>                 Key: TIKA-4386
>                 URL: https://issues.apache.org/jira/browse/TIKA-4386
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 3.1.0
>            Reporter: Stephen H
>            Priority: Minor
>         Attachments: list-numbering-examples.docx
>
>
> There seem to be some inconsistencies with processing numbered lists. On a
> brand new Word document:
> - If I select the List Number style and enter items then Tika's output
> includes none of the numbers.
> - If I select the numbered list button in the ribbon and enter items then in
> Tika's output every item in the list has a number of '1'.
> - If I don't select anything first and just rely on Word automatically
> creating items (which it does in a List Paragraph style rather than List
> Number) then Tika's output is correct.
> An example document is attached.
> Not sure if this is related to TIKA-2781 but this isn't in headings.
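
To make the first case in the comment concrete, here is a minimal sketch of how a paragraph style such as {{ListNumber}} could be resolved to a numbering level by scanning {{numbering.xml}} for {{w:lvl}} elements that carry a {{w:pStyle}}. This is not how Tika or POI implement it; the class {{PStyleNumberingSketch}}, the method {{styleToLevel}}, the "abstractNumId/ilvl" string encoding, and the trimmed-down sample XML are all assumptions made for illustration, and only JDK XML APIs are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Tika or POI.
final class PStyleNumberingSketch {
    // WordprocessingML main namespace (the "w:" prefix in the issue's XML).
    static final String W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Trimmed-down version of the abstractNum quoted in the comment,
    // wrapped in a w:numbering root so it parses as a standalone document.
    static final String SAMPLE =
            "<w:numbering xmlns:w=\"" + W + "\">"
            + "<w:abstractNum w:abstractNumId=\"1\">"
            + "<w:lvl w:ilvl=\"0\">"
            + "<w:start w:val=\"1\"/>"
            + "<w:numFmt w:val=\"decimal\"/>"
            + "<w:pStyle w:val=\"ListNumber\"/>"
            + "<w:lvlText w:val=\"%1.\"/>"
            + "</w:lvl>"
            + "</w:abstractNum>"
            + "</w:numbering>";

    /** Maps each paragraph style id to "abstractNumId/ilvl" (assumed encoding). */
    static Map<String, String> styleToLevel(String numberingXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(numberingXml.getBytes(StandardCharsets.UTF_8)));

            Map<String, String> result = new HashMap<>();
            NodeList styles = doc.getElementsByTagNameNS(W, "pStyle");
            for (int i = 0; i < styles.getLength(); i++) {
                Element pStyle = (Element) styles.item(i);
                Node lvl = pStyle.getParentNode();
                if (!isW(lvl, "lvl")) {
                    continue; // only a pStyle directly under w:lvl links a style to a level
                }
                Node abstractNum = lvl.getParentNode();
                if (!isW(abstractNum, "abstractNum")) {
                    continue; // skip levels nested elsewhere, e.g. under w:lvlOverride
                }
                String level = ((Element) abstractNum).getAttributeNS(W, "abstractNumId")
                        + "/" + ((Element) lvl).getAttributeNS(W, "ilvl");
                result.put(pStyle.getAttributeNS(W, "val"), level);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse numbering XML", e);
        }
    }

    private static boolean isW(Node node, String localName) {
        return node instanceof Element
                && W.equals(node.getNamespaceURI())
                && localName.equals(node.getLocalName());
    }

    public static void main(String[] args) {
        System.out.println(styleToLevel(SAMPLE)); // prints {ListNumber=1/0}
    }
}
```

With a map like this, a paragraph in {{document.xml}} that has a {{w:pStyle}} of {{ListNumber}} but no explicit {{numId}} could still be tied to a numbering definition. Whether that lookup belongs in Tika or in POI is a separate question that this sketch does not address.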