Stephen H created TIKA-4386:
-------------------------------

             Summary: Issues with numbered lists in Word .docx files
                 Key: TIKA-4386
                 URL: https://issues.apache.org/jira/browse/TIKA-4386
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 3.1.0
            Reporter: Stephen H
         Attachments: list-numbering-examples.docx

There seem to be some inconsistencies with processing numbered lists. On a brand new Word document:

- If I select the List Number style and enter items then Tika's output includes none of the numbers.
- If I select the numbered list button in the ribbon and enter items then in Tika's output every item in the list has a number of '1'.
- If I don't select anything first and just rely on Word automatically creating items (which it does in a List Paragraph style rather than List Number) then Tika's output is correct.

An example document is attached. Not sure if this is related to TIKA-2781 but this isn't in headings.
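
To help narrow down which of the three cases is which inside the attached file, here is a minimal sketch that lists, for each paragraph in word/document.xml, its paragraph style and any numbering reference set directly on the paragraph. It uses only the JDK and does not call any Tika API. The class name DocxListProbe and its methods are hypothetical. It also rests on an assumption that is not confirmed in this report: that the List Number style gets its numbering from the style definition in styles.xml, so those paragraphs would carry no w:numId of their own.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper, not part of Tika.
final class DocxListProbe {
    // WordprocessingML main namespace used by document.xml.
    private static final String W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns one row per w:p: "style=<id or -> numId=<id or -> text=<text>". */
    public static List<String> describeParagraphs(InputStream documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(documentXml);

        List<String> rows = new ArrayList<>();
        NodeList paragraphs = doc.getElementsByTagNameNS(W, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            StringBuilder text = new StringBuilder();
            NodeList textNodes = p.getElementsByTagNameNS(W, "t");
            for (int j = 0; j < textNodes.getLength(); j++) {
                text.append(textNodes.item(j).getTextContent());
            }
            rows.add("style=" + firstVal(p, "pStyle")
                    + " numId=" + firstVal(p, "numId")
                    + " text=" + text);
        }
        return rows;
    }

    // Reads w:val from the first descendant with the given local name, or "-" if absent.
    // Simplification: this searches all descendants, so nested content is not distinguished.
    private static String firstVal(Element paragraph, String localName) {
        NodeList nodes = paragraph.getElementsByTagNameNS(W, localName);
        if (nodes.getLength() == 0) {
            return "-";
        }
        return ((Element) nodes.item(0)).getAttributeNS(W, "val");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxListProbe <file.docx>");
            return;
        }
        // A .docx file is a ZIP archive; the body text lives in word/document.xml.
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                System.err.println("word/document.xml not found");
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                describeParagraphs(in).forEach(System.out::println);
            }
        }
    }
}
```

Running it against list-numbering-examples.docx should show whether the paragraphs that lose their numbers are the ones with a style but no direct numId, which would point at style-inherited numbering as the place to look.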