[ https://issues.apache.org/jira/browse/TIKA-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison closed TIKA-2589.
-----------------------------
    Resolution: Not A Problem

Thank you for opening this issue. MSWord calculates page counts dynamically and, IMHO, rarely stores the actual page count in a document; rather, it typically stores "1", which is incorrect. If you append .zip to your file name, unzip it, and look in docProps/app.xml, you'll see:
{noformat}
<Pages>1</Pages><Words>127171</Words><Characters>724878</Characters>
{noformat}
It is beyond the scope of Tika to calculate page counts dynamically, so we rely on whatever MSWord stored in the document.

> Wrong page count detection (docx from dotm template)
> ----------------------------------------------------
>
>                 Key: TIKA-2589
>                 URL: https://issues.apache.org/jira/browse/TIKA-2589
>             Project: Tika
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 1.17
>         Environment: $ java -version
> java version "1.8.0_161"
> Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
> OS Version: 6.1.7601 Service Pack 1 Build 7601
>            Reporter: Leonid Korsakov
>            Priority: Major
>         Attachments: 262 страницы.docx
>
>
> I have a docx file created from a dotm template. When I call
> {code:java}
> java -jar tika-app.jar -m path_to_file
> {code}
> I see xmpTPg:NPages: 1, but the docx file contains 262 pages.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
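
The manual check described in the resolution (unzipping the .docx and reading docProps/app.xml) can also be done programmatically. What follows is a minimal sketch, not part of Tika: the class name StoredPageCount and the method readStoredPages are hypothetical names chosen for illustration. It uses only the JDK's zip and DOM classes and assumes the input is a regular OOXML package.

```java
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a Tika class): reads the page count that the
// authoring application stored in docProps/app.xml of a .docx file.
class StoredPageCount {

    // Returns the text of the first <Pages> element in docProps/app.xml,
    // or null if the zip entry or the element is missing.
    static String readStoredPages(Path docx) throws Exception {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry("docProps/app.xml");
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                // app.xml declares a default namespace, so match on local names.
                factory.setNamespaceAware(true);
                // Refuse DOCTYPE declarations when parsing untrusted documents.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                Document doc = factory.newDocumentBuilder().parse(in);
                NodeList pages = doc.getElementsByTagNameNS("*", "Pages");
                return pages.getLength() == 0
                        ? null
                        : pages.item(0).getTextContent().trim();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java StoredPageCount path_to_file.docx
        System.out.println("Stored <Pages>: " + readStoredPages(Paths.get(args[0])));
    }
}
```

For a file like the one attached to this issue, the value read this way is whatever MSWord last wrote to app.xml, which is the value the resolution says Tika relies on. It is not a count computed from the document body.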