GitHub user akhikhl opened a pull request:

    https://github.com/apache/poi/pull/3

    fix for information loss on footnotes/endnotes within XWPFRun.toString

    Dear Apache POI Team,
    
    Please consider the following problem: whenever an MS Word document
    with footnotes/endnotes is parsed with XWPFWordExtractor, the information
    about where the footnote/endnote references occur is lost. This loss is
    clearly visible in, for example, Apache Tika output.
    
    To reproduce the problem, insert the following code into
    TestXWPFWordExtractor.testFootnotes:
    
            java.io.FileWriter w = new java.io.FileWriter(
                new java.io.File(System.getProperty("user.home"), "footnotes.output.txt"));
            try {
              w.write(extractor.getText());
            } finally {
              w.close();
            }
    
    Then inspect the contents of "footnotes.output.txt". It contains
    "Eto ochen prostoy text so snoskoy", but there should be a footnote
    reference between "prostoy" and "text", and that reference is lost.
    
    SOLUTION:
    I suggest introducing additional markup such as [footnoteRef:num] and
    [endnoteRef:num], which will allow applications to render footnote and
    endnote references correctly.
    
    Please see the commit details.
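
For illustration only (this is not part of the commit): a minimal sketch of how a consuming application might locate or strip the proposed markers in extracted text. The class NoteRefScanner and its method names are hypothetical, and the marker syntax is assumed from the description above, with "num" taken to be a decimal number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache POI.
final class NoteRefScanner {

    // Assumed marker shape: [footnoteRef:N] or [endnoteRef:N], N decimal.
    private static final Pattern NOTE_REF =
        Pattern.compile("\\[(footnote|endnote)Ref:(\\d+)\\]");

    private NoteRefScanner() {
    }

    // Returns one entry per marker in the form "kind:number@offset",
    // where offset is the marker's start index in the given text.
    static List<String> findRefs(String text) {
        List<String> refs = new ArrayList<String>();
        Matcher m = NOTE_REF.matcher(text);
        while (m.find()) {
            refs.add(m.group(1) + ":" + m.group(2) + "@" + m.start());
        }
        return refs;
    }

    // Returns the text with all markers removed, for callers that
    // want plain text without reference positions.
    static String stripRefs(String text) {
        return NOTE_REF.matcher(text).replaceAll("");
    }
}
```

An application that renders references (as superscripts, for example) could use the offsets from findRefs, while one that wants only plain text could call stripRefs.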

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akhikhl/poi enhanced-footnote-support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/poi/pull/3.patch
