Teresa,

Thank you for sharing this with us. Would you mind opening a ticket on our Bugzilla [1] and attaching the triggering document there? Attachments don't come through on our mailing list. Thank you again.

Best,
Tim

[1] https://bz.apache.org/bugzilla/enter_bug.cgi?product=POI

-----Original Message-----
From: Teresa Kim [mailto:teresa....@linguamatics.com]
Sent: Thursday, October 5, 2017 4:11 AM
To: POI Users List <user@poi.apache.org>
Subject: characterRun.getSymbolChar() returns the same char for different symbols

Dear POI users,

I have a .doc document that contains an uncommon Greek mu and a registered symbol, and I tried to use the characterRun.getSymbolChar() method to identify these two symbols. However, I noticed that characterRun.getSymbolChar() always returns the same character, so I could not find a way to tell the two symbols apart.

I looked at CharacterRun.java and tried printing _props.getXchSym(), and found that it in fact prints two different values for the Greek mu and the registered symbol, i.e. '-3987' and '3870'.

I really don't know whether it is right to use _props.getXchSym() directly instead of the characterRun.getSymbolChar() method, which returns (char)_props.getXchSym(). To make it work, I added a method next to characterRun.getSymbolChar() in the CharacterRun class that returns _props.getXchSym(). Would you please take a look and see whether this could be added to the CharacterRun class?

I enclose a Word document that contains those two symbols, and a snippet of the new method I added to the CharacterRun class, for your reference.

    public int getSymbolCharacterAsitis() {
        if (isSymbol()) {
            return _props.getXchSym();
        } else
            throw new IllegalStateException("Not a symbol CharacterRun");
    }

Many thanks in advance,
Teresa
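
For reference, here is a minimal sketch of what a (char) cast does to the two values quoted above. It is plain Java and does not call the POI API; the class name SymbolCastDemo and its methods are hypothetical and exist only for illustration.

```java
public class SymbolCastDemo {

    // Hypothetical helper: narrowing an int to char keeps only the low 16 bits,
    // which is what a (char) cast on the raw symbol value would do.
    public static char asChar(int xchSym) {
        return (char) xchSym;
    }

    // Hypothetical helper: formats the cast result as a U+XXXX code unit.
    public static String codePoint(int xchSym) {
        return String.format("U+%04X", (int) asChar(xchSym));
    }

    public static void main(String[] args) {
        // The two raw values reported in the message above.
        int[] values = { -3987, 3870 };
        for (int v : values) {
            System.out.println(v + " -> " + codePoint(v));
        }
        // prints:
        // -3987 -> U+F06D
        // 3870 -> U+0F1E
    }
}
```

For these two inputs the cast gives different 16-bit values, so the sketch only shows the arithmetic of the cast; it does not reproduce the behaviour seen with the actual document.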