Hi, I am an Apache NiFi developer, and a user has reported an issue with our use of Tika in the ExtractDocumentText processor. The user is noticing that a particular symbol is not parsed correctly and is instead translated into either a ? (question mark) or a " (double quote). Please see NIFI-10218 <https://issues.apache.org/jira/browse/NIFI-10218> for more details.
Please advise whether there is anything we can do on our side to extract this text properly, or whether this is a known limitation of parsing PDF documents. Thank you!
