Tim Allison created TIKA-4376:
---------------------------------

             Summary: tika-eval should tokenize on non-breaking/narrow/other space variants
                 Key: TIKA-4376
                 URL: https://issues.apache.org/jira/browse/TIKA-4376
             Project: Tika
          Issue Type: Task
          Components: tika-eval
            Reporter: Tim Allison


See TIKA-4375. Many thanks to [~tilman] for identifying this issue and 
supplying this link: 
[https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128]
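
To illustrate the kind of tokenization the summary asks for, here is a minimal sketch in plain Java. It is not tika-eval's actual tokenizer: the class name SpaceTokenizer and the method tokenize are hypothetical, and the choice of the Unicode "space separator" category (\p{Zs}) is an assumption about which variants should count. That category covers U+00A0 (no-break space), U+2000 through U+200A, U+202F (narrow no-break space), and U+3000 (ideographic space), among others; zero-width characters such as U+200B are not in it and would need separate handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of tika-eval.
public class SpaceTokenizer {

    // \p{Zs} matches any Unicode space separator; \s adds the ASCII
    // whitespace characters (tab, newline, etc.) that Java's default
    // \s covers. Runs of either are treated as one delimiter.
    private static final Pattern SPACES = Pattern.compile("[\\p{Zs}\\s]+");

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : SPACES.split(text)) {
            // split() can yield an empty leading element when the
            // input starts with a delimiter, so skip empties.
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The words are separated by U+00A0, U+202F, and U+2003.
        System.out.println(tokenize("tika\u00A0eval\u202Fsplits\u2003here"));
    }
}
```

Splitting on ASCII whitespace alone would return the sample string in main as a single token, which is the behaviour this issue is about.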



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
