Tim Allison created TIKA-4439:
---------------------------------

             Summary: Improve text extraction from EMF, round 2
                 Key: TIKA-4439
                 URL: https://issues.apache.org/jira/browse/TIKA-4439
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


In our recent regression testing for the 3.2.1 release, we found that changes 
made on TIKA-4432 increased the number of common tokens extracted from EMF 
files by about 3%.

We also noticed some regressions. I'm opening this issue to track improvements 
to EMF parsing (hopefully after the 3.2.1 release :D).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
