Tim Allison created TIKA-4439:
---------------------------------

             Summary: Improve text extraction from EMF, round 2
                 Key: TIKA-4439
                 URL: https://issues.apache.org/jira/browse/TIKA-4439
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison

In our recent regression testing for the 3.2.1 release, we found that changes made on TIKA-4432 increased the number of common tokens extracted from emf files by about 3%. We also noticed some regressions.

I'm opening this issue to track improvements to EMF parsing (hopefully after the 3.2.1 release :D).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)