Gregory Lepore created TIKA-4074:
------------------------------------
Summary: Add magic for TeX Virtual Font format
Key: TIKA-4074
URL: https://issues.apache.org/jira/browse/TIKA-4074
Project: Tika
Issue Type: Sub-task
Reporter: Gregory Lepore
Attachments: aebx10.vf, aebx12.vf, aebxsl10.vf
The TeX Virtual Font (VF) format occurs 6,047 times in the second most recent
Common Crawl dataset. It has no known MIME type. The magic is:
F7CA{9}F300{4}0010 at offset 0.
The above signature catches most TeX VF files, though some will be missed.
There were no false positives, however, so I think it is a good compromise
that catches the majority of sample files.
It would be nice to see the results of additional testing.
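
For anyone who wants to try the signature against their own samples, here is a
minimal Python sketch. It assumes that {n} in the signature means "skip exactly
n arbitrary bytes", so the pattern reads as F7 CA, 9 bytes of anything, F3 00,
4 bytes of anything, then 00 10, all anchored at offset 0. The names VF_MAGIC,
looks_like_tex_vf and file_looks_like_tex_vf are made up for illustration and
are not part of Tika. Because the gaps are fixed-width, a file whose header
fields have different lengths would not match, which may explain some of the
misses.

```python
import re

# Signature from the issue, F7CA{9}F300{4}0010 at offset 0, read as
# (assumption): 2 literal bytes, 9 arbitrary bytes, 2 literal bytes,
# 4 arbitrary bytes, 2 literal bytes. re.DOTALL lets "." match any byte,
# including 0x0A.
VF_MAGIC = re.compile(rb"\xF7\xCA.{9}\xF3\x00.{4}\x00\x10", re.DOTALL)

# Total number of bytes the signature spans: 2 + 9 + 2 + 4 + 2.
VF_MAGIC_LENGTH = 19


def looks_like_tex_vf(data: bytes) -> bool:
    """Return True if data starts with the proposed VF signature."""
    # re.match only matches at the start of the buffer, i.e. offset 0.
    return VF_MAGIC.match(data) is not None


def file_looks_like_tex_vf(path: str) -> bool:
    """Check only the leading bytes of a file on disk."""
    with open(path, "rb") as handle:
        return looks_like_tex_vf(handle.read(VF_MAGIC_LENGTH))
```

Running file_looks_like_tex_vf over a directory of known .vf files and a
directory of unrelated files would give a rough count of misses and false
positives for the proposed magic.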
--
This message was sent by Atlassian Jira
(v8.20.10#820010)