I have recently moved my PHP command-line script to a new server, installed Java
1.5.0, and am trying to run Tika.

Each time I run it, I get the same error:

-bash-4.1$ java -jar tika-app-1.2.jar
Exception in thread "main" java.lang.RuntimeException: Unable to parse the default media type registry
   at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:482)
   at org.apache.tika.detect.DefaultDetector.<init>(DefaultDetector.java:98)
   at org.apache.tika.cli.TikaCLI.<init>(TikaCLI.java:303)
   at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:105)
Caused by: org.apache.tika.mime.MimeTypeException: Invalid type configuration
   at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:119)
   at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:64)
   at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:93)
   at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:149)
   at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:479)
   ...3 more
Caused by: org.apache.tika.mime.MimeTypeException: Invalid media type name: application/dita+xml;format=map
   at org.apache.tika.mime.MimeTypesReader.startElement(MimeTypesReader.java:148)
   at gnu.xml.stream.SAXParser.parse(libgcj.so.10)
   at javax.xml.parsers.SAXParser.parse(libgcj.so.10)
   at javax.xml.parsers.SAXParser.parse(libgcj.so.10)
   at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:115)
   ...7 more

My PHP script is running command-lines like this:

java -jar tika-app-1.2.jar -eUTF-8 --text "/var/www/vhosts/example/sample-docs/welsh_corpus.txt" >/tmp/phpVAc0aW 2>/tmp/phpcKRcIo
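
As a rough sketch of that invocation pattern in plain shell (the real script is PHP; the function name `run_captured` and the temp file templates below are hypothetical and not taken from it), stdout and stderr each go to their own temp file and the exit status is kept:

```shell
#!/bin/sh
# Hypothetical helper, not the actual PHP code: run any command with
# stdout and stderr redirected to two separate temp files.
run_captured() {
    out_file=$(mktemp /tmp/outXXXXXX) || return 1   # receives stdout
    err_file=$(mktemp /tmp/errXXXXXX) || return 1   # receives stderr
    "$@" >"$out_file" 2>"$err_file"
    status=$?                                       # exit status of the command
    return "$status"
}

# Example call (commented out so the sketch has no side effects):
# run_captured java -jar tika-app-1.2.jar -eUTF-8 --text \
#     "/var/www/vhosts/example/sample-docs/welsh_corpus.txt"
# [ "$status" -ne 0 ] && cat "$err_file"
```

With this layout the stack trace above ends up in the stderr file, while the stdout file holds whatever text was extracted, if any.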

They all fail in the same way. I have no idea what "Unable to parse the default
media type registry" could mean, how to fix it, or what is different on this new
server (I installed Java from the same source, and moved from CloudLinux, where
it was previously working, to CentOS 6).

Has anyone else had this error? I expect it has an easy fix, but the error
means nothing to me and my Google-fu cannot find any explanation, apart from
the line of source in Tika that generates the error.

-- Jason
