adjenks created TIKA-4024:
-----------------------------
Summary: Setting throwOnZeroBytes to false in autoDetectParserConfig fails to start server
Key: TIKA-4024
URL: https://issues.apache.org/jira/browse/TIKA-4024
Project: Tika
Issue Type: Bug
Components: tika-server
Affects Versions: 2.7.0
Environment: docker
Reporter: adjenks
The wiki page below says that this setting can be configured in versions
> 2.7.0, but it doesn't appear to be working:
[https://cwiki.apache.org/confluence/display/TIKA/ModifyingContentWithHandlersAndMetadataFilters#ModifyingContentWithHandlersAndMetadataFilters-4.AutoDetectParserConfig]
Does "> 2.7.0" mean that 2.7.0 itself does not support this setting? I'm using
the Docker image tagged 2.7.0.1-full, and there appears to be nothing newer.
I'm setting it like so in the XML:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <autoDetectParserConfig>
    <params>
      <throwOnZeroBytes>false</throwOnZeroBytes>
    </params>
  </autoDetectParserConfig>
</properties>{code}
but the server fails to start and logs:
{quote}| INFO [main] 21:40:13,139 org.apache.tika.server.core.TikaServerProcess Using custom config: /tika-config.xml
| ERROR [main] 21:40:13,205 org.apache.tika.server.core.TikaServerProcess Can't start:
| org.apache.tika.exception.TikaConfigException: Couldn't find setter 'setThrowOnZeroBytes' or adder 'addThrowOnZeroBytes' for throwOnZeroBytes of class: class org.apache.tika.parser.AutoDetectParserConfig
{quote}
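My reading of the error is that each element under params is mapped to a setter by name, so the class in this image may simply not have a setThrowOnZeroBytes method. The sketch below only illustrates that kind of name-based lookup with plain Java reflection. It is not Tika's actual code: DemoConfig, exampleLimit, setterName and findSetter are hypothetical names I made up for the illustration.

```java
import java.lang.reflect.Method;
import java.util.Optional;

public class SetterLookupSketch {

    // Hypothetical stand-in for a config class; NOT Tika's AutoDetectParserConfig.
    // It deliberately has no setThrowOnZeroBytes method.
    public static class DemoConfig {
        private long exampleLimit;

        public void setExampleLimit(long exampleLimit) {
            this.exampleLimit = exampleLimit;
        }
    }

    // Derive the conventional setter name from a parameter name,
    // e.g. "throwOnZeroBytes" -> "setThrowOnZeroBytes".
    public static String setterName(String param) {
        return "set" + Character.toUpperCase(param.charAt(0)) + param.substring(1);
    }

    // Look for a public one-argument method with the derived setter name.
    public static Optional<Method> findSetter(Class<?> clazz, String param) {
        String name = setterName(param);
        for (Method m : clazz.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 1) {
                return Optional.of(m);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String param : new String[] {"exampleLimit", "throwOnZeroBytes"}) {
            String result = findSetter(DemoConfig.class, param).isPresent() ? "found" : "missing";
            System.out.println(setterName(param) + ": " + result);
        }
    }
}
```

If the real lookup works along these lines, the exception would mean the parameter is unknown to the AutoDetectParserConfig class shipped in this image, rather than that the XML is malformed.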
Also, I am using the "2.7.0.1" tag from Docker Hub, but this version doesn't
appear to exist anywhere else and doesn't follow the SemVer format, which I
find confusing.