[ 
https://issues.apache.org/jira/browse/SOLR-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276888#comment-16276888
 ] 

Tim Allison commented on SOLR-7632:
-----------------------------------

bq. To carry out Erik Hatcher's recommendation...I don't know if we'd need CORS 
for this or not, but it might be neat to modify Tika's server to allow users to 
inject their own resources=endpoints via a config file and an extra jar. Within 
the Solr project, we'd just have to implement a resource that takes an input 
stream, runs Tika and then adds a SolrInputDocument.

[~gostep] has proposed allowing users to configure a custom ContentHandler in 
tika-server.  This could enable Solr to create its own content handler that 
tika-server could use to send the extracted text to Solr on endDocument().
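
A minimal sketch of what such a handler might look like, using only the JDK's SAX classes. The class name TextForwardingHandler and the Consumer<String> sink are hypothetical; the sink stands in for whatever code would build a SolrInputDocument and send it to Solr, and how tika-server would load a custom handler is not shown because that configuration mechanism is only a proposal at this point.

```java
import java.util.function.Consumer;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical handler: buffers SAX character events and forwards the
 * accumulated text to a sink once the document ends.
 */
class TextForwardingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    // Stand-in for "add a document to Solr"; an assumption for illustration.
    private final Consumer<String> sink;

    TextForwardingHandler(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void startDocument() {
        // Clear any text left over from a previous document.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Collect extracted text as the parser emits it.
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // Hand the full text to the sink exactly once per document.
        sink.accept(text.toString());
    }
}
```

In a real deployment the sink would presumably wrap a Solr client call; it is left abstract here so the sketch does not depend on Solr or Tika jars.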

> Change the ExtractingRequestHandler to use Tika-Server
> ------------------------------------------------------
>
>                 Key: SOLR-7632
>                 URL: https://issues.apache.org/jira/browse/SOLR-7632
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - Solr Cell (Tika extraction)
>            Reporter: Chris A. Mattmann
>              Labels: gsoc2017, memex
>
> It's a pain to upgrade Tika's jars every time we release, and if Tika 
> fails it messes up the ExtractingRequestHandler (e.g., the document type 
> caused Tika to fail, etc). A more reliable, better separated, and easier 
> to deploy version of the ExtractingRequestHandler would make a network call 
> to the Tika JAXRS server, and then call Tika on the Solr server side, get the 
> results and then index the information that way. I have a patch in the works 
> from the DARPA Memex project and I hope to post it soon.



