Solr Cell includes Tika, and Solr Cell itself ships with Solr, at least in
the standard distribution. You can stream Office and PDF docs directly to
the extracting request handler, where Tika will process them. You can also
ask Solr Cell to "extract only" and return the extracted content without
indexing it.
See:
http://wiki.apache.org/solr/ExtractingRequestHandler
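As a rough illustration of the two modes above, here is a minimal Python
sketch. It assumes the handler is mapped at /update/extract (as in the example
solrconfig.xml) and that Solr is reachable at http://localhost:8983/solr; both
are assumptions about your setup, and the helper names are made up for this
example. The literal.id, commit and extractOnly parameters are described on
the wiki page above.

```python
import urllib.parse
import urllib.request


def build_extract_url(solr_base, doc_id=None, extract_only=False, commit=True):
    """Build a URL for the extracting request handler.

    Assumes the handler is registered at /update/extract; adjust the path
    if your solrconfig.xml maps it elsewhere.
    """
    params = {}
    if extract_only:
        # Ask Solr Cell to return the extracted content without indexing it.
        params["extractOnly"] = "true"
    else:
        if doc_id is None:
            raise ValueError("doc_id is required when indexing")
        # literal.<field> sets a field value on the indexed document.
        params["literal.id"] = doc_id
        if commit:
            params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return solr_base.rstrip("/") + "/update/extract?" + query


def post_file(url, path, content_type="application/octet-stream"):
    """Stream one local file to Solr as the raw POST body (hypothetical helper)."""
    with open(path, "rb") as fh:
        body = fh.read()
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical base URL; nothing is sent until post_file() is called.
    print(build_extract_url("http://localhost:8983/solr", doc_id="doc1"))
    print(build_extract_url("http://localhost:8983/solr", extract_only=True))
```

To actually send a file you would call something like
post_file(build_extract_url(base, doc_id="doc1"), "report.pdf", "application/pdf")
against a running Solr instance; the file name here is only a placeholder.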
I cannot say whether the Azure distribution is "full" Solr, including Solr
Cell.
Note: For future reference, "Solr" questions should be asked on the
"solr-user" mailing list.
-- Jack Krupansky
-----Original Message-----
From: Aloke Ghoshal
Sent: Monday, October 29, 2012 3:22 AM
To: java-user@lucene.apache.org ; gene...@lucene.apache.org
Subject: Running Solr Core/ Tika on Azure
Hi,
Looking for feedback on running Solr Core/ Tika parsing engine on Azure.
There's one offering for Solr within Azure from LucidWorks. This offering,
however, doesn't mention Tika.
We are looking at options to make content from files (doc, excel, pdfs,
etc.) stored within Azure storage searchable, and at whether the parser
could run against our Azure store directly to index the content. The other
option could be to write a separate connector that streams in the files. Let
me know if you have experience along these lines.
Regards,
Aloke