Hi Hayden,

On Jul 21, 2012, at 6:24 AM, Mr Havercamp wrote:

> Hi Chris
> 
> Thanks for your links, etc. I have successfully built and run Tika JAXRS and 
> will look to incorporate it into my component so that users can configure and 
> use it for Tika extraction (currently I have local Tika and SolrCell (Solr 
> server)). I think it is important to provide users with different options 
> depending on their requirements (e.g. performance, simplicity, 
> cost-effectiveness, etc).

Awesome, +1!

> 
> Using Tika JAXRS I can very easily extract metadata which is great. I am also 
> able to extract content as plain text but I cannot see a setting for 
> returning content in xml/html. Is there a setting for this? Perhaps I'm 
> missing something.

You are most likely correct -- the JAXRS module is an evolving spec and, as
Jason put it, we would like to make it, the CLI, and the server interface a
bit more consistent and standardized. If there is something you would like
that it doesn't yet do (e.g., xml/html output), please file a feature request
at https://issues.apache.org/jira/browse/TIKA so that we can keep it in mind
going forward when folks are working on this. Also, contributions are welcome,
so if you think you could take a crack at adding it, awesome. If not, I'm sure
one of the devs working on Tika JAXRS will get around to it.

Thanks!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
