Thanks for the advice. I just don't see where in the Lucene code I should plug in OOParcer.

I've walked the code in LIUS and Nutch (moving on to Solr) trying to find common objects. If I can find common objects in Lucene and Nutch I'll know where to plug in.


The Lucene objects look like this:

IndexWriter
    Analyzer
        StandardAnalyzer
    Document
        Reader
            FileReader
            StringReader
    DocumentWriter


But when I search through the Nutch or LIUS code I cannot find these objects. LIUS uses reflection, so I'm not going to find anything in the code, but unfortunately the liusConfig.xml is incomplete and I cannot find the class names for the OpenOffice stuff in it.

This is all very frustrating, since it should be relatively easy to add support for unsupported formats. The Lucene code is very nice, the LIUS code less so. It seems Lucene is set up to let you drop in new file formats; I just do not know where to drop them in or what kind of objects need to be dropped in.

Oh well, I guess I will code up a Reader that just spits out "Here I am" a few hundred times and see what happens. LOL.
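
A rough sketch of what such a throwaway Reader could look like, using only java.io. The class name RepeatingReader is made up for illustration. The assumption here is that in Lucene 2.x a java.io.Reader can be handed to a field with new Field("contents", reader), which tokenizes the text without storing it, so something like this would be enough to see whether the plug-in point works.

```java
import java.io.Reader;

/** Hypothetical test Reader: emits a fixed phrase a given number of times. */
final class RepeatingReader extends Reader {
    private final String text; // the full repeated text, built once up front
    private int pos = 0;       // index of the next character to hand out

    RepeatingReader(String phrase, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(phrase).append(' ');
        }
        this.text = sb.toString();
    }

    @Override
    public int read(char[] buf, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (pos >= text.length()) {
            return -1; // end of stream
        }
        int n = Math.min(len, text.length() - pos);
        text.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        // nothing to release
    }
}
```

Usage would be along the lines of new RepeatingReader("Here I am", 300), passed wherever the FileReader or StringReader goes today.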


Thank you for the reply and advice.

jim s



----- Original Message ----- From: "Andrzej Bialecki" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Friday, May 25, 2007 1:10 PM
Subject: Re: Indexing help needed


jim shirreffs wrote:

Thanks to all that try to help me out

Jim S

P.S. If I get it working I will be happy to email or post the code.

If you looked at the code in Nutch, you can take most of the parse-oo plugin verbatim, because all this plugin does is extract the text content and metadata from OO files.
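
To illustrate the idea of "extract the text content", here is a minimal sketch that uses only the JDK. It is not the Nutch parse-oo code. It assumes the OO file is a zip archive whose body lives in an entry named content.xml, with paragraphs in text:p and headings in text:h elements. The class name OoTextExtractor is hypothetical. Metadata (normally in a separate meta.xml entry) is not handled here.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: pulls plain text out of an OpenOffice document. */
final class OoTextExtractor {

    static String extractText(File file) throws Exception {
        final StringBuilder text = new StringBuilder();
        try (ZipFile zip = new ZipFile(file)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IOException("no content.xml in " + file);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        // Break lines at paragraph and heading boundaries only.
                        if ("text:p".equals(qName) || "text:h".equals(qName)) {
                            text.append('\n');
                        }
                    }

                    @Override
                    public InputSource resolveEntity(String publicId, String systemId) {
                        // Older files reference an external DTD; skip loading it.
                        return new InputSource(new StringReader(""));
                    }
                });
            }
        }
        return text.toString();
    }
}
```

The returned String could then be wrapped in a StringReader, or added directly as a tokenized field, when building the Lucene Document.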



--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


