On Apr 6, 2005, at 3:07 AM, <[EMAIL PROTECTED]> wrote:
"THAT is HOW CAN I INDEX and SEARCH .pdf, .ppt,. xml, .doc etc DOCUMENTS WITH LUCENE."
I WILL BE REALLY HANKFUL IF U SOLVE MY PROBLEM.

<commercial>

Get a copy of Lucene in Action. Otis wrote a great chapter on how to handle various document formats with Lucene:

        http://www.lucenebook.com/search?query=index+pdf

</commercial>

It sounds like you're using the Lucene demo application. That is a reasonable starting point, but you will very quickly want to diverge from it and integrate in a PDF reader (PDFBox being the best open source library) and something to read Office documents (TextMining for Word docs, perhaps POI for the other formats).

Also, have a look at Nutch - it may actually be the indexing/searching solution you want for your intranet rather than a custom Lucene application.

        Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to