You could also take a look at Solr. From http://lucene.apache.org/solr/features.html:

* Easy ways to pull in data from databases and XML files from local disk and HTTP sources
* Rich Document Parsing and Indexing (PDF, Word, HTML, etc) using Apache Tika

Sounds just what you need.

--
Ian.

On Wed, Feb 1, 2012 at 1:34 PM, KARTHIK SHIVAKUMAR <nskarthi...@gmail.com> wrote:
> Hi
>
> >>>lucene-3.0.3 can be used for searching a text from
>
> Lucene 's primary job is to do a text search.
>
> May it be PDF/HTML/XML/MSword/PPT/XLS
>
> U have to have the code for plugin to do 2 things
>
> 1) Strip text from either of the Documents (PDF/HTML/XML/MSword/PPT/XLS)
> 2) Index this processed text using Lucene
>
> The indexed process can be later used for Searching thru the required
> content.
>
> ;)
> with regards
> karthik
>
> On Wed, Feb 1, 2012 at 6:37 PM, Prasad KVSH <prasad.kokep...@ness.com> wrote:
>
>> Hi,
>>
>> lucene-3.0.3 can be used for searching a text from PDF, xlsx, docx, doc,
>> xls, msg, TXT files. For this we have any common function to accomplish
>> this. Please help me on this.
>>
>> Thanks
>>
>> Prasad
>
> --
> *N.S.KARTHIK
> R.M.S.COLONY
> BEHIND BANK OF INDIA
> R.M.V 2ND STAGE
> BANGALORE
> 560094*
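
For illustration, the two steps Karthik lists (strip the text out of each document, then index that text) can be sketched with the JDK alone. This is not the Lucene or Tika API: the class ExtractThenIndexSketch, the TextExtractor interface and the PLAIN_TEXT extractor are hypothetical names made up for this sketch, and the in-memory map only stands in for a real Lucene index. In practice the extract step would be handed to a parser such as Apache Tika and the index step to Lucene (or both to Solr).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for the extract-then-index pipeline; NOT the Lucene or Tika API.
class ExtractThenIndexSketch {

    // Step 1 (hypothetical interface): turn raw file bytes into plain text.
    // An extractor for PDF, Word, etc. would delegate to a real parser.
    interface TextExtractor {
        String extract(byte[] raw);
    }

    // Trivial extractor that only handles UTF-8 plain-text files.
    static final TextExtractor PLAIN_TEXT =
            raw -> new String(raw, StandardCharsets.UTF_8);

    // Step 2: in-memory inverted index, term -> sorted set of document names.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lower-case the text, split on anything that is not a letter or digit,
    // and record which document each term came from.
    void index(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Case-insensitive single-term lookup; returns document names in sorted order.
    List<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ExtractThenIndexSketch idx = new ExtractThenIndexSketch();
        byte[] raw = "Lucene does text search".getBytes(StandardCharsets.UTF_8);
        idx.index("notes.txt", PLAIN_TEXT.extract(raw)); // extract, then index
        System.out.println(idx.search("lucene"));
    }
}
```

The point of keeping extraction behind its own interface is that each file type (PDF, xlsx, docx, msg, ...) only needs a different extractor; the indexing and searching side never has to know where the text came from.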