Hi,

We have added all the files to the index, including PDF, Word, Excel and TXT files, but the search only finds matches in the text files. How do we strip text from the other document types (PDF/HTML/XML/MSword/PPT/XLS)?

Thanks,
Prasad K.V.S.H.
Project Manager, PACIFIC COAST STEEL (Pinnacle) Project
Ness Technologies
Road No 11, Banjara Hills, Hyderabad, India.
Tel: +91 40 66041401 | Mobile: +91 9247475840
prasad.kokep...@ness.com | www.ness.com

________________________________
From: KARTHIK SHIVAKUMAR [mailto:nskarthi...@gmail.com]
Sent: Wed 2/1/2012 7:04 PM
To: java-user@lucene.apache.org
Subject: Re: lucene-3.0.3

Hi

>> lucene-3.0.3 can be used for searching a text from

Lucene's primary job is to do a text search, whether the source is PDF/HTML/XML/MSword/PPT/XLS.

You have to have plugin code that does 2 things:

1) Strip text from either of the Documents (PDF/HTML/XML/MSword/PPT/XLS)
2) Index this processed text using Lucene

The indexed text can later be used for searching through the required content.

;)

with regards
karthik

On Wed, Feb 1, 2012 at 6:37 PM, Prasad KVSH <prasad.kokep...@ness.com> wrote:

> Hi,
>
> lucene-3.0.3 can be used for searching a text from PDF, xlsx, docx, doc,
> xls, msg, TXT files. Do we have any common function to accomplish
> this? Please help me on this.
>
> Thanks
>
> Prasad

--
N.S.KARTHIK
R.M.S.COLONY
BEHIND BANK OF INDIA
R.M.V 2ND STAGE
BANGALORE 560094
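
The two steps from the quoted reply (strip the text, then index it) can be sketched roughly as below. This is only a stdlib Java illustration, not Lucene code: the class `ExtractThenIndex`, the `TextExtractor` interface and the toy inverted index are all hypothetical names made up for this example. In a real setup, step 2 would presumably go through Lucene's `IndexWriter`, and the extractors for binary formats (PDF, DOC, XLS, PPT) would delegate to a parsing library such as Apache Tika, PDFBox or POI. The sketch also shows the likely reason only the text files are found: a file type without an extractor never contributes searchable text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: none of these names come from Lucene.
class ExtractThenIndex {

    // Step 1: turn a document's raw content into plain text.
    interface TextExtractor {
        String extract(String rawContent);
    }

    // File extension -> extractor for that format.
    private final Map<String, TextExtractor> extractors = new HashMap<>();
    // Term -> names of the documents containing it (a toy inverted index).
    private final Map<String, Set<String>> index = new HashMap<>();

    ExtractThenIndex() {
        // Plain text needs no processing.
        extractors.put("txt", raw -> raw);
        // Crude tag stripping for HTML/XML; a real parser handles far more.
        extractors.put("html", raw -> raw.replaceAll("<[^>]*>", " "));
        extractors.put("xml", extractors.get("html"));
        // PDF/DOC/XLS/PPT are binary formats: a library-backed
        // extractor would have to be registered here (assumption).
    }

    // Step 2: index the extracted text.
    // Returns false when no extractor is registered for the file type.
    boolean add(String fileName, String rawContent) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = extractors.get(ext);
        if (extractor == null) {
            return false;
        }
        String text = extractor.extract(rawContent);
        // Lowercase and split on anything that is not a letter or digit.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(fileName);
            }
        }
        return true;
    }

    // Look up a single term; results are sorted by file name.
    List<String> search(String term) {
        Set<String> hits = index.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ExtractThenIndex idx = new ExtractThenIndex();
        idx.add("notes.txt", "Steel order shipped");
        idx.add("page.html", "<html><body><p>Steel invoice</p></body></html>");
        boolean pdfIndexed = idx.add("report.pdf", "%PDF-1.4 ...binary...");
        System.out.println(idx.search("steel")); // prints [notes.txt, page.html]
        System.out.println(pdfIndexed);          // prints false
    }
}
```

Keeping extraction behind one small interface means each format can be handled by a different library while the indexing code stays the same.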