Re: Help on DOCX and XLSX

2012-03-07 Thread Ian Lea
earch text is being performed on indexing, then we are
> filtering the documents by reading document record from database table
> for the above key values.
>
> Thanks
> Prasad
>
> -----Original Message-----
> From: Ian Lea [mailto:ian@gmail.com]
> Sent: Wed

RE: Help on DOCX and XLSX

2012-03-07 Thread Prasad KVSH
Prasad

-----Original Message-----
From: Ian Lea [mailto:ian@gmail.com]
Sent: Wednesday, March 07, 2012 4:03 PM
To: java-user@lucene.apache.org
Subject: Re: Help on DOCX and XLSX

You'll have to find something that parses the formats you are interested in and extracts the text you

Re: Help on DOCX and XLSX

2012-03-07 Thread Ian Lea
You'll have to find something that parses the formats you are interested in and extracts the text you want. Apache Tika comes to mind. Why are you using such an old version of Lucene? Why aren't you using Solr? That might just work for you out of the box. See also http://www.lucidimagination.c
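
To make the "parse the format and extract the text" step concrete, here is a minimal sketch. It does not use Tika. It relies only on the JDK and on the fact that a .docx file is a ZIP archive whose main body is stored in word/document.xml. The class and method names (DocxTextSketch, extractText) are made up for illustration, and it assumes Java 9+ for readAllBytes. It ignores headers, footers, tables of fields and so on, so a real parser such as Tika is likely still the better choice in practice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Lucene or Tika.
final class DocxTextSketch {
    // WordprocessingML namespace used by elements such as w:p and w:t.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the body text of a .docx stream, one line per paragraph,
    // or an empty string if the archive has no word/document.xml entry.
    static String extractText(InputStream docx)
            throws IOException, ParserConfigurationException, SAXException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Read only the current entry; parsing from a copy keeps the
                // XML parser from closing the underlying zip stream.
                byte[] xml = zip.readAllBytes();

                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
                Document dom = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));

                StringBuilder text = new StringBuilder();
                NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
                for (int i = 0; i < paragraphs.getLength(); i++) {
                    if (i > 0) {
                        text.append('\n');
                    }
                    // A paragraph's text is split across w:t runs; join them
                    // without separators so words are not broken apart.
                    NodeList runs = ((Element) paragraphs.item(i))
                            .getElementsByTagNameNS(W_NS, "t");
                    for (int j = 0; j < runs.getLength(); j++) {
                        text.append(runs.item(j).getTextContent());
                    }
                }
                return text.toString();
            }
        }
        return "";
    }
}
```

The returned string is what you would then hand to your indexing code as the content of a text field. XLSX files are ZIP archives as well, but their cell text is spread over several XML parts, which is one more reason to let a dedicated parser do the work.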