data corruption in lucene index 2.3.2

2011-10-28 Thread Zhang, Lisheng
We are using Lucene 2.3.2 (yes, we should upgrade), and recently we got an exception when opening the index: ### java.io.IOException: read past EOF at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146) at org.apache.lucene.store.BufferedIndexInput.readByte(

Re: multiple phrase search for topic

2011-10-28 Thread Ian Lea
Seems to me your approach should work, although I'd worry about performance.
> A lot of top-ranked documents are not the best candidates for the "Software Technology" topic, even though they contain the phrases (not very frequent)
Surely the docs that contain the phrases are going to be top

multiple phrase search for topic

2011-10-28 Thread deb.lucene
Hi Group, I am indexing and searching a large corpus of news articles. The indexing process is very straightforward: I am using the StandardAnalyzer to analyze the content of each news document. ** document = new Document(); document.add(new Field("snum", snum, Field.

Re: Finding match term positions in the document

2011-10-28 Thread Anshum
Hi Vidya, Perhaps this could help you: http://hrycan.com/2009/10/25/lucene-highlighter-howto/
-- Anshum Gupta http://ai-cafe.blogspot.com
On Fri, Oct 28, 2011 at 2:18 PM, Vidya Kanigiluppai Sivasubramanian <vidya...@hcl.com> wrote:
> Hi,
> I am using lucene 2.4.1 in my project.
> I need to d

Finding match term positions in the document

2011-10-28 Thread Vidya Kanigiluppai Sivasubramanian
Hi, I am using Lucene 2.4.1 in my project. I need to display the results of a search for a particular term and, when an item on the results page is selected, display the document where the term was found, with the matching terms highlighted. For this I need to know the match