Re: Searching Problem

2006-10-26 Thread Sunil Kumar PK
Thanks, Mike, for the information. Actually, I am using RemoteParallelMultiSearcher with 10 Search Servers; my crawler program frequently adds new documents to all the Search Servers in a distributed manner. So in this case, if I add a document to a particular index, I need to restart the searche

Re: Search Problem

2006-10-26 Thread Sunil Kumar PK
Thanks, Erick, for the information. Actually, I am using RemoteParallelMultiSearcher with 10 Search Servers; my crawler program frequently adds new documents to all the Search Servers in a distributed manner. So in this case, if I add a document to a particular index, I need to restart the search

Indexing slows down considerably after a few million documents

2006-10-26 Thread Mekin Maheshwari
I am creating an index of about 7 million documents. The total size of the index is about 2.7G once indexing is done. For the first 3 million documents, the indexer takes about 3 hours (can I get better than this?) - 4 seconds per thousand documents. After this it slows down terribly and takes abou

Re: java-user Digest 26 Oct 2006 13:22:18 -0000 Issue 474

2006-10-26 Thread Paul Waite
Doron Cohen <[EMAIL PROTECTED]> wrote: > Perhaps another comment on the same line - I think you would be able to > get more from your system by bounding the number of open searchers to 2: Yep, this is exactly what I've done. >  - old, serving 'old' queries, would be soon closed; >  - new, being

Re: Analyzer.getPositionIncrementGap question

2006-10-26 Thread Chris Hostetter
: getPositionIncrementGap, but there appears to be no way to set a non : constant gap. Say, the gap between "value1" and "value2" is 10, but the gap : between "value2" and "value3" is 100. By default, none of the analyzers do anything special in getPositionIncrementGap -- that's up to you to control in any s

RE: Possible memory issue?

2006-10-26 Thread Aigner, Thomas
Thanks for the advice drj. I do close the searcher and set it to null before instantiating another searcher. I believe that I am closing the reader and writer at the correct times as well... -Original Message- From: d rj [mailto:[EMAIL PROTECTED] Sent: Thursday, October 26, 2006 11:40 A

Re: Analyzer.getPositionIncrementGap question

2006-10-26 Thread Erick Erickson
OK, how about injecting a special token in your input stream, then having your analyzer record that token and set the position increment of the next "real" token? Something like d.add("field", "veryspecialtokenincrementnext100 value1"...); d.add("field", "veryspecialtokenincrementnext10 value2"..

Re: obtaining the number of documents stored in a .cfs file

2006-10-26 Thread Volodymyr Bychkoviak
One mistake in this code: it should be infos.counter = ++counter; instead of infos.counter = counter++; Volodymyr Bychkoviak wrote: I've used the following code to recover the index. Note: it only works with .cfs files. String path = // path to index File file = new File(path); Directory d
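
The difference between the two assignments in that correction can be seen in isolation. The following is only a minimal sketch: `Infos` here is a hypothetical stand-in class, not the Lucene class the original recovery code manipulates, and the method names are made up for illustration.

```java
final class IncrementDemo {
    // Hypothetical stand-in for the object whose counter field is being set.
    static final class Infos {
        int counter;
    }

    // Postfix form: the assignment stores the value counter had BEFORE the increment.
    static int assignPostfix(int counter) {
        Infos infos = new Infos();
        infos.counter = counter++;
        return infos.counter;
    }

    // Prefix form: counter is incremented first, and the new value is stored.
    static int assignPrefix(int counter) {
        Infos infos = new Infos();
        infos.counter = ++counter;
        return infos.counter;
    }

    public static void main(String[] args) {
        System.out.println("postfix: " + assignPostfix(5));
        System.out.println("prefix:  " + assignPrefix(5));
    }
}
```

With the postfix form the stored value lags one behind the number of items counted, which is presumably why the correction matters for the recovered index.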

RE: Poor performance "race condition" in FieldSortedHitQueue

2006-10-26 Thread vasu shah
Thanks Oliver. It works. Thanks, -Kalpesh Oliver Hutchison <[EMAIL PROTECTED]> wrote: Kalpesh, Are you using sorting? If you are, then the patch attached to LUCENE-651 may help. It fixes a race condition that exists in the initialization of the FieldCache (which is used to accelerate

Re: Analyzer.getPositionIncrementGap question

2006-10-26 Thread qaz zaq
Thanks Erick. Since value1, value2, and value3 can themselves include multiple tokens, I am not sure the token-based position increment, d.add(new Field("value1 value2 value3"), will actually work. I was trying to use getPositionIncrementGap, but there appears to be no way to set a non constant

Re: Search Problem

2006-10-26 Thread Erick Erickson
Yes, but you must close and re-open your SEARCHER. There are various schemes for doing this based upon how expensive it is to open a new searcher and how often you need to do it, but it's not built into Lucene AFAIK. It all depends upon how quickly you have to see the results of your update. Also
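
One possible shape for such a close-and-reopen scheme is sketched below. It deliberately avoids the Lucene API: `SearcherHolder` and `FakeSearcher` are hypothetical names, and anything `Closeable` stands in for a real searcher. It also closes the old searcher immediately, which is an assumption that only holds if no query is still running against it; a real scheme would have to wait for in-flight queries first.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder: callers always ask it for the current searcher.
final class SearcherHolder<S extends Closeable> {
    private final AtomicReference<S> current = new AtomicReference<>();

    SearcherHolder(S initial) {
        current.set(initial);
    }

    S get() {
        return current.get();
    }

    // Install a freshly opened searcher and close the one it replaces.
    void swap(S fresh) {
        S old = current.getAndSet(fresh);
        if (old != null) {
            try {
                old.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

// Stand-in for a real searcher, used only to make the sketch runnable.
final class FakeSearcher implements Closeable {
    final String name;
    boolean closed;

    FakeSearcher(String name) {
        this.name = name;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

The indexing side would call `swap` after an update has been committed, so how often it is called decides how quickly changes become visible to searches.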

Re: Analyzer.getPositionIncrementGap question

2006-10-26 Thread Erick Erickson
See the SynonymFilter in LIA for how to create your very own analyzer that gives you total control over the increment between terms. Essentially, that allows you to set the position increment for each and every token. I suspect that this would be easier, but what do I know? The difference is that
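
To see what per-token control over the increment buys, here is a small sketch in plain Java that does not use the Lucene API. `PositionGaps` is a hypothetical helper, and it assumes that a token's position is the previous position plus its increment, with the first token landing at position 0 when its increment is 1.

```java
final class PositionGaps {
    // Turn per-token position increments into absolute token positions.
    static int[] positions(int[] increments) {
        int[] result = new int[increments.length];
        int position = -1; // assumed start, so a first increment of 1 gives position 0
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            result[i] = position;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two tokens for the first value, one for the second, two for the third.
        // The first token of the second value uses increment 10, the first token
        // of the third value uses increment 100, and all other tokens use 1.
        int[] increments = {1, 1, 10, 100, 1};
        for (int p : positions(increments)) {
            System.out.println(p);
        }
    }
}
```

Because the increment is chosen per token, the distance between values can differ from one boundary to the next even when each value contains several tokens.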

Re: Analyzer.getPositionIncrementGap question

2006-10-26 Thread qaz zaq
I am reposting this question. Could somebody help? qaz zaq <[EMAIL PROTECTED]> wrote: I have multiple values that I want to add to the same FIELD, and I also want to add a non-zero but NON CONSTANT position increment gap between those values, e.g., the gap between "value1" and "value2" is 10, but the gap between

Re: Possible memory issue?

2006-10-26 Thread d rj
I would suggest checking that close() is properly called for all IndexSearcher/Reader/Writer objects when doing adds/deletes and when recreating IndexSearcher object. Free memory in the JVM heap can diminish quickly if these objects aren't properly disposed. -drj On 10/26/06, Aigner, Thomas < [EM

Putting some constraints on index optimization

2006-10-26 Thread Stanislav Jordanov
I have the following problem with (explicitly invoked) index optimization - it seems to always merge all existing index segments into a single huge segment, which is undesirable in my case. Is there a way to force index optimization to honor the IndexWriter.MAX_MERGE_DOCS setting? Stanislav --

Possible memory issue?

2006-10-26 Thread Aigner, Thomas
Howdy all, I have an issue with Java running out of memory after the search has been running for a while. We are using the 1.9.1 release and I check the IndexReader's version to determine if I need to get a new searcher for searches (so I pick up any changes to the index). I am seeing jumps i

Re: Lucene 2.0.1 release date

2006-10-26 Thread Steven Rowe
George Aroush wrote: > From your email, I take it that even for the Java folks, they can't > accumulate the list of files that make up 2.0.1. Am I right? There has never been and likely will never be a 2.0.1 release. "2.0.1", "2.1" -- these are labels for *potential* future releases. "2.1" is m

Possible documentation error?

2006-10-26 Thread Johan Stuyts
Hi, On the page about the file formats I think there might be a documentation error below 'frequencies'. The example is '15, 22, 3', but if I read the paragraph starting with 'DocDelta determines both the document number and the frequency' correctly this example translates to: Doc ID Freq.
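
The decoding rule in question can be checked mechanically. The sketch below assumes the rule as I understand it from the file-format description: DocDelta divided by two is the gap to the previous document number, an odd DocDelta means a frequency of one, and an even DocDelta means the frequency follows as the next value. `DocDeltaDecoder` is a hypothetical helper that works on already-decoded integers rather than raw VInt bytes.

```java
import java.util.ArrayList;
import java.util.List;

final class DocDeltaDecoder {
    // Decode a DocDelta/Freq sequence into {docId, freq} pairs.
    // Assumes well-formed input: an even DocDelta must be followed by a frequency.
    static List<int[]> decode(int[] values) {
        List<int[]> postings = new ArrayList<>();
        int doc = 0;
        int i = 0;
        while (i < values.length) {
            int docDelta = values[i++];
            doc += docDelta >>> 1; // gap to the previous document number
            int freq;
            if ((docDelta & 1) == 1) {
                freq = 1; // odd: frequency is implicitly one
            } else {
                freq = values[i++]; // even: frequency is stored next
            }
            postings.add(new int[] {doc, freq});
        }
        return postings;
    }

    public static void main(String[] args) {
        for (int[] p : decode(new int[] {15, 22, 3})) {
            System.out.println("doc " + p[0] + ", freq " + p[1]);
        }
    }
}
```

Under that assumed rule, the sequence 15, 22, 3 decodes to document 7 with frequency 1 and document 18 with frequency 3, while a sequence of 15, 8, 3 (my own illustration, not taken from the message) decodes to documents 7 and 11.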

RE: Lucene 2.0.1 release date

2006-10-26 Thread George Aroush
Hi Otis and all, I am nearing a point where I will be able to port in real-time -- and it's something I want to achieve. However, before doing so, my hope was to first sync up Lucene.Net with 2.0.1. From your email, I take it that even for the Java folks, they can't accumulate the list of fil

Re: Searching Problem

2006-10-26 Thread Michael McCandless
Sunil Kumar PK wrote: could you please explain? On 10/26/06, Karel Tejnora <[EMAIL PROTECTED]> wrote: Nope. IndexReader obtains a snapshot of index - not closing and opening indexreader leads to not deleting files (windows exception, linux will not free them). > Is it possible to get all the ma

Re: Searching Problem

2006-10-26 Thread Sunil Kumar PK
could you please explain? On 10/26/06, Karel Tejnora <[EMAIL PROTECTED]> wrote: Nope. IndexReader obtains a snapshot of index - not closing and opening indexreader leads to not deleting files (windows exception, linux will not free them). > Is it possible to get all the matching document in the

Re: Searching Problem

2006-10-26 Thread Karel Tejnora
Nope. IndexReader obtains a snapshot of the index; not closing and reopening the IndexReader means files are not deleted (an exception on Windows; Linux will not free them). Is it possible to get all the matching documents in the result without restarting the Searcher program?

Searching Problem

2006-10-26 Thread Sunil Kumar PK
Hi, I have a program to create a Lucene index, and another program for searching that index. The Search program creates an IndexSearcher object once in its constructor, and I created a method doSearch to search the index. The doSearch method uses the indexSearcher object to get the Hits. My Inde

Search Problem

2006-10-26 Thread Sunil Kumar PK
Hi, I have a program to create a Lucene index, and another program for searching that index. The Search program creates an IndexSearcher object once in its constructor, and I created a method doSearch to search the index. The doSearch method uses the indexSearcher object to get the Hits. My Inde

Re: index short text

2006-10-26 Thread zhongyi yuan
Thank you. In my opinion, the current Similarity is not suitable, because most term frequencies in the address content are one, but in Lucene the frequency is a very important factor. So I would like to know of some other method for short information. - To