RE: An interesting thing

2006-06-11 Thread Flik Shen
I understand why buffered indexing seems to run faster. It seems that the initialization takes noticeable time and impacts the indexing performance. I found RAM indexing is faster if I run buffered indexing prior to RAM indexing. So I think the method "addDocuments" will take more time at first

RE: Asserting that a value must match the entire content of a field

2006-06-11 Thread Shivani Sawhney
Are you saying that there is no out-of-the-box way of doing this...? Can I not check the content length somehow? As in, mention that the length of the value (to be matched) must match the length of the field value... -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent

RE: IndexWriter.addIndexes & optimizatio

2006-06-11 Thread Flik Shen
It means that picking high values for both maxBufferedDocs and mergeFactor will improve your indexing performance. But if they are too high, you will get an OutOfMemoryError. Setting mergeFactor too high will also lead to the "too many open files" problem. So you should pick proper values a

Re: Asserting that a value must match the entire content of a field

2006-06-11 Thread Otis Gospodnetic
One (ugly) way might be to insert artificial begin/end markers at both index and search time. Otis - Original Message From: Shivani Sawhney <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, June 12, 2006 12:30:57 AM Subject: Asserting that a value must match the entire c
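
A minimal sketch of the marker idea above, in plain Java rather than Lucene code. The marker tokens, the whitespace tokenizer, the class and method names, and the "Product Lifecycle Management" sample value are all assumptions made up for illustration; a real setup would add the markers to the field text before analysis and to the phrase query at search time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java illustration only; no Lucene classes are used here.
class BoundaryMarkerSketch {
    // Hypothetical sentinel tokens; they must never occur in real field text.
    static final String BEGIN = "xxbeginxx";
    static final String END = "xxendxx";

    // Lowercase and split on whitespace, a stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Surround the tokens with the markers (applied at index AND search time).
    static List<String> withMarkers(String text) {
        List<String> tokens = new ArrayList<String>();
        tokens.add(BEGIN);
        tokens.addAll(tokenize(text));
        tokens.add(END);
        return tokens;
    }

    // Phrase semantics: the query tokens must appear contiguously, in order.
    static boolean phraseMatches(List<String> fieldTokens, List<String> queryTokens) {
        return Collections.indexOfSubList(fieldTokens, queryTokens) >= 0;
    }

    // With markers on both sides, a phrase match implies a whole-field match.
    static boolean matchesWholeField(String fieldValue, String queryValue) {
        return phraseMatches(withMarkers(fieldValue), withMarkers(queryValue));
    }

    public static void main(String[] args) {
        String query = "Product Lifecycle";
        // A plain phrase match also hits the longer field value.
        System.out.println(phraseMatches(tokenize("Product Lifecycle Management"), tokenize(query)));
        // With markers, only the field that is exactly the query value matches.
        System.out.println(matchesWholeField("Product Lifecycle", query));
        System.out.println(matchesWholeField("Product Lifecycle Management", query));
    }
}
```

The cost of this approach is that every query against the field has to know about the markers, which is presumably why it is called ugly.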

Re: IndexWriter.addIndexes & optimizatio

2006-06-11 Thread vipin sharma
- > Just set your maxBufferedDocs to as high a number as your RAM/heap will let you, and pick a mergeFactor that is high, but doesn't get you in trouble with open files. Can you please explain this in brief? Regards and thanks, On 6/9/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: When wri

Asserting that a value must match the entire content of a field

2006-06-11 Thread Shivani Sawhney
Hi, I have a small query here. How do I do an exact match on the value of a field and also assert that the value must match the entire content of that field? For example, I want only the documents with 'Product Lifecycle' as the value of a given field to be selected, and even documents wit

RE: An interesting thing

2006-06-11 Thread Flik Shen
1. I use buffered indexing and RAM indexing to index the same 3000 documents. So I think they have the same total size. 2. I store them in a RAM directory first, whether using RAM or buffered indexing. I think they should have the same performance. Then I take a further step to hold the index in a file system directory. So why do

Re: An interesting thing

2006-06-11 Thread yueyu lin
1. The buffered index uses RAM. The buffers are small, small enough for the OS to easily allocate several (or only one) pages to store them. 2. RAMDirectory has to request huge blocks of RAM from the OS. Sometimes the OS cannot allocate that much RAM efficiently. So some of the pages are moved to disk and a ram

RE: An interesting thing

2006-06-11 Thread Flik Shen
One thing has not been explained clearly: why "RAM" ALWAYS takes more time than buffered indexing. On the other hand, buffered indexing uses "RAM" as a buffer. Is there some difference between these two kinds of "RAM"? To use "RAM" as a buffer I take an additional step to convert the buffered index to F

Re: Re[2]: Fwd: Lucene 2.0.0 release available

2006-06-11 Thread Martin Cooper
I've deployed 1.9.1 and 2.0.0 to the ASF Maven 2 repo and requested a sync with ibiblio, so hopefully they'll be available soon. -- Martin Cooper On 6/10/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: It's really just a matter of putting the Jars in the appropriate directory on the appropria

Re: An interesting thing

2006-06-11 Thread gekkokid
In Windows XP can't you change the registry to use only physical RAM? - Original Message - From: "yueyu lin" <[EMAIL PROTECTED]> To: Sent: Sunday, June 11, 2006 12:31 PM Subject: Re: An interesting thing In some OS, the ram is not only "RAM". The virtual ram uses the disk. That's ve

Re: sitegeist

2006-06-11 Thread karl wettin
On Sat, 2006-05-27 at 01:17 +0200, karl wettin wrote: > Will report back with results in a month or so. Here is a report on my very simple sitegeist: I have about 200,000 documents in my corpus. All search results are passed on to the SiteGeist class that contains a Map, where the double represe

Re: An interesting thing

2006-06-11 Thread yueyu lin
In some OSes, the RAM is not only "RAM". The virtual RAM uses the disk, which is very slow. On some Windows platforms, you will find that half of an application's RAM is virtual RAM. That's partly why Windows is slow in some areas. On 6/11/06, Flik Shen <[EMAIL PROTECTED]> wrote: Hi, I am freshman t

An interesting thing

2006-06-11 Thread Flik Shen
Hi, I am a freshman to Lucene and I am reading the book “Lucene In Action”. As we know, there are two kinds of directory to hold an index: one is File System and the other is RAM. There is a sample to compare the performance of these two kinds of directories and there is also a piece of

Re: Numbertools and efficient sorting

2006-06-11 Thread Chris Hostetter
: > : I want to use INT sorting instead, but these strings can not be parsed : > : back into integers by Java's built in parsing functions, which is : > : > 1) Take a look at FieldCache.IntParser and : > FieldCache.getInts(IndexReader,String,IntParser) .. you can use it in your : > own custom Sort