Re: OutOfMemoryError on small search in large, simple index

2008-01-01 Thread Chris Hostetter
: On Wed, 2007-12-12 at 11:37 +0100, Lars Clausen wrote: : Seems there's a reason we still use all this memory: : SegmentReader.fakeNorms() creates the full-size array for us anyway, so : the memory usage cannot be avoided as long as somebody asks for the : norms array at any point. The solution

Re: Question about Boolean Operators

2008-01-01 Thread Chris Hostetter
: What's the difference between the "AND" and the "+" boolean operators. Also : the "NOT" and "-" boolean operators? Are each of these pair of operators : functionally equivalent? : : From the examples provided in the Query Syntax documentation at: : http://lucene.apache.org/java/docs/queryparse

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Chris Hostetter
: I suggest to use reader.directory() instead of reader as key for the : WeakHashMap. This way multiple IndexSearcher/IndexReader instances would : share the cache. setting aside discussion of why you should/shouldn't use a single IndexReader, or why the various places in the Lucene code base
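
The trade-off in the suggestion quoted above can be sketched with plain JDK classes. This is a minimal illustration, not Lucene code: `Dir`, `Reader`, `FilterCacheSketch` and `computationsFor` are hypothetical stand-ins, and the "filter" just sets one bit. It only shows how the choice of WeakHashMap key changes how often the cached bits are rebuilt when two readers share one directory.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

final class FilterCacheSketch {
    /** Hypothetical stand-in for an index directory (not Lucene's Directory). */
    static final class Dir {}

    /** Hypothetical stand-in for a reader opened on a directory. */
    static final class Reader {
        final Dir dir;
        Reader(Dir dir) { this.dir = dir; }
    }

    // Weak keys: an entry can be dropped once its key is no longer referenced.
    private final Map<Object, BitSet> cache = new WeakHashMap<>();
    private final Function<Reader, Object> keyOf;
    int computations = 0;

    FilterCacheSketch(Function<Reader, Object> keyOf) { this.keyOf = keyOf; }

    /** Returns the cached bits for the reader's key, computing them on a miss. */
    BitSet bits(Reader reader) {
        synchronized (cache) {
            return cache.computeIfAbsent(keyOf.apply(reader), k -> {
                computations++;          // counts the expensive rebuilds
                BitSet b = new BitSet();
                b.set(0);                // placeholder for real filter work
                return b;
            });
        }
    }

    /** Two readers on one directory, three lookups; returns the rebuild count. */
    static int computationsFor(Function<Reader, Object> keyOf) {
        Dir dir = new Dir();
        Reader r1 = new Reader(dir);
        Reader r2 = new Reader(dir);
        FilterCacheSketch c = new FilterCacheSketch(keyOf);
        c.bits(r1);
        c.bits(r2);
        c.bits(r1);
        return c.computations;
    }

    public static void main(String[] args) {
        System.out.println("keyed per reader:    " + computationsFor(r -> r));
        System.out.println("keyed per directory: " + computationsFor(r -> r.dir));
    }
}
```

One design point to weigh: a reader is a point-in-time view of the index, so two readers on the same directory may not see the same documents. A directory-keyed cache would then hand one reader bits that were computed for another.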

Re: Question about Boolean Operators

2008-01-01 Thread Chris
Dear CowBoyX, if you build your Lucene queries in code, you can use BooleanQuery to express the logic yourself. For example: BooleanQuery query = new BooleanQuery(); query.add(new TermQuery() , BooleanClause.Occur.MUST); // for + or query.add(
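
The occurrence flags mentioned in the reply above can be modelled without Lucene. This is a toy sketch under stated assumptions: `BooleanClauseSketch` and its `matches` method are hypothetical, a document is just a set of terms, and only the enum names mirror `BooleanClause.Occur`. The rule it encodes is that every MUST term has to be present, no MUST_NOT term may be present, and a query with no MUST clause needs at least one SHOULD term to match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BooleanClauseSketch {
    /** Names mirror BooleanClause.Occur; everything else here is hypothetical. */
    enum Occur { MUST, SHOULD, MUST_NOT }

    private final List<String> terms = new ArrayList<>();
    private final List<Occur> occurs = new ArrayList<>();

    BooleanClauseSketch add(String term, Occur occur) {
        terms.add(term);
        occurs.add(occur);
        return this;
    }

    /** True if a document containing exactly these terms satisfies all clauses. */
    boolean matches(String... docTerms) {
        Set<String> doc = new HashSet<>(Arrays.asList(docTerms));
        boolean hasMust = false;
        boolean anyShould = false;
        for (int i = 0; i < terms.size(); i++) {
            boolean present = doc.contains(terms.get(i));
            switch (occurs.get(i)) {
                case MUST:
                    hasMust = true;
                    if (!present) return false;   // required term missing
                    break;
                case MUST_NOT:
                    if (present) return false;    // prohibited term found
                    break;
                case SHOULD:
                    anyShould |= present;         // optional term
                    break;
            }
        }
        // With no MUST clause, at least one SHOULD clause has to match.
        return hasMust || anyShould;
    }

    public static void main(String[] args) {
        BooleanClauseSketch both = new BooleanClauseSketch()
                .add("a", Occur.MUST).add("b", Occur.MUST);      // like: a AND b
        BooleanClauseSketch plus = new BooleanClauseSketch()
                .add("a", Occur.MUST).add("b", Occur.SHOULD);    // like: +a b
        System.out.println(both.matches("a"));   // b is required here
        System.out.println(plus.matches("a"));   // b is optional here
    }
}
```

Read this way, "a AND b" marks both terms as required, while "+a b" requires only a and leaves b optional. "NOT b" and "-b" both correspond to MUST_NOT.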

Question about Boolean Operators

2008-01-01 Thread CowboyX
Hi, I've searched to see if anyone has asked this before, and I couldn't find anything. Apologies if this is a stupid question. What's the difference between the "AND" and the "+" boolean operators. Also the "NOT" and "-" boolean operators? Are each of these pair of operators functionally equivalen

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Grant Ingersoll
On Jan 1, 2008, at 4:40 PM, Timo Nentwig wrote: On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: I believe that, in general, you'll find that ParallelMultiSearcher is You believe or you know? And if you know why is there a ParallelMultiSearcher at all? :) And I still wonder why eve

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Erick Erickson
On Jan 1, 2008 4:40 PM, Timo Nentwig wrote: > On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: > > I believe that, in general, you'll find that ParallelMultiSearcher is > > You believe or you know? And if you know why is there a > ParallelMultiSearcher > at all? :) > > An

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
The reason there is a ParallelMultiSearcher is because of the reasons given: if you are distributing your index across machines or hard drives, doing things in parallel is faster. I don't think RAID counts. RAID will do the parallelism for you with a single index. I say I believe because its h

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: > I believe that, in general, you'll find that ParallelMultiSearcher is You believe or you know? And if you know why is there a ParallelMultiSearcher at all? :) And I still wonder why everybody believes and finds out on his own why isn't th

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
I believe that, in general, you'll find that ParallelMultiSearcher is much slower than just using a MultiSearcher. ParallelMultiSearcher is of use when you can put the different indexes on separate hard drives or even better, separate systems (using RMI or something). - Mark Timo Nentwig wrote

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 21:06:06 Mark Miller wrote: > The main reason to use a single IndexReader is because it's very time > consuming to open an IndexReader. If your index is pretty static, maybe Yes, it takes quite some time to build it and it's not changed but rebuilt from scratch. > Perha

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
The main reason to use a single IndexReader is because it's very time consuming to open an IndexReader. If your index is pretty static, maybe this is not much of a concern. Otherwise it's a major concern. But let's say it's not...then we have to assume you're going to have a huge index (otherwise the

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 19:26:51 Grant Ingersoll wrote: > My guess would be b/c best practice is usually to only have one Reader/ > Searcher per Directory, but I don't know if that is the real reason. > Most discussions/testing I have seen indicate a single Reader/Searcher > performs best. Well

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 19:38:48 Shailendra Sharma wrote: > > Is there a particular reason why CachingWrapperFilter caches per > > IndexReader > > and not per IndexReader.directory()? If there are multiple > > IndexSearcher/IndexReader instances (and only one Directory) cache will > > be built

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Shailendra Sharma
> > Is there a particular reason why CachingWrapperFilter caches per > IndexReader > and not per IndexReader.directory()? If there are multiple > IndexSearcher/IndexReader instances (and only one Directory) cache will be > built and held in memory redundantly. I don't see any sense in doing so >

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Grant Ingersoll
My guess would be b/c best practice is usually to only have one Reader/ Searcher per Directory, but I don't know if that is the real reason. Most discussions/testing I have seen indicate a single Reader/Searcher performs best. -Grant On Jan 1, 2008, at 11:57 AM, Timo Nentwig wrote: Hi!

Re: Prioiritze new documents

2008-01-01 Thread Shailendra Sharma
> > I got a large index and when searching for a term I want the newer > documents be at the beginning of the result set. I don't need a real order > by time but Lucene should prioritize the newer documents. > I got the time of the document creation as an index-field but it takes > very long if I would

CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
Hi! Is there a particular reason why CachingWrapperFilter caches per IndexReader and not per IndexReader.directory()? If there are multiple IndexSearcher/IndexReader instances (and only one Directory) cache will be built and held in memory redundantly. I don't see any sense in doing so (?).