Filters, what's going on under the hood?

2009-04-06 Thread Lebiram
Hi All, I am thinking of adding search filters to my application, thinking that they would be more efficient. Can anyone explain what Lucene does with search filters? That is, what generally happens when calling search()?
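In broad strokes (Lucene 2.4-era API): the searcher asks the Filter for a BitSet over the reader's doc ids and only scores documents whose bit is set, so the query itself does less work. A minimal sketch, assuming a made-up index path and field names:

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.*;

    public class FilteredSearchSketch {
        public static void main(String[] args) throws Exception {
            IndexReader reader = IndexReader.open("/path/to/index"); // hypothetical path
            IndexSearcher searcher = new IndexSearcher(reader);

            // In 2.4 a Filter boils down to Filter.bits(reader) returning a
            // BitSet of allowed doc ids; documents outside the set are never
            // scored. CachingWrapperFilter keeps that BitSet around so repeated
            // searches don't recompute it -- the caching is where most of the
            // speedup lives.
            Filter typeFilter = new CachingWrapperFilter(
                new QueryWrapperFilter(new TermQuery(new Term("type", "email"))));

            TopDocs top = searcher.search(
                new TermQuery(new Term("content", "lucene")), typeFilter, 10);
            System.out.println("hits: " + top.totalHits);

            searcher.close();
            reader.close();
        }
    }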

Re: Search using MultiSearcher generates OOM on a 1GB total of partitioned indices

2009-04-02 Thread Lebiram
You'd have to grab it from Solr's codebase until 2.9 comes out, though. No clause limit, and reportedly *much* faster on large indexes. -- - Mark http://www.lucidimagination.com Lebiram wrote: > Hi Erick > > The query was test data, basically in anticipation of searches on…

Re: Search using MultiSearcher generates OOM on a 1GB total of partitioned indices

2009-04-02 Thread Lebiram
…ConstantScoreQuery) That'll chew up about 1.5M each of memory, far less than you're consuming presently, and will be blazingly fast. If you're not limited to single characters, *still* consider filters. They'll consume little memory and are quite speedy to construct. Best Erick On…
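For context: each wildcard clause like content:g* rewrites into a BooleanQuery containing every matching term in the index, which is where the memory (and the clause limit) goes. A minimal sketch of the filter-backed alternative being suggested here, using the 2.4-era PrefixFilter and ConstantScoreQuery (field name and prefixes taken from the query in the original post):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.*;

    public class PrefixFilterSketch {
        public static Query buildQuery(String[] prefixes) {
            BooleanQuery q = new BooleanQuery();
            for (int i = 0; i < prefixes.length; i++) {
                // PrefixFilter matches documents by walking the term index
                // directly, without ever expanding into one clause per term,
                // so there is no TooManyClauses limit and far less transient
                // memory than a rewritten WildcardQuery.
                Filter f = new PrefixFilter(new Term("content", prefixes[i]));
                q.add(new ConstantScoreQuery(f), BooleanClause.Occur.MUST);
            }
            return q;
        }

        public static void main(String[] args) {
            Query q = buildQuery(new String[] {"g", "h", "d", "s", "a"});
            System.out.println(q);
        }
    }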

Re: Search using MultiSearcher generates OOM on a 1GB total of partitioned indices

2009-04-02 Thread Lebiram
…results at all, so is there something else going on? Best er...@morequestionsthananswers. On Wed, Apr 1, 2009 at 1:31 PM, Lebiram wrote: > Hi All, > > I have the following query on a 1GB index with about 12 million docs: > As you can see the terms consist of wildcards…

Search using MultiSearcher generates OOM on a 1GB total of partitioned indices

2009-04-01 Thread Lebiram
Hi All, I have the following query on a 1GB index with about 12 million docs: As you can see, the terms consist of wildcards... query.toString()=+(+content:g* +content:h* +content:d* +content:s* +content:a* +content:w* +content:b* +content:c* +content:m* +content:e*) +((+sender:cpuser9 +viewer…

Re: Minimum HD usage during an optimize() call

2009-03-30 Thread Lebiram
…docs. Mike On Mon, Mar 30, 2009 at 1:09 PM, Lebiram wrote: > Hi all, > > I was trying to determine if the documentation for optimize() is true: > > http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/IndexWriter.html#optimize() > > Testing was done using Lucene…

Minimum HD usage during an optimize() call

2009-03-30 Thread Lebiram
Hi all, I was trying to determine if the documentation for optimize() is true: http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/IndexWriter.html#optimize() Testing was done using Lucene 2.4. I basically have 2 Lucene indexes: Index A) one with no Searcher open during optimize…
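A sketch of the comparison being described, under the 2.4-era API (the paths and analyzer choice are assumptions). The point of interest is that files backing segments an open IndexSearcher is still reading cannot be deleted until that searcher closes, so transient disk usage during optimize() is higher in case B:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.search.IndexSearcher;

    public class OptimizeDiskSketch {
        public static void main(String[] args) throws Exception {
            // Index A: no reader open. Old segment files can be reclaimed as
            // soon as the optimized segment is committed.
            IndexWriter a = new IndexWriter("/idx/a", new StandardAnalyzer(),
                    false, IndexWriter.MaxFieldLength.UNLIMITED);
            a.optimize();
            a.close();

            // Index B: a Searcher is held open across the optimize, pinning
            // the pre-optimize files on disk until it is closed.
            IndexSearcher s = new IndexSearcher("/idx/b");
            IndexWriter b = new IndexWriter("/idx/b", new StandardAnalyzer(),
                    false, IndexWriter.MaxFieldLength.UNLIMITED);
            b.optimize();
            b.close();
            s.close(); // only now can the old files actually be deleted
        }
    }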

Re: TermQuery search returns the same Document several times

2009-02-05 Thread Lebiram
…several times I don't understand your question. From the API docs for HitCollector.collect: <<>> Can you ask your question another way? Because the only answer I can come up with is "HitCollector.collect only sees each document once by definition". Best Erick On Thu,…

TermQuery search returns the same Document several times

2009-02-05 Thread Lebiram
Hi All, Is it possible to somehow ensure that a document will be returned only once when collecting from HitCollector?
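Within a single search over one IndexSearcher, collect() is invoked at most once per doc id, so repeats usually mean the collector is being reused across several search() calls, or fed doc ids from different sub-indexes. A minimal defensive sketch against the 2.4 HitCollector API:

    import java.util.BitSet;
    import org.apache.lucene.search.HitCollector;

    public class UniqueDocCollector extends HitCollector {
        private final BitSet seen = new BitSet();

        public void collect(int doc, float score) {
            if (seen.get(doc)) {
                return; // already handled this doc id in this search
            }
            seen.set(doc);
            // ... process (doc, score) here ...
        }
    }

    // Caveat: doc ids are only meaningful per reader. The same int coming
    // from two different sub-indexes of a MultiSearcher names two different
    // documents, so a BitSet like this must not be shared across searchers.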

Re: Extract the text that was indexed

2009-01-02 Thread Lebiram
Hi Hoss, Before posting this question, I did try the FieldNormModifier approach. It did modify it: from one big segment, it added 7 more small segments per field. However, upon testing this index, the norms problem still occurs with the same stack trace error. This leads me to believe that FieldNormModifier…

Re: Extract the text that was indexed

2008-12-30 Thread Lebiram
…get back all terms for a field for a given document, you won't be able to reconstruct the original word sequence. And remember that not all words are indexed. Alex 2008/12/30 Lebiram > Hi All,…

Extract the text that was indexed

2008-12-30 Thread Lebiram
Hi All, Is it possible to extract the text that was indexed but not stored for a field in a document? Right now, reader.document() returns only fields that were stored. However, I'd also want to get the text of the indexed-only field... I'd appreciate your help.
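The usual workaround (it is what Luke's "Reconstruct" feature does) is to walk every term of the field and record the positions at which it occurs in the target document. A hedged sketch against the 2.4 reader API; note it is slow -- proportional to all terms in the field -- and recovers analyzed tokens, not the original text:

    import java.util.TreeMap;
    import org.apache.lucene.index.*;

    public class ReconstructFieldSketch {
        // Returns a map of position -> term text for one document's field.
        public static TreeMap reconstruct(IndexReader reader, int docId,
                                          String field) throws Exception {
            TreeMap tokens = new TreeMap();
            TermEnum te = reader.terms(new Term(field, ""));
            try {
                while (te.term() != null && field.equals(te.term().field())) {
                    TermPositions tp = reader.termPositions(te.term());
                    // skipTo() advances to the first doc >= docId for this term
                    if (tp.skipTo(docId) && tp.doc() == docId) {
                        for (int i = 0; i < tp.freq(); i++) {
                            tokens.put(new Integer(tp.nextPosition()),
                                       te.term().text());
                        }
                    }
                    tp.close();
                    if (!te.next()) break;
                }
            } finally {
                te.close();
            }
            return tokens; // iterate in key order to approximate the token stream
        }
    }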

Re: Optimize and Out Of Memory Errors

2008-12-27 Thread Lebiram
…lter(parser.parse(content))); query.add(contentQuery, BooleanClause.Occur.MUST); } catch (ParseException pe) { log.error("content could not be parsed."); } }

Re: Optimize and Out Of Memory Errors

2008-12-24 Thread Lebiram
…it could be a good idea to turn them off for particular fields. - Mark Lebiram wrote: > Is there a way to not factor in norms data in scoring somehow? > > I'm just stumped as to how Luke is able to do a search (with limit) on the > docs but in my code it just dies with OutOfMemory…
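A sketch of turning norms off for a field at index time (2.4-era API; the field name and text are placeholders). Each indexed field with norms costs one byte per document across the whole index once loaded, so omitting norms on fields that don't need length/boost scoring can save a lot of heap:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    public class NoNormsFieldSketch {
        public static Document makeDoc(String text) {
            Document doc = new Document();
            // ANALYZED_NO_NORMS: tokenize the text but write no norm byte.
            doc.add(new Field("content", text, Field.Store.NO,
                              Field.Index.ANALYZED_NO_NORMS));
            return doc;
        }
    }

    // Equivalently, setOmitNorms(true) can be called on an existing Field.
    // Worth noting: if some documents in a field still carry norms, merged
    // segments keep norms for every document in that field, so the setting
    // only pays off when applied consistently.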

Re: Optimize and Out Of Memory Errors

2008-12-24 Thread Lebiram
…To: java-user@lucene.apache.org Sent: Tuesday, December 23, 2008 5:25:30 PM Subject: Re: Optimize and Out Of Memory Errors Mark Miller wrote: > Lebiram wrote: >> Also, what are norms? > Norms are a byte value per field stored in the index that is factored into the score.

Re: Optimize and Out Of Memory Errors

2008-12-23 Thread Lebiram
Hi All, Thanks for the replies. I've just managed to reproduce the error on my test machine. What we did was generate about 100,000,000 documents with about 7 fields each, with terms from 1 to 10. After building the index of about 20GB, we did an optimize and it was able to make 1 big index of th…
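That document count is the likely culprit given the norms explanation above: assuming one norm byte per indexed field per document, 100,000,000 docs × 7 fields × 1 byte ≈ 700 MB of norms loaded into the heap on the first search -- enough on its own to exhaust a modest JVM heap, independent of the optimize itself.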