Re: BooleanQuery questions

2007-10-03 Thread Warren
Thanks for the reply. Everything is working correctly now. I jumped the gun without debugging it more. booleanANDSearch was not getting set correctly. Erick Erickson wrote: I don't see a problem with your booleanANDSearch thingy, although I haven't tried it. Does toString() return the same stri

FW: Eliminating duplicate documents when indexing

2007-10-03 Thread Rod Giles
Duplicate Documents In An Index The updateDocument method of IndexWriter indicates that a delete term occurs before the update document takes place (i.e. the document is replaced in the index, but not duplicated). Has anyone been able to get this process to work? The term that I am using ha

Generalized proximity query performance

2007-10-03 Thread Kyle Maxwell
Hi again, As the subject would suggest, I'm trying to implement a layer of proximity weighting over Lucene. This has greatly increased search relevance, but at the same time has knocked down performance by a substantial amount (see footer). I am using a hand-rolled query of the following form (impl

Re: BooleanQuery questions

2007-10-03 Thread Erick Erickson
I don't see a problem with your booleanANDSearch thingy, although I haven't tried it. Does toString() return the same string regardless of the value of booleanANDSearch? That would surprise me. The default is OR, so the toString output looks like booleanANDSearch is false. In general, the Lucene

BooleanQuery questions

2007-10-03 Thread Warren
I am new to Lucene and am having problems with booleanQueries. How do you write Boolean OR and AND queries? Is this an OR query booleanQuery.add(query1, BooleanClause.Occur.SHOULD); booleanQuery.add(query2, BooleanClause.Occur.SHOULD); and is this an AND query booleanQuery.add(query1, Boolean

Re: Subset match query?

2007-10-03 Thread Chris Hostetter
: I understand how that recommendation could potentially cover fields with : undesired terms mixed in with the desired terms. I fail to see that it : covers the case where the undesired term(s) are last, i.e. "desired desired : undesired." Could you please elaborate? Thanks! my bad ... you are

Re: Subset match query?

2007-10-03 Thread Kyle Maxwell
I understand how that recommendation could potentially cover fields with undesired terms mixed in with the desired terms. I fail to see that it covers the case where the undesired term(s) are last, i.e. "desired desired undesired." Could you please elaborate? Thanks! -Kyle On 10/3/07, Chris H

Re: Subset match query?

2007-10-03 Thread Chris Hostetter
A typical solution to problems in this "space" is to index marker terms to denote boundaries in the term sequence ... in combination with things like SpanNear and SpanNot, this can be used to make queries like "these 5 words must be in the same sentence" in your specific example however, where

Re: Subset match query?

2007-10-03 Thread Grant Ingersoll
Oops, too quick to reply... coord() won't quite do it, since it does terms matched in doc versus terms in query. On Oct 3, 2007, at 2:20 PM, Kyle Maxwell wrote: I'm indexing a dataset with lots of short fields. I have determined that it would be useful to highly boost matches where every

Re: Subset match query?

2007-10-03 Thread Grant Ingersoll
See the Similarity.coord() method. /** Computes a score factor based on the fraction of all query terms that a * document contains. This value is multiplied into scores. * * The presence of a large portion of the query terms indicates a better * match with the query, so implemen

Subset match query?

2007-10-03 Thread Kyle Maxwell
I'm indexing a dataset with lots of short fields. I have determined that it would be useful to highly boost matches where every term in this field is represented in the query. i.e.: Query: lucene field matches Field: lucene field but not Field: lucene has a field ... Field: lucene field foo... I'