Re: Very high fieldNorm for a field resulting in bad results

2006-09-29 Thread Mek
You might want to look into the DisjunctionMaxQuery class ... in particular building a BooleanQuery containing a DisjunctionMaxQuery for each 'word' of your input in the various fields ... i've found it to be very effective. when it was first proposed it was called "MaxDisjunctionQuery" and you c

Re: Sort by date THEN by relevancy

2006-09-29 Thread KEGan
Erick, Thanks for the great advice!! About closing/opening the searcher on each request: isn't this unavoidable in some cases? The application I am building will have users insert/search documents all the time. So for every insert, the searcher needs to be recreated, doesn't it? Else new docume

Re: Re[3]: how to enhance speed of sorted search

2006-09-29 Thread Yonik Seeley
On 9/26/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: you might be able to shave a little bit of speed off by accessing the bits from the Filter directly and iterating over them yourself to check the FieldCache and build up your sorted list of the first "N" This is one optimization that Solr do
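
The idea quoted above (walk the filter's bits yourself, look up each match in a cached array of sort values, and keep only the first "N") can be sketched in plain Java. This is an illustration only, not Lucene or Solr code: the class TopNByFieldValue, the method firstN, and the int[] standing in for the FieldCache values are all hypothetical names chosen for the example, and it assumes every set bit has an entry in the array.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene: returns the first N matching
// document ids ordered by a per-document sort value.
class TopNByFieldValue {

    // matches:    one bit per document id, set if the document passed the filter
    // sortValues: sortValues[docId] is the cached sort key for that document
    //             (a stand-in for what a field cache would hold)
    // n:          how many document ids to return
    static List<Integer> firstN(BitSet matches, int[] sortValues, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Ascending order: smaller sort value first, ties broken by doc id.
        Comparator<Integer> ascending = (a, b) -> {
            int c = Integer.compare(sortValues[a], sortValues[b]);
            return c != 0 ? c : Integer.compare(a, b);
        };
        // Max-heap: the worst entry currently kept sits at the head.
        PriorityQueue<Integer> heap = new PriorityQueue<>(ascending.reversed());
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            heap.add(doc);
            if (heap.size() > n) {
                heap.poll(); // discard the worst so at most n entries remain
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(ascending);
        return result;
    }
}
```

The bounded heap keeps memory proportional to N rather than to the number of matches, which is the point of building the list by hand instead of sorting every hit.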

Re: StandardAnalyzer question

2006-09-29 Thread Doron Cohen
QueryParser can do that for you - something like: QueryParser qp = new QueryParser( "CONTENTS" , new StandardAnalyzer() ); qp.setDefaultOperator ( Operator.AND ); Query q = qp.parse ( "TOOLS FOR TRAILER" ); Result query should be: +content:tools +content:trailer "Van Ng
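
To make the expected result above concrete, here is a small plain-Java sketch of what the analysis step amounts to: lowercase the input, drop stop words, and mark every remaining term as required. It does not use Lucene; the class RequiredTermsSketch, the method toRequiredTerms, and the short stop list are assumptions made for illustration (the analyzer's real stop list is longer).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration, not Lucene code.
class RequiredTermsSketch {

    // Illustrative stop list only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "for", "of", "the"));

    // Builds a string such as "+content:tools +content:trailer".
    static String toRequiredTerms(String field, String input) {
        StringBuilder sb = new StringBuilder();
        for (String token : input.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue; // stop words never reach the query
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(field).append(':').append(token);
        }
        return sb.toString();
    }
}
```

Because the stop word is removed before the clauses are built, no required clause is ever created for a term that was never indexed.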

Re: [BULK] Re: NPE thrown in invertDocument

2006-09-29 Thread Ryan Heinen
Daniel Naber wrote: On Thursday 28 September 2006 23:55, Ryan Heinen wrote: I am creating an index using a RAMDirectory, and am running across a situation where when I call IndexSearcher.addDocument it throws a NullPointerException. Could you create a small test case that reproduces this? Thi

Re[4]: how to enhance speed of sorted search

2006-09-29 Thread Chris Hostetter
: CH> you might be able to shave a little bit of speed off by accessing the bits : CH> from the Filter directly and iterating over them yourself to check the : CH> FieldCache and build up your sorted list of the first "N" -- i think that : CH> would save you one method call per match (the score met

Re: Very high fieldNorm for a field resulting in bad results

2006-09-29 Thread Chris Hostetter
: Assuming I want to boost the fields with the same value for all documents, : can this be replaced by query-time boosting. if i'm understanding what you mean, then yes. : I, though, am storing the norms & yet do not get exact matches ranking : higher than others. the notion that norms help "ex

Re: [BULK] StandardAnalyzer question

2006-09-29 Thread Ryan Heinen
Van Nguyen wrote: I have a field in my index that is being tokenized using the StandardAnalyzer. Let’s say that field was: TOOLS FOR TRAILER The word “FOR” is a stop word so it is not being indexed (based on the StandardAnalyzer). When someone types in TOOLS FOR TRAILER, I have a Boole

Re: IndexModifier and finding records

2006-09-29 Thread Daniel Naber
On Friday 29 September 2006 22:28, Mark Modrall wrote: > So is IndexModifier opening an IndexReader when someone calls .delete() > then closing the reader and opening an IndexWriter when someone calls > addDocument() (for example)? If someone calls delete and the reader is not open yet, it opens

RE: IndexModifier and finding records

2006-09-29 Thread Mark Modrall
So is IndexModifier opening an IndexReader when someone calls .delete() then closing the reader and opening an IndexWriter when someone calls addDocument() (for example)? Sounds like that could get fairly inefficient. Is IndexModifier for more convenience (and less performance) than using reader

StandardAnalyzer question

2006-09-29 Thread Van Nguyen
I have a field in my index that is being tokenized using the StandardAnalyzer. Let’s say that field was: TOOLS FOR TRAILER The word “FOR” is a stop word so it is not being indexed (based on the StandardAnalyzer). When someone types in TOOLS FOR TRAILER, I have a BooleanQuery s

Re: NPE thrown in invertDocument

2006-09-29 Thread Daniel Naber
On Thursday 28 September 2006 23:55, Ryan Heinen wrote: > I am creating an index using a RAMDirectory, and am running across a > situation where when I call IndexSearcher.addDocument it throws a > NullPointerException. Could you create a small test case that reproduces this? This usually makes i

Re: IndexModifier and finding records

2006-09-29 Thread Daniel Naber
On Friday 29 September 2006 14:54, Mark Modrall wrote: > It > would be nice if I could do IndexSearcher(IndexModifier) or > IndexSearcher(IndexModifier.getReader()) or something. The reader and writer are closed automatically if needed, so they cannot easily be given to the outside. If you want

Re: BooleanQuery

2006-09-29 Thread Find Me
For: BooleanQuery bQuery=new BooleanQuery(); bQuery.add(messageQuery,true,false) Use: BooleanQuery bQuery=new BooleanQuery(); bQuery.add(messageQuery, BooleanClause.Occur.MUST); Mapping is as follows: For add(query, true, false) use add(query, BooleanClause.Occur.MUST) For add(query, false, fal
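
The flag-to-Occur mapping described above can be written down as a tiny plain-Java function. This is a sketch rather than Lucene code: the enum Occur below is a local stand-in for BooleanClause.Occur so the example compiles on its own, the class OccurMappingSketch and the method fromFlags are hypothetical names, and rejecting the (true, true) combination is a choice made for this sketch.

```java
// Hypothetical illustration of the old (required, prohibited) flags
// expressed as occurrence values.
class OccurMappingSketch {

    // Local stand-in for Lucene's BooleanClause.Occur.
    enum Occur { MUST, SHOULD, MUST_NOT }

    static Occur fromFlags(boolean required, boolean prohibited) {
        if (required && prohibited) {
            // A clause cannot be both required and prohibited.
            throw new IllegalArgumentException("required and prohibited are mutually exclusive");
        }
        if (required) {
            return Occur.MUST;      // add(query, true, false)
        }
        if (prohibited) {
            return Occur.MUST_NOT;  // add(query, false, true)
        }
        return Occur.SHOULD;        // add(query, false, false)
    }
}
```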

BooleanQuery

2006-09-29 Thread Ismail Siddiqui
Hi, I have two phrase queries messageQuery = new PhraseQuery(); titleQuery = new PhraseQuery(); messageQuery.setSlop(3); titleQuery.setSlop(1); for (int i=0; i

Re: Splitting the index

2006-09-29 Thread karl wettin
On Fri, 2006-09-29 at 11:50 +0200, karl wettin wrote: > I don't consider a 300M to be a fairly small index. Oops. I /do/ think it is.

Re: NPE thrown in invertDocument

2006-09-29 Thread Ryan Heinen
I just thought I should add that I am using Lucene 2.0.0. Thanks, Ryan Ryan Heinen wrote: Hello, I am creating an index using a RAMDirectory, and am running across a situation where when I call IndexSearcher.addDocument it throws a NullPointerException. I'll provide the stack trace first,

Re: Sort by date THEN by relevancy

2006-09-29 Thread Erick Erickson
Sorting will inevitably have an impact on your speed, but it's impossible to generalize. FWIW, my app has 870K documents, the index is around 1.4G and search/sort times are fine. But even that statement is misleading. "Fine" means that the product manager for this product is satisfied with perform

Re: Sort by date THEN by relevancy

2006-09-29 Thread KEGan
Erick, Ouch!! Please excuse the cut-n-paste ;) LIA mentions a lot about performance when doing sorting. Is it something to be cautious about? You mention doing 5 fields and it works ok ... can you share with us how many documents you are handling there with 5 fields? Thanks. ~KEGan On 9/29/06,

Re: Sort by date THEN by relevancy

2006-09-29 Thread Erick Erickson
Yes. I do this with 5 fields and it works just fine. Although your cut-n-paste got kind of hard to read. Erick On 9/29/06, KEGan <[EMAIL PROTECTED]> wrote: I think I am going to answer my own question. Just use the *Sort*< file:///D:/library/apache/lucene-2.0.0/docs/api/org/apache/lucene

Re: Sort by date THEN by relevancy

2006-09-29 Thread KEGan
I think I am going to answer my own question. Just use the *Sort* (SortField [] fields) *Sort* (String [] fields) This should do it, right? On 9/29/06, KEGan <[EMAIL PROTECTED]> wrote: Hi, I have seen some sort examples in LIA.

IndexModifier and finding records

2006-09-29 Thread Mark Modrall
Hi... I was just looking at the IndexModifier class, which seems like a nice consolidation for some of our operations. There is one question I have though. The class says that it internally contains an IndexReader and an IndexWriter and has examples of operations doing both. But

Sort by date THEN by relevancy

2006-09-29 Thread KEGan
Hi, I have seen some sort examples in LIA. But I can't find what I am looking for. How do I sort documents by date, AND for all the documents with the same date ... these are sorted by relevancy. (Date has higher sort priority in this case). Thanks.
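
The ordering asked for here (date as the primary key, relevancy as the tie-breaker) can be illustrated with a plain-Java comparator. This is a sketch of the ordering only, not of Lucene's sorting API: the class DateThenScoreSketch, the Hit type, and the method sortedIds are hypothetical, dates are assumed to be stored as yyyyMMdd strings so that string order matches chronological order, and ascending date order is an assumption since the question does not say which direction is wanted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not Lucene code.
class DateThenScoreSketch {

    // Minimal stand-in for a search hit.
    static final class Hit {
        final String id;
        final String date;  // assumed format: yyyyMMdd
        final float score;  // relevancy score, higher is better

        Hit(String id, String date, float score) {
            this.id = id;
            this.date = date;
            this.score = score;
        }
    }

    // Primary key: date ascending. Secondary key: score descending.
    static final Comparator<Hit> BY_DATE_THEN_SCORE = (a, b) -> {
        int byDate = a.date.compareTo(b.date);
        if (byDate != 0) {
            return byDate;
        }
        return Float.compare(b.score, a.score);
    };

    // Returns the ids of the hits in sorted order without modifying the input.
    static List<String> sortedIds(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(BY_DATE_THEN_SCORE);
        List<String> ids = new ArrayList<>();
        for (Hit h : copy) {
            ids.add(h.id);
        }
        return ids;
    }
}
```

The score is only consulted when two hits share the same date, which is what "date has higher sort priority" means in practice.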

Re: Splitting the index

2006-09-29 Thread karl wettin
On Thu, 2006-09-28 at 10:05 +0100, Rob Young wrote: > > > total file system size of the index? > segments 31b > deletable 4b > index 286Mb If you experience that a 300M index is much slower than a.. 30M or so, then something is probably rotten. I don't consider a 300M to be a fairly s