Re: NullPointerException in FieldDocSortedHitQueue.lessThan with custom SortComparator

2009-02-02 Thread ninaS
That's not true: have a look at the "else" block. The problem is that Lucene's FieldDocSortedHitQueue only tests for null values when the sort type is SortField.STRING. With SortField.CUSTOM, Lucene assumes ci is never null: FieldDocSortedHitQueue 163-166: case SortField.CUSTOM:{

Re: NullPointerException in FieldDocSortedHitQueue.lessThan with custom SortComparator

2009-02-02 Thread Erick Erickson
To quote the guys... "patches are always welcome". Glad you found a solution Erick On Mon, Feb 2, 2009 at 8:35 AM, ninaS wrote: > > I already found another solution: I don't use a custom SortComparator. > Another solution would be to define a default value for null. > > Would be nice if lucene

Performance issue

2009-02-02 Thread Mittal, Sourabh (IDEAS)
Hi All, We face serious performance issues when users do two-letter searches, e.g. ho, jo, pa ma, um ar, ma fi etc. The time taken is between 10 and 15 seconds. Below are our implementation details: 1. Search is performed on 7 fields. 2. PrefixQuery implementation on all fields. 3. AND search. 4. Our indexer size is

Re: NullPointerException in FieldDocSortedHitQueue.lessThan with custom SortComparator

2009-02-02 Thread Erick Erickson
Ah, I see. The indentation and lack of braces fooled me. You might consider making things as easy as possible when asking people to volunteer their time trying to help you. In that case I'm unsure what the problem is; you could try showing us the entire stack trace. Have you defined your own compare function

Re: Performance issue

2009-02-02 Thread Erick Erickson
Prefix queries are expensive here. The problem is that each one forms a very large OR clause on all the terms that start with those two letters. For instance, if a field in your index contained the terms mine, milanta and mica, a prefix search on "mi" would form mine OR milanta OR mica. Doing this across seven f
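To make the expansion described above concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class `PrefixExpansion`, its methods, and the sample term list (including the extra term "moon") are hypothetical names chosen for illustration, and a sorted set merely stands in for the index's term dictionary. It only shows why a short prefix matches many terms and therefore yields a large OR clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansion {

    /** Collects every term in the sorted dictionary that starts with the prefix. */
    public static List<String> expand(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix; because the set is
        // sorted, all matching terms are contiguous from that point on.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term sharing the prefix
            }
            matches.add(term);
        }
        return matches;
    }

    /** Renders the matches the way the message writes them: t1 OR t2 OR ... */
    public static String asOrClause(List<String> matches) {
        return String.join(" OR ", matches);
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("mine", "milanta", "mica", "moon"));
        // The terms come back in sorted order, so the clause order differs
        // from the order used in the message, but the same terms are matched.
        System.out.println(asOrClause(expand(terms, "mi")));
    }
}
```

With seven fields, an expansion like this would be repeated per field, so the number of clauses grows with both the number of matching terms and the number of fields.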

Re: Performance issue

2009-02-02 Thread Grant Ingersoll
Can you give us more info on what they are searching for w/ 2 letter searches? Typically, prefix queries that short are going to have a lot of terms to match. You might try having a field that you index using a variation of ngrams that are anchored at the first character. For example, en
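As a rough illustration of ngrams anchored at the first character, the sketch below generates the leading substrings of a term in plain Java. The class `EdgeNGrams`, the method `of`, the length bounds and the sample term "john" are all assumptions made for this example; it is not a Lucene analyzer and does not reproduce the example that is cut off in the message above.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNGrams {

    /**
     * Returns the prefixes of the term with lengths minLen..maxLen,
     * capped at the term's own length. Every gram starts at index 0,
     * which is what "anchored at the first character" means here.
     */
    public static List<String> of(String term, int minLen, int maxLen) {
        if (minLen < 1 || maxLen < minLen) {
            throw new IllegalArgumentException("need 1 <= minLen <= maxLen");
        }
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxLen, term.length());
        for (int len = minLen; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(of("john", 1, 3)); // [j, jo, joh]
    }
}
```

If grams like these were indexed in their own field, a two-letter search could be answered by an exact match on a single term such as "jo" rather than by expanding a prefix over every term that begins with it.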

Re: Performance issue

2009-02-02 Thread Matthew Hall
Do you NEED to be using 7 fields here? Like Erick said, if you could give us an example of the types of data you are trying to search against, it would be quite helpful. It's possible that you might be able to, say, collapse your 7 fields down to a single field, which would likely reduce the ove

Re: Best Practice for Lucene Search

2009-02-02 Thread Karsten F.
Hi ilwes, Did you notice the thread http://www.nabble.com/Lucene-vs.-Database-td19755932.html ? I think it is useful for the question about using Lucene storage fields even if you already have the information in the DB. Best regards Karsten ilwes wrote: > > Hello, > > I googled, searched t

Re: NullPointerException in FieldDocSortedHitQueue.lessThan with custom SortComparator

2009-02-02 Thread ninaS
I already found another solution: I don't use a custom SortComparator. Another solution would be to define a default value for null. Would be nice if lucene in future would be able to search by null values also if a custom SortComparator is used. To tell you more: public class MyComparator ext
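The two workarounds mentioned above (treating null explicitly, or substituting a default value) can be sketched with the standard `java.util.Comparator` helpers. This is not Lucene's SortComparator API and not the poster's MyComparator; the class `NullSafeSort`, its members and the sample values are hypothetical and only show the general idea of making a comparison safe when a sort value is missing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullSafeSort {

    /** Orders strings naturally and places null values after all others. */
    public static final Comparator<String> NULLS_LAST =
            Comparator.nullsLast(Comparator.<String>naturalOrder());

    /** Alternative workaround: replace a missing value with a default. */
    public static String orDefault(String value, String defaultValue) {
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        List<String> values = new ArrayList<>(Arrays.asList("b", null, "a"));
        values.sort(NULLS_LAST);           // no NullPointerException on the null entry
        System.out.println(values);        // [a, b, null]
        System.out.println(orDefault(null, "")); // prints an empty line
    }
}
```

Which of the two fits better depends on where documents without a value should appear in the results: a comparator wrapper keeps them distinguishable, while a default value sorts them together with real occurrences of that value.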

Re: How to extract Document object after the search?

2009-02-02 Thread Ganesh
Searcher.doc(int) or IndexReader.document(int) will give you the document object and to my knowledge this is the only way available, however it is not advisable to query all documents (MatchAllDocsQuery) and load all document objects. While using Searcher.doc(int) or IndexReader.document(int), l

How to extract Document object after the search?

2009-02-02 Thread mittals
As per Lucene documentation - "For good search performance, implementations of this method should not call Searcher.doc(int) or IndexReader.document(int) on every document number encountered. Doing so can slow searches by an order of magnitude or more." My question is - what's the other way to g

RE: How to extract Document object after the search?

2009-02-02 Thread Uwe Schindler
Hi, you should generally not load all fields for all documents in the HitCollector loop. If you really need them (because you want to do some analysis on the whole result set after the search), you should do the following: - retrieve only those document fields you really need (using a FieldSelecto

Re: How to extract Document object after the search?

2009-02-02 Thread Ian Lea
Hi That quote is from the javadoc for HitCollector/TopDocCollector.collect(). You missed out the bit saying "This is called in an inner search loop". If, as your subject implies, you want to get at the Document object AFTER the search, those methods are fine. Just don't use them for any more d

Optimization error

2009-02-02 Thread Scott Smith
I'm optimizing a database and getting the error: maxClauseCount is set to 1024 I understand what that means coming out of the query parser, but what does it mean coming from the optimizer? Scott

Re: Optimization error

2009-02-02 Thread Erick Erickson
There is not enough information here to even guess at an answer. Please post the stack trace and any other relevant information you can think of and maybe there'll be some useful pointers people can give. Best Erick On Mon, Feb 2, 2009 at 7:21 PM, Scott Smith wrote: > I'm optimizing a database a

MergePolicy$MergeException during IndexWriter.addIndexesNoOptimize

2009-02-02 Thread David Fertig
Hello. Hopefully this is the correct forum. I am currently using release 2.3.2 as my stable release, but have tried this with 2.4 as well. I have 4 threads indexing documents into separate indexes and then merging them into a larger master index. If the master index is previously corrupted (suc

RE: How to extract Document object after the search?

2009-02-02 Thread mittals
Hi, I have not seen much time difference between loading a single field and loading all the fields of a document. After a search, Lucene caches the documents in memory. Is there any way to configure the number of documents cached in memory? What could be the benefit in using FieldSelect

Poor QPS with highlighting

2009-02-02 Thread Michael Stoppelman
Hi all, My search backends are only able to eke out 13-15 qps even with the entire index in memory (this makes it very expensive to scale). According to my YourKit profiler 80% of the program's time ends up in highlighting. With highlighting disabled my backend gets about 45-50 qps (cheaper scalin