Query works in Luke but not in code...

2008-05-22 Thread Casey Dement
Hi - I'm trying to execute a search in Lucene and getting results I don't understand :( The index contains the fields search_text and type, both indexed and tokenized. I'm attempting to execute the query: +(search_text:austell~0.9 search_text:ga~0.9) +(type:1 type:4) and expect to match document 156297 (

Re: Improving search performance

2008-05-22 Thread Jason Rutherglen
Query-time boosting has no bottlenecks. Storing fields will not affect performance. You will probably want to use PrefixFilter and ConstantScoreRangeQuery. Solr has ConstantScorePrefixQuery. This simply means that if the document contains the term, the result will show; the scoring will not be quite the same be
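
The filter idea in this reply can be pictured with a toy inverted index. What follows is only a sketch in Python, not Lucene code: the `postings` data and the names `prefix_filter` and `constant_score_hits` are hypothetical, and the real PrefixFilter works against Lucene's term dictionary and a bit set of document ids. The sketch shows why every hit ends up with the same score: the filter only records whether a document contains some term with the given prefix.

```python
from bisect import bisect_left

# Toy inverted index: term -> set of document ids (hypothetical data).
postings = {
    "boss": {1, 4},
    "bossa": {2},
    "bossy": {4, 7},
    "boat": {3},
    "act": {1, 2, 3},
}

def prefix_filter(postings, prefix):
    """Return the ids of all documents containing a term that starts with prefix."""
    terms = sorted(postings)
    matched = set()
    # Terms sharing a prefix are contiguous in sorted order, so start at the
    # first term >= prefix and stop at the first term that lacks the prefix.
    for term in terms[bisect_left(terms, prefix):]:
        if not term.startswith(prefix):
            break
        matched |= postings[term]
    return matched

def constant_score_hits(postings, prefix, required_term, score=1.0):
    """Intersect the prefix filter with one required term; every hit gets the same score."""
    allowed = prefix_filter(postings, prefix)
    return {doc: score for doc in postings.get(required_term, set()) & allowed}

hits = constant_score_hits(postings, "boss", "act")
```

With this data, `hits` contains documents 1 and 2, both with score 1.0. No per-term scoring is done, which is the trade-off the reply describes.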

Re: Improving search performance

2008-05-22 Thread Jason Rutherglen
It would be interesting to see the results of using a custom IndexReader that implements http://dsiutils.dsi.unimi.it/docs/it/unimi/dsi/util/ImmutableExternalPrefixMap.html or something like it. The only problem right now would be hooking into the Lucene SegmentMerger to merge other indices such as

Re: Handeling when a field does not exist in the document

2008-05-22 Thread Jason Rutherglen
That is an interesting problem. https://issues.apache.org/jira/browse/LUCENE-1292 will build a tag index that uses a ParallelReader to allow tag fields to be searchable. The tag index does not use the usual IndexWriter but uses a specialized realtime updateable index built for tags. Depending on

Re: Improving search performance

2008-05-22 Thread Glen Newton
2008/5/22 Otis Gospodnetic <[EMAIL PROTECTED]>: > Some quick feedback. Those are all very expensive queries (wildcards and > ranges). The first thing I'd do is try without Hibernate Search (to make > sure HS is not the bottleneck). 100 threads is a lot, I'm guessing you are > reusing your sea

RE: Improving search performance

2008-05-22 Thread Rakesh Shete
Hi Otis, Thanks for the quick response. Yes, my application requirements mandate wildcard and range matches. I can't avoid it. HibernateSearch (HS) uses a kind of IndexReader (HS specific) for optimizing querying over indexes. I'll explore the idea of using a pool of searchers.

Re: Improving search performance

2008-05-22 Thread Otis Gospodnetic
Some quick feedback. Those are all very expensive queries (wildcards and ranges). The first thing I'd do is try without Hibernate Search (to make sure HS is not the bottleneck). 100 threads is a lot, I'm guessing you are reusing your searcher, which is good, but you will actually improve perf
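
One reason ranges and wildcards are costly: in Lucene of this era a range or wildcard query is rewritten into one clause per matching term in the index. The sketch below is plain Python, not Lucene code, and it assumes a hypothetical term dictionary holding one yyyyMMdd term per day of 2008. It counts how many terms a range like the one in the original question would cover.

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta

def terms_in_range(sorted_terms, lower, upper):
    """Return the terms t with lower <= t <= upper, compared as strings."""
    lo = bisect_left(sorted_terms, lower)
    hi = bisect_right(sorted_terms, upper)
    return sorted_terms[lo:hi]

# Hypothetical term dictionary: one yyyyMMdd term per day of 2008.
# Fixed-width, zero-padded terms sort the same way as the dates they encode.
start = date(2008, 1, 1)
day_terms = [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(366)]

expanded = terms_in_range(day_terms, "20080410", "20081010")
expanded_count = len(expanded)
```

Under these assumptions the range covers 184 distinct terms, so a scoring range query would have to visit 184 term clauses for the date field alone. Coarser terms, or a constant-score filter, reduce that work.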

Re: Handeling when a field does not exist in the document

2008-05-22 Thread Otis Gospodnetic
"lucene user", Look at MemoryIndex (in Lucene contrib) for the "alert about new documents that match an interest" part of the problem. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: lucene user <[EMAIL PROTECTED]> > To: java-user@lucene.

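The alerting idea in the reply above runs the stored interests against each new document, rather than running each user's query against the whole index. A rough Python sketch of that shape follows. It is not the MemoryIndex API: the `interests` data and the names `tokenize` and `users_to_alert` are hypothetical, and an interest is simplified to a set of terms that must all appear.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Hypothetical stored interests: user -> terms that must all appear.
interests = {
    "alice": {"lucene", "payloads"},
    "bob": {"solr"},
}

def users_to_alert(document_text, interests):
    """Return the users whose required terms all occur in the new document."""
    tokens = tokenize(document_text)
    return sorted(user for user, required in interests.items() if required <= tokens)

alerts = users_to_alert("Best way to get payloads from a Lucene SpanQuery", interests)
```

Here `alerts` is `["alice"]`. Each incoming document is checked once against every stored interest, so nothing needs to be re-searched in the main index to decide who should be notified.
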
Improving search performance

2008-05-22 Thread Rakesh Shete
Hi all, I have an index of size 85MB. My query looks as follows: +(t:boss* d:boss* dd:boss* tg:boss*) +st:act +ntid:0 +cid:1 +dr:[20080410 TO 20081010] +rT:[002 TO 005] All the fields used in the query are both indexed and stored. The query response time for me is around 30 secon

Re: Handeling when a field does not exist in the document

2008-05-22 Thread Erick Erickson
See below... On Thu, May 22, 2008 at 5:44 AM, lucene user <[EMAIL PROTECTED]> wrote: > We have a requirement to inform users on a regular basis of new material on > which they have expressed interest. How are we to know what is "new" from > the point of view of a particular user? Our idea is to t

Re: Best way to get payloads

2008-05-22 Thread Grant Ingersoll
Unfortunately, I haven't had time to work on https://issues.apache.org/jira/browse/LUCENE-1001 . There is a _HALF BAKED_ patch up there, but I got stuck on it at the time due to not being sure how to handle NearSpans and haven't had a chance to go back to it. I do believe it is possible to ex

Handeling when a field does not exist in the document

2008-05-22 Thread lucene user
We have a requirement to inform users on a regular basis of new material on which they have expressed interest. How are we to know what is "new" from the point of view of a particular user? Our idea is to tag each new item in some way (perhaps a date/time stamp in the lucene index indicating when t

Best way to get payloads

2008-05-22 Thread Eran Sevi
Hi, I'm running a SpanQuery and get the Spans result, which tells me the documents and positions of what I searched for. I would now like to get the payloads at those documents and positions without having to iterate over TermPositions, since I don't have a term but I do have the document and position.