Hi Mike,
Thanks for the very prompt and clear response. We look forward to using the new
(new for us) Lucene goodies :-)
Clive
From: Michael McCandless
To: Lucene Users ; kiwi clive
Sent: Thursday, May 28, 2015 2:34 AM
Subject: Re: Search Performance with NRT
As long as you call SM.maybeRefresh from a dedicated refresh thread
(not from a query's thread) it will work well.
You may want to use a warmer so that the new searcher is warmed before
becoming visible to incoming queries ... this ensures any lazy data
structures are initialized by the time a query arrives.
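A minimal sketch of such a warmer, assuming a 2015-era Lucene 5.x API (SearcherFactory and SearcherManager are real classes, but exact signatures vary across versions, and the throwaway warming query here is only illustrative):

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.SearcherManager;

// Warm each new searcher before SearcherManager publishes it, so lazy
// data structures are loaded off the query path.
class WarmedSearcherManager {
    static SearcherManager create(IndexWriter writer) throws IOException {
        SearcherFactory warming = new SearcherFactory() {
            @Override
            public IndexSearcher newSearcher(IndexReader reader) throws IOException {
                IndexSearcher s = new IndexSearcher(reader);
                s.search(new MatchAllDocsQuery(), 10); // throwaway warming query
                return s;
            }
        };
        return new SearcherManager(writer, true, warming);
    }
}
```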
If you are using stored fields in your index, consider playing with
compression settings, or perhaps turning stored field compression off
altogether. Ways to do this have been discussed in this forum on numerous
occasions. This is highly use case dependent though, as your indexing
performance may o
Hi,
> Am I correct that SearcherManager can't be used with a MultiReader and
> NRT? I would appreciate all suggestions on how to optimize our search
> performance further. Search time has become a usability issue.
Just have a SearcherManager for every index. MultiReader construction is cheap.
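A hedged sketch of that suggestion (method and variable names are illustrative, not from the thread): acquire a searcher from each per-index SearcherManager, wrap their readers in a throwaway MultiReader for the query, and release everything afterwards.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TopDocs;

class MultiIndexSearch {
    static TopDocs searchAll(List<SearcherManager> managers, Query query, int n)
            throws IOException {
        List<IndexSearcher> acquired = new ArrayList<>();
        try {
            List<IndexReader> readers = new ArrayList<>();
            for (SearcherManager m : managers) {
                IndexSearcher s = m.acquire();
                acquired.add(s);
                readers.add(s.getIndexReader());
            }
            // closeSubReaders=false: the managers own the subreaders
            MultiReader multi = new MultiReader(readers.toArray(new IndexReader[0]), false);
            return new IndexSearcher(multi).search(query, n);
        } finally {
            for (int i = 0; i < acquired.size(); i++) {
                managers.get(i).release(acquired.get(i));
            }
        }
    }
}
```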
Greetings Lucene Users
As a follow-up to my earlier mail:
We are also using Lucene segment warmers, as per the recommendation;
segments per tier is now set to five, and buffer memory is set to
(Runtime.getRuntime().totalMemory()*.08)/1024/1024;
See below for code used to instantiate writer:
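The snippet referenced above was cut off in the archive; a minimal sketch of a writer configured as described (five segments per tier, RAM buffer at ~8% of heap), assuming a recent Lucene API, with the directory path and analyzer chosen only for illustration:

```java
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

class WriterSetup {
    static IndexWriter open() throws IOException {
        Directory dir = FSDirectory.open(Paths.get("index")); // illustrative path
        // ~8% of the JVM heap, expressed in MB
        double ramBufferMB = (Runtime.getRuntime().totalMemory() * 0.08) / 1024 / 1024;
        TieredMergePolicy mergePolicy = new TieredMergePolicy();
        mergePolicy.setSegmentsPerTier(5);
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        config.setRAMBufferSizeMB(ramBufferMB);
        config.setMergePolicy(mergePolicy);
        return new IndexWriter(dir, config);
    }
}
```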
Hi All
Thank you for all your suggestions. Some of the recommendations hadn't
yet been implemented, as our code base was using older versions of
Lucene with reduced capabilities. Thus, far, all the recommendations
for fast search have been implemented (e.g. using pagination with
searchAfter,
Jon
I ended up adapting your approach. The solution involves keeping a LRU
cache of page boundary scoredocs and their respective positions. New
positions are added to the cache as new pages are discovered. To cut
down on searches, when scrolling backwards and forwards, the search
begins from
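The LRU cache of page boundaries described above can be sketched with plain JDK collections (the class and method names are assumptions; in the real code the cached value would be Lucene's ScoreDoc, represented here by a type parameter so the sketch stays self-contained):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Maps a page number to the hit that starts that page, evicting the
// least-recently-used entries beyond a fixed capacity.
class PageBoundaryCache<V> extends LinkedHashMap<Integer, V> {
    private final int capacity;

    PageBoundaryCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true gives LRU behavior
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
        return size() > capacity;
    }

    // Returns the highest cached page at or below the requested page,
    // so a search can resume from the nearest known boundary instead
    // of starting over; null if nothing useful is cached.
    Integer nearestPageAtOrBelow(int page) {
        Integer best = null;
        for (Integer p : keySet()) {
            if (p <= page && (best == null || p > best)) best = p;
        }
        return best;
    }
}
```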
Jamie [ja...@mailarchiva.com] wrote:
> It would be nice if, in future, the Lucene API could provide a
> searchAfter that takes a position (int).
It would not really help with large result sets. At least not with the current
underlying implementations. This is tied into your current performance pr
Thanks Jon
I'll investigate your idea further.
It would be nice if, in future, the Lucene API could provide a
searchAfter that takes a position (int).
Regards
Jamie
On 2014/06/03, 3:24 PM, Jon Stewart wrote:
With regards to pagination, is there a way for you to cache the
IndexSearcher, Query, and TopDocs between user pagination requests (a
lot of webapp frameworks have object caching mechanisms)? If so, you
may have luck with code like this:
void ensureTopDocs(final int rank) throws IOException {
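Jon's method body is cut off in the archive; a hedged reconstruction of the idea (the field names and merging code are assumptions, not his code): keep paging with searchAfter from the last cached hit until the requested rank is covered.

```java
import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

class PagingCache {
    private final IndexSearcher searcher;
    private final Query query;
    private final int pageSize;
    private ScoreDoc[] hits = new ScoreDoc[0]; // all hits fetched so far
    private boolean exhausted = false;

    PagingCache(IndexSearcher searcher, Query query, int pageSize) {
        this.searcher = searcher;
        this.query = query;
        this.pageSize = pageSize;
    }

    // Fetch further pages until hits[rank] exists or results run out.
    void ensureTopDocs(final int rank) throws IOException {
        while (hits.length <= rank && !exhausted) {
            ScoreDoc after = hits.length == 0 ? null : hits[hits.length - 1];
            TopDocs next = searcher.searchAfter(after, query, pageSize);
            ScoreDoc[] merged = new ScoreDoc[hits.length + next.scoreDocs.length];
            System.arraycopy(hits, 0, merged, 0, hits.length);
            System.arraycopy(next.scoreDocs, 0, merged, hits.length, next.scoreDocs.length);
            hits = merged;
            exhausted = next.scoreDocs.length < pageSize;
        }
    }
}
```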
Robert. Thanks, I've already done a similar thing. Results on my test
platform are encouraging..
On 2014/06/03, 2:41 PM, Robert Muir wrote:
Reopening for every search is not a good idea. this will have an
extremely high cost (not as high as what you are doing with "paging"
but still not good).
Instead consider making it near-realtime, by doing this every second
or so instead. Look at SearcherManager for code that helps you do
this.
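A sketch of that dedicated refresh thread, assuming a SearcherManager is already set up (scheduling interval and error handling are illustrative): query threads just acquire()/release() and never pay the reopen cost themselves.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.search.SearcherManager;

class RefreshThread {
    // Calls maybeRefresh() about once a second from its own thread.
    static ScheduledExecutorService start(final SearcherManager manager) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                try {
                    manager.maybeRefresh();
                } catch (IOException e) {
                    // log and retry on the next tick
                }
            }
        }, 0, 1, TimeUnit.SECONDS);
        return exec;
    }
}
```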
Robert
FYI: I've modified the code to utilize the experimental function..
DirectoryReader dirReader =
DirectoryReader.openIfChanged(cachedDirectoryReader,writer, true);
In this case, the IndexReader won't be opened on each search, unless
absolutely necessary.
Regards
Jamie
On 2014/06
Robert
Hmmm. Why did Mike go to all the trouble of implementing NRT search,
if we are not supposed to be using it?
The user simply wants the latest result set. To me, this doesn't appear
out of scope for the Lucene project.
Jamie
On 2014/06/03, 1:17 PM, Robert Muir wrote:
No, you are incorrect. The point of a search engine is to return top-N
most relevant.
If you insist you need to open an indexreader on every single search,
and then return huge amounts of docs, maybe you should use a database
instead.
On Tue, Jun 3, 2014 at 6:42 AM, Jamie wrote:
> Vitality / Rob
Vitality / Robert
I wouldn't go so far as to call our pagination naive!? Sub-optimal, yes.
Unless I am mistaken, the Lucene library's pagination mechanism makes
the assumption that you will cache the scoredocs for the entire result
set. This is not practical when you have a result set that e
Jamie,
What if you were to forget for a moment the whole pagination idea, and
always capped your search at 1000 results for testing purposes only? This
is just to try and pinpoint the bottleneck here; if, regardless of the
query parameters, the search latency stays roughly the same and well below
Check and make sure you are not opening an indexreader for every
search. Be sure you don't do that.
On Mon, Jun 2, 2014 at 2:51 AM, Jamie wrote:
> Greetings
>
> Despite following all the recommended optimizations (as described at
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed) , in so
Vitaly
See below:
On 2014/06/03, 12:09 PM, Vitaly Funstein wrote:
A couple of questions.
1. What are you trying to achieve by setting the current thread's priority
to max possible value? Is it grabbing as much CPU time as possible? In my
experience, mucking with thread priorities like this is at best futile, and
at worst quite detrimental to responsiveness and o
FYI: We are also using a multireader to search over multiple index readers.
Search under a million documents yields good response times. When you
get into the 60M territory, search slows to a crawl.
On 2014/06/03, 11:47 AM, Jamie wrote:
Sure... see below:
protected void search(Query query, Filter queryFilter, Sort sort)
throws BlobSearchException {
try {
logger.debug("start search {searchquery='" +
getSearchQuery() +
"',query='"+query.toString()+"',filterQuery='"+queryFilter+"',sort='"+sort
Hi Jamie,
What is included in the 5 minutes?
Just the call to the searcher?
seacher.search(...) ?
Can you show a bit more of the code you use?
On Tue, Jun 3, 2014 at 11:32 AM, Jamie wrote:
> Vitaly
>
> Thanks for the contribution. Unfortunately, we cannot use Lucene's
> pagination function
Vitaly
Thanks for the contribution. Unfortunately, we cannot use Lucene's
pagination function, because in reality the user can skip pages to start
the search at any point, not just from the end of the previous search.
Even the
first search (without any pagination), with a max of 1000 hits, tak
Something doesn't quite add up.
> TopFieldCollector fieldCollector = TopFieldCollector.create(sort, max, true,
> false, false, true);
>
> We use pagination, so only returning 1000 documents or so at a time.
>
>
You say you are using pagination, yet the API you are using to create your
collector isn't
Toke
Thanks for the contact. See below:
On 2014/06/03, 9:17 AM, Toke Eskildsen wrote:
On Tue, 2014-06-03 at 08:17 +0200, Jamie wrote:
Unfortunately, in this instance, it is a live production system, so we
cannot conduct experiments. The number is definitely accurate.
We have many different sy
On Tue, 2014-06-03 at 08:17 +0200, Jamie wrote:
> Unfortunately, in this instance, it is a live production system, so we
> cannot conduct experiments. The number is definitely accurate.
>
> We have many different systems with a similar load that observe the same
> performance issue. To my knowle
Can you take thread stacktraces (repeatedly) during those 5 minute
searches? That might give you (or someone on the mailing list) a clue
where all that time is spent.
You could try using jstack for that:
http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstack.html
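An in-process alternative to running jstack externally, sketched with plain JDK calls: capture all live thread stacks from inside the JVM and print them; repeating this during a slow search shows where the time goes.

```java
import java.util.Map;

class ThreadDumper {
    // Builds a jstack-like dump of every live thread's stack.
    static String dumpAllStacks() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            sb.append('"').append(e.getKey().getName()).append("\"\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```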
Regards
Christoph
Toke
Thanks for the comment.
Unfortunately, in this instance, it is a live production system, so we
cannot conduct experiments. The number is definitely accurate.
We have many different systems with a similar load that observe the same
performance issue. To my knowledge, the Lucene integrati
On Mon, 2014-06-02 at 08:51 +0200, Jamie wrote:
[200GB, 150M documents]
> With NRT enabled, search speed is roughly 5 minutes on average.
> The server resources are:
> 2x6 Core Intel CPU, 128GB, 2 SSD for index and RAID 0, with Linux.
5 minutes is extremely long. Is that really the right number
This is an interesting performance problem and I think there is probably not
a single answer here, so I'll just layout the steps I would take to tackle this:
1. What is the variance of the query latency? You said the average is 5 minutes,
but is it due to some really bad queries or most queries h
I assume you meant 1000 documents. Yes, the page size is in fact
configurable. However, it only obtains the page size * 3. It preloads
the following and previous page too. The point is, it only obtains the
documents that are needed.
On 2014/06/02, 3:03 PM, Tincu Gabriel wrote:
My bad, It's using the RamDirectory as a cache and a delegate directory
that you pass in the constructor to do the disk operations, limiting the
use of the RamDirectory to files that fit a certain size. So i guess the
underlying Directory implementation will be whatever you choose it to be.
I'd sti
I was under the impression that NRTCachingDirectory will instantiate an
MMapDirectory if a 64 bit platform is detected? Is this not the case?
On 2014/06/02, 2:09 PM, Tincu Gabriel wrote:
MMapDirectory will do the job for you. RamDirectory has a big warning in
the class description stating that the performance will get killed by an
index larger than a few hundred MB, and NRTCachingDirectory is a wrapper
for RamDirectory and suitable for low update rates. MMap will use the
system RAM
Jack
First off, thanks for applying your mind to our performance problem.
On 2014/06/02, 1:34 PM, Jack Krupansky wrote:
Do you have enough system memory to fit the entire index in OS system memory
so that the OS can fully cache it instead of thrashing with I/O? Do you see
a lot of I/O or are the queries compute-bound?
You said you have a 128GB machine, so that sounds small for your index. Have
you tried a 256GB
Tom
Thanks for the offer of assistance.
On 2014/06/02, 12:02 PM, Tincu Gabriel wrote:
What kind of queries are you pushing into the index.
We are indexing regular emails + attachments.
Typical query is something like:
filter: to:mbox08 from:mbox08 cc:mbox08 bcc:mbox08
deliver
What kind of queries are you pushing into the index. Do they match a lot of
documents ? Do you do any sorting on the result set? What is the average
document size ? Do you have a lot of update traffic ? What kind of schema
does your index use ?
On Mon, Jun 2, 2014 at 6:51 AM, Jamie wrote:
> Gre
On 6-Nov-07, at 3:02 PM, Paul Elschot wrote:
On Tuesday 06 November 2007 23:14:01 Mike Klaas wrote:
Wait--shouldn't the outer-most BooleanQuery provide most of this
speedup already (since it should be skipTo'ing between the nested
BooleanQueries and the outermost). Is it the indirection and
On Tuesday 06 November 2007 23:14:01 Mike Klaas wrote:
> On 29-Oct-07, at 9:43 AM, Paul Elschot wrote:
> > On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
> >> +prop1:a +prop2:b +prop3:c +prop4:d +prop5:e
> >>
> >> is much faster than
> >>
> >> (+(+(+(+prop1:a +prop2:b) +prop3:c) +prop4:d)
On 29-Oct-07, at 9:43 AM, Paul Elschot wrote:
On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
+prop1:a +prop2:b +prop3:c +prop4:d +prop5:e
is much faster than
(+(+(+(+prop1:a +prop2:b) +prop3:c) +prop4:d) +prop5:e)
where the second one is a result from BooleanQuery in
BooleanQuery
> On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
> > Hello,
> >
> > I am seeing that a query with boolean queries in boolean
> queries takes
> > much longer than just a single boolean query when the
> number of hits
> > is fairly large. For example
> >
> > +prop1:a +prop2:b +prop3:c
On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
> Hello,
>
> I am seeing that a query with boolean queries in boolean queries takes
> much longer than just a single boolean query when the number of hits is
> fairly large. For example
>
> +prop1:a +prop2:b +prop3:c +prop4:d +prop5:e
>
> is
On 6-Sep-07, at 4:41 AM, makkhar wrote:
Hi,
I have an index which contains more than 20K documents. Each
document has
the following structure :
field : ID (Index and store) typical value
- "1000"
field : parameterName(index and store) typical value
Have a look at http://wiki.apache.org/lucene-java/BasicsOfPerformance
Are you opening the IndexSearcher every time you search, even when no
documents have changed?
-Grant
On Sep 6, 2007, at 7:41 AM, makkhar wrote:
Hi,
I have an index which contains more than 20K documents. Each
docu
You're not expecting too much. On cheap hardware I watch searches on over
5 mil + docs that match every doc come back in under a second. Able to
post your search code?
makkhar wrote:
Hi,
I have an index which contains more than 20K documents. Each document has
the following structure :
fiel
It is indeed a lot faster ...
Will use that one now ..
hits = searcher.search(query, new Sort(new
SortField(null,SortField.DOC,true)));
That is completing in under a sec for pretty much all the queries ..
On 8/22/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 8/21/06, M A <[EMAIL PROTECTED]> wrote:
On 8/21/06, M A <[EMAIL PROTECTED]> wrote:
I still dont get this, How would i do this, so i can try it out ..
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/SortField.html#SortField(java.lang.String,%20int,%20boolean)
new Sort(new SortField(null, SortField.DOC, true))
-Yonik
I still dont get this, How would i do this, so i can try it out ..
is
searcher.search(query, new Sort(SortField.DOC))
..correct this would return stuff in the order of the documents, so how
would i reverse this, i mean the later documents appearing first ..
searcher.search(query, new Sort(???
Yeah I tried looking this up,
If i wanted to do it by document id (highest docs first) , does this mean
doing something like
hits = searcher.search(query, new Sort(new SortField(DOC, true))); // or
something like that,
is this way of sorting any different performance wise to what i was doing
befo
public void search(Weight weight,
org.apache.lucene.search.Filter filter, final HitCollector results)
throws IOException {
HitCollector collector = new HitCollector() {
public final void collect(int doc, float score) {
try {
Ok this is what i have done so far ->
static class MyIndexSearcher extends IndexSearcher {
IndexReader reader = null;
public MyIndexSearcher(IndexReader r) {
super(r);
reader = r;
}
public void search(Weight weight,
org.apache.lucene.search.
The index is already built in date order i.e. the older documents appear
first in the index, what i am trying to achieve is however the latest
documents appearing first in the search results .. without the sort .. i
think they appear by relevance .. well thats what it looked like ..
I am looking
Talk about mails crossing in the aether.. wrote my response before seeing
the last two...
Sounds like you're on track.
Erick
On 8/20/06, Erick Erickson <[EMAIL PROTECTED]> wrote:
About luke... I don't know about command-line interfaces, but if you copy
your index to a different machine and
About luke... I don't know about command-line interfaces, but if you copy
your index to a different machine and use Luke there. I do this between
Linux and Windows boxes all the time. Or, if you can mount the remote drive
so you can see it, you can just use Luke to browse to it and open it up. You
Just ran some tests .. it appears that the problem is in the sorting ..
i.e.
// hits = searcher.search(query, new Sort("sid", true)); -> 17 secs
// hits = searcher.search(query, new Sort("sid", false)); -> 17 secs
hits = searcher.search(query); -> less than 1 sec ..
am trying something out
This is why a warming strategy like Solr takes is very valuable. The
searchable index is always serving up requests as fast as Lucene
works, which is achieved by warming a new IndexSearcher with searches/
sorts/filter creating/etc before it is swapped into use.
Erik
On Aug 20, 200
Ok I get your point, this still however means the first search on the new
searcher will take a huge amount of time .. given that this is happening now
..
i.e. new search -> new query -> get hits ->20+ secs .. this happens every 5
mins or so ..
although subsequent searches may be quicker ..
Am
: This is because the index is updated every 5 mins or so, due to the incoming
: feed of stories ..
:
: When you say iteration, i take it you mean, search request, well for each
: search that is conducted I create a new one .. search reader that is ..
yeah ... i meant iteration of your test. don'
yes there is a new searcher opened each time a search is conducted,
This is because the index is updated every 5 mins or so, due to the incoming
feed of stories ..
When you say iteration, i take it you mean, search request, well for each
search that is conducted I create a new one .. search read
: hits = searcher.search(query, new Sort("sid", true));
you don't show where searcher is initialized, and you don't clarify how
you are timing your multiple iterations -- i'm going to guess that you are
opening a new searcher every iteration right?
sorting on a field requires pre-computing a
what i am measuring is this
Analyzer analyzer = new StandardAnalyzer(new String[]{});
if(fldArray.length > 1)
{
BooleanClause.Occur[] flags = {BooleanClause.Occur.SHOULD,
BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
BooleanClause.Occur.SHOULD};
query = MultiFieldQueryP
This is a long time; I think you're right, it's excessive.
What are you timing? The time to complete the search (i.e. get a Hits object
back) or the total time to assemble the response? Why I ask is that the Hits
object is designed to return the first 100 or so docs efficiently. Every 10
post - your feedback is very helpful!
Vlad
-Original Message-
From: Mike Streeton [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 27, 2006 2:59 AM
To: java-user@lucene.apache.org
Subject: RE: search performance benchmarks
We recently ran some benchmarks on Linux with 4 xeon cpus and 2gb
Message-
From: Mike Streeton [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 27, 2006 2:59 AM
To: java-user@lucene.apache.org
Subject: RE: search performance benchmarks
We recently ran some benchmarks on Linux with 4 xeon cpus and 2gb of
heap (not that this was needed). We managed to easily get 100
represent 1000 connected users.
Mike
www.ardentia.com the home of NetSearch
-Original Message-
From: Wang, Jeff [mailto:[EMAIL PROTECTED]
Sent: 26 June 2006 19:50
To: java-user@lucene.apache.org
Subject: RE: search performance benchmarks
Performance varies a lot, and depends upon the number
Performance varies a lot, and depends upon the number of indexes, the
number of fields, and the CPU/memory configuration. For myself, a 65Gb
source indexed to 1Gb (or so) returns single term queries (oh yeah, the
query makeup also matters a lot) in sub seconds on a Intel dual
processor (each is 3.
: default Sort.RELEVANCE, query response time is ~6ms. However, when I
: specify a sort, e.g. Searcher.search( query, new Sort( "mydatefield" )
: ), the query response time gets multiplied by a factor of 10 or 20.
...
: do a top-K ranking over the same number of raw hits. The performanc
On Sep 26, 2005, at 3:10 AM, Paul Elschot wrote:
I used my bug votes already. I hope more people will do that, hint:
http://issues.apache.org/jira/secure/BrowseProject.jspa?id=12310110
Is there a way to view the open issues sorted by number of votes?
There is the "Popular Issues" view:
<
Otis,
On Monday 26 September 2005 00:37, Otis Gospodnetic wrote:
> As Erik Hatcher noted in another email (it might have been on the -dev
> list), we'll go through JIRA before making the next release and try to
> push the patches like this one into the core. Personally, it has been
I used my bug
> > Sent: 21 September 2005 19:16
> > To: java-user@lucene.apache.org
> > Subject: Re: search performance enhancement
> >
> > On Wednesday 21 September 2005 03:29, John Wang wrote:
> > > Hi Paul and other gurus:
> > >
> > > In a r
> To: java-user@lucene.apache.org
> Subject: Re: search performance enhancement
>
> On Wednesday 21 September 2005 03:29, John Wang wrote:
> > Hi Paul and other gurus:
> >
> > In a related topic, seems lucene is scoring documents that wou
To: java-user@lucene.apache.org
Subject: Re: search performance enhancement
On Wednesday 21 September 2005 03:29, John Wang wrote:
> Hi Paul and other gurus:
>
> In a related topic, seems lucene is scoring documents that would hit
in a
> "prohibited" boolean clause, e.
On Wednesday 21 September 2005 03:29, John Wang wrote:
> Hi Paul and other gurus:
>
> In a related topic, seems lucene is scoring documents that would hit in a
> "prohibited" boolean clause, e.g. NOT field:value. It doesn't seem to make
> sense to score a document that is to be excluded from the
Hi Paul and other gurus:
In a related topic, seems lucene is scoring documents that would hit in a
"prohibited" boolean clause, e.g. NOT field:value. It doesn't seem to make
sense to score a document that is to be excluded from the result. Is this a
difficult thing to fix?
Also in Paul's ealie
On Friday 19 August 2005 18:09, John Wang wrote:
> Hi Paul:
>
> Thanks for the pointer.
>
>How would I extend from the patch you submitted to filter out
> more documents not using a Filter. e.g.
>
> have a class to skip documents based on a docID: boolean
> isValid(int docID)
Hi Paul:
Thanks for the pointer.
How would I extend from the patch you submitted to filter out
more documents not using a Filter. e.g.
have a class to skip documents based on a docID: boolean
isValid(int docID)
My problem is I want to discard documents at query time wit
Hi John,
On Wednesday 17 August 2005 04:46, John Wang wrote:
> Hi:
>
>I posted a bug (36147) a few days ago and didn't hear anything, so
> I thought I'd try my luck on this list.
>
>The idea is to avoid score calculations on documents to be filtered
> out anyway. (e.g. via Filter object
Yura Smolsky wrote:
Hello, mark.
mh> 2) My app uses long queries, some of which include
mh> very common terms. Using the "MoreLikeThis" query to
mh> drop common terms drastically improved performance. If
mh> your "killer queries" are long ones you could spot
mh> them and service them with a MoreLik
In addition to the comments already made, I recently
recently found these changes to be useful:
1) Swapping out Sun 1.4.2_05 JVM for BEA's JRockit JVM
halved my query times. (In both cases did not tweak
any default JVM settings other than -Xmx to ensure
adequate memory allocation).
2) My app use
Daniel,
On Thursday 07 April 2005 00:54, Chris Hostetter wrote:
>
> : Queries: The query strings are of highly differing complexity, from
> : simple x:y to long queries involving conjunctions, disjunctions and
> : wildecard queries.
> :
> : 90% of the queries run brilliantly. Problem is that 10%
Daniel Herlitz wrote:
Hi everybody,
We have been using Lucene for about one year now with great success.
Recently though the index has grown noticeably and so has the number of
searches. I was wondering if anyone would like to comment on these
figures and say if it works for them?
Index size: ~
: Queries: The query strings are of highly differing complexity, from
: simple x:y to long queries involving conjunctions, disjunctions and
: wildecard queries.
:
: 90% of the queries run brilliantly. Problem is that 10% of the queries
: (simple or not) take a long time, on average more that 10 se