Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
It is indeed a lot faster ... Will use that one now .. hits = searcher.search(query, new Sort(new SortField(null,SortField.DOC,true))); That is completing in under a sec for pretty much all the queries .. On 8/22/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 8/21/06, M A <[EMAIL PROTECTE
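A rough sketch of why the reversed index-order sort is cheap, not Lucene's actual code: the thread states the index was built in date order, so higher document numbers are newer stories, and the sort only has to keep the largest document numbers among the matches. No per-field values need to be loaded, unlike the earlier sort on "sid". The class and method names below are hypothetical, and a plain int array stands in for the matching document numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration, not Lucene code: keeps the n highest document
// numbers seen, which is what a reversed index-order sort has to produce.
class ReverseDocOrder {

    static List<Integer> newestFirst(int[] matchingDocs, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the smallest retained doc number sits at the head,
        // so it is the one evicted when a larger one arrives.
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int doc : matchingDocs) {
            if (heap.size() < n) {
                heap.add(doc);
            } else if (doc > heap.peek()) {
                heap.poll();
                heap.add(doc);
            }
        }
        // Copy out and order from highest (newest) to lowest (oldest).
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        int[] hits = {3, 17, 42, 99, 120, 250};
        System.out.println(newestFirst(hits, 3)); // prints [250, 120, 99]
    }
}
```

With Lucene itself the one-line search call quoted above does this work; the sketch only illustrates the bookkeeping involved.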

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/21/06, M A <[EMAIL PROTECTED]> wrote: I still don't get this, How would I do this, so I can try it out .. http://lucene.apache.org/java/docs/api/org/apache/lucene/search/SortField.html#SortField(java.lang.String,%20int,%20boolean) new Sort(new SortField(null,SortField.DOC,true)) -Yonik h

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
I still don't get this, How would I do this, so I can try it out .. is searcher.search(query, new Sort(SortField.DOC)) ..correct this would return stuff in the order of the documents, so how would I reverse this, I mean the later documents appearing first .. searcher.search(query, new Sort(???

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/21/06, M A <[EMAIL PROTECTED]> wrote: Yeah I tried looking this up, If I wanted to do it by document id (highest docs first) , does this mean doing something like hits = searcher.search(query, new Sort(new SortField(DOC, true))); // or something like that, is this way of sorting any differe

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
Yeah I tried looking this up, If I wanted to do it by document id (highest docs first) , does this mean doing something like hits = searcher.search(query, new Sort(new SortField(DOC, true))); // or something like that, is this way of sorting any different performance wise to what I was doing befo

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
On 8/20/06, M A <[EMAIL PROTECTED]> wrote: The index is already built in date order i.e. the older documents appear first in the index, what i am trying to achieve is however the latest documents appearing first in the search results .. without the sort .. i think they appear by relevance .. wel

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread Yonik Seeley
public void search(Weight weight, org.apache.lucene.search.Filter filter, final HitCollector results) throws IOException { HitCollector collector = new HitCollector() { public final void collect(int doc, float score) { try {

Re: Search Performance Problem 16 sec for 250K docs

2006-08-21 Thread M A
Ok this is what i have done so far -> static class MyIndexSearcher extends IndexSearcher { IndexReader reader = null; public MyIndexSearcher(IndexReader r) { super(r); reader = r; } public void search(Weight weight, org.apache.lucene.search.

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread M A
The index is already built in date order i.e. the older documents appear first in the index, what i am trying to achieve is however the latest documents appearing first in the search results .. without the sort .. i think they appear by relevance .. well thats what it looked like .. I am looking

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread Erick Erickson
Talk about mails crossing in the aether.. wrote my response before seeing the last two... Sounds like you're on track. Erick On 8/20/06, Erick Erickson <[EMAIL PROTECTED]> wrote: About luke... I don't know about command-line interfaces, but if you copy your index to a different machine and

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread Erick Erickson
About Luke... I don't know about command-line interfaces, but you can copy your index to a different machine and use Luke there. I do this between Linux and Windows boxes all the time. Or, if you can mount the remote drive so you can see it, you can just use Luke to browse to it and open it up. You

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread M A
Just ran some tests .. it appears that the problem is in the sorting .. i.e. //hits = searcher.search(query, new Sort("sid", true));-> 17 secs //hits = searcher.search(query, new Sort("sid", false)); -> 17 secs hits = searcher.search(query);-> less than 1 sec .. am trying something out

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread Erik Hatcher
This is why a warming strategy like the one Solr takes is very valuable. The searchable index is always serving up requests as fast as Lucene works, which is achieved by warming a new IndexSearcher with searches/sorts/filter creation/etc. before it is swapped into use. Erik On Aug 20, 200
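The warm-then-swap idea described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Solr's or Lucene's implementation: `WarmedSearcherHolder`, `warmAndSwap` and `current` are hypothetical names, and the type parameter `S` stands in for whatever searcher class the application uses. The warm-up callback would typically run the expensive sorted queries once so that their one-off cost is paid before any user request reaches the new searcher.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical holder, not Solr or Lucene code. S stands for whatever
// searcher type the application uses.
class WarmedSearcherHolder<S> {

    // The searcher currently serving requests.
    private final AtomicReference<S> live = new AtomicReference<>();

    // Runs the warm-up work against the candidate first, then publishes it.
    // Returns the searcher that was replaced (null on the first call) so
    // the caller can decide when it is safe to close it.
    S warmAndSwap(S candidate, Consumer<S> warmUp) {
        warmUp.accept(candidate);         // slow part happens off the request path
        return live.getAndSet(candidate); // atomic swap, requests never see a cold searcher
    }

    // Request threads call this to get the searcher to use.
    S current() {
        return live.get();
    }
}
```

Requests keep using the old searcher while the new one warms, so the slow first sorted search is never paid by a user. Closing the replaced searcher needs care, since in-flight searches may still hold a reference to it.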

Re: Search Performance Problem 16 sec for 250K docs

2006-08-20 Thread M A
Ok I get your point, this still however means the first search on the new searcher will take a huge amount of time .. given that this is happening now .. i.e. new search -> new query -> get hits ->20+ secs .. this happens every 5 mins or so .. although subsequent searches may be quicker .. Am

Re: Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread Chris Hostetter
: This is because the index is updated every 5 mins or so, due to the incoming : feed of stories .. : : When you say iteration, i take it you mean, search request, well for each : search that is conducted I create a new one .. search reader that is .. yeah ... I meant iteration of your test. don'

Re: Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread M A
yes there is a new searcher opened each time a search is conducted, This is because the index is updated every 5 mins or so, due to the incoming feed of stories .. When you say iteration, i take it you mean, search request, well for each search that is conducted I create a new one .. search read

Re: Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread Chris Hostetter
: hits = searcher.search(query, new Sort("sid", true)); you don't show where searcher is initialized, and you don't clarify how you are timing your multiple iterations -- i'm going to guess that you are opening a new searcher every iteration right? sorting on a field requires pre-computing a

Re: Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread M A
what i am measuring is this Analyzer analyzer = new StandardAnalyzer(new String[]{}); if(fldArray.length > 1) { BooleanClause.Occur[] flags = {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD}; query = MultiFieldQueryP

Re: Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread Erick Erickson
This is a long time, I think you're right, it's excessive. What are you timing? The time to complete the search (i.e. get a Hits object back) or the total time to assemble the response? Why I ask is that the Hits object is designed to return the first 100 or so docs efficiently. Every 10

Search Performance Problem 16 sec for 250K docs

2006-08-19 Thread M A
Hi there, I have an index with about 250K documents, to be indexed full text. There are 2 types of searches carried out, 1. using 1 field, the other using 4 .. for a query string ... given the nature of the queries required, all stop words are maintained in the index, thereby allowing for phrasa