Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 5:22 PM, Günther Starnberger wrote: On Thu, May 18, 2006 at 10:53:23PM +0200, Marcus Falck wrote: Hello, The term scorer will give a higher score on documents containing both terms. This is a problem (in our application) since in this case we want the same score on documents a

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
From: Günther Starnberger [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 23:22 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On Thu, May 18, 2006 at 10:53:23PM +0200, Marcus Falck wrote: Hello, > The term scorer will give a higher score on d

Re: SV: Sort problematics

2006-05-18 Thread Günther Starnberger
On Thu, May 18, 2006 at 10:53:23PM +0200, Marcus Falck wrote: Hello, > The term scorer will give a higher score on documents containing both > terms. This is a problem (in our application) since in this case we want > the same score on documents as long as they contain 1 of the terms > (since we are d

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: If I use Lucene's default implementation of the TermScorer and search for "you" OR "her", the term scorer will give a higher score on documents containing both terms. This is a problem (in our application) since in this case we want the same score o

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Well, that book is cool =) From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 22:56 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On May 18, 2006, at 4:25 PM, Marcus Falck wrote: > Where can I read more about the luc

Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 4:25 PM, Marcus Falck wrote: Where can I read more about the Lucene sort implementation? Is there any documentation on sorting other than the Lucene API docs? Well, there is "Lucene in Action", which covers sorting in a fair bit of detail. I hear that book i

SV: Sort problematics

2006-05-18 Thread Marcus Falck
From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 20:39 To: java-user@lucene.apache.org Subject: Re: Sort problematics On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: >> I'm well aware of the trade-offs. But if you

SV: Sort problematics

2006-05-18 Thread Marcus Falck
@lucene.apache.org Subject: Re: Sort problematics On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: > I'm well aware of the trade-offs. But if you were aware of the large amounts > of data that this system should be able to search you wouldn't propose the > usage of a data

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: I'm well aware of the trade-offs. But if you were aware of the large amounts of data that this system should be able to search you wouldn't propose the usage of a database. If you have a hard requirement of instantly seeing any update, you ca

SV: SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
ailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 20:09 To: java-user@lucene.apache.org Subject: Re: SV: SV: SV: SV: Sort problematics On May 18, 2006, at 11:33 AM, Marcus Falck wrote: > But it will still require A LOT of RAM just to cache! Well, the more RAM you have the better when it comes

SV: Sort problematics

2006-05-18 Thread Marcus Falck
ch machine won't have 500 million docs but maybe around 100 million. So I'm still interested in changing the relevance. Any ideas? / Marcus From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Thu 2006-05-18 17:43 To: java-user@lucene.apach

Re: SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
On May 18, 2006, at 11:33 AM, Marcus Falck wrote: But it will still require A LOT of RAM just to cache! Well, the more RAM you have the better when it comes to Solr responsiveness, I'm sure. But, Solr leverages some caching cleverness so the queries and filters used most frequently are in

Re: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: But since my "real" index will be around 2 TB in size, I don't think sorting is the right way to go? I'm pretty sure I will have to modify the ranking. They are both sorts, and they both use a priority queue. The differences shouldn't be that gr

SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
But it will still require A LOT of RAM just to cache! -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 17:24 To: java-user@lucene.apache.org Subject: Re: SV: SV: SV: Sort problematics On 5/18/06, Marcus Falck <[EMAIL PROTECTED]>

Re: SV: SV: SV: Sort problematics

2006-05-18 Thread Yonik Seeley
On 5/18/06, Marcus Falck <[EMAIL PROTECTED]> wrote: Doesn't Solr use the same sort implementation as Lucene? Yes, but Solr handles the mechanics of warming up a new searcher in the background to avoid those lengthy first-time hits to the FieldCache and norms, and it warms any configured caches

SV: Sort problematics

2006-05-18 Thread Marcus Falck
e ranking. And yes, the data must be instantly available. / Marcus -Original Message- From: karl wettin [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 16:48 To: java-user@lucene.apache.org Subject: Re: Sort problematics On Thu, 2006-05-18 at 16:22 +0200, Marcus Falck wrote:

Re: Sort problematics

2006-05-18 Thread karl wettin
On Thu, 2006-05-18 at 16:22 +0200, Marcus Falck wrote: > Doesn't Solr use the same sort implementation as Lucene? Solr comes with more cache. Is it a requirement that the new data is instantly available?

SV: SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Doesn't Solr use the same sort implementation as Lucene? -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 14:57 To: java-user@lucene.apache.org Subject: Re: SV: SV: SV: Sort problematics On May 18, 2006, at 7:04 AM, Marcus Falck

Re: SV: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
: Re: SV: SV: Sort problematics On May 18, 2006, at 6:41 AM, Marcus Falck wrote: Yes Erik, I'm instantiating a new IndexSearcher for every search. Then don't :) You only need a new IndexSearcher instance when the index itself has changed. -Original Message- From: Er

SV: SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Yes, I know. But the index changes constantly. / Marcus -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 12:52 To: java-user@lucene.apache.org Subject: Re: SV: SV: Sort problematics On May 18, 2006, at 6:41 AM, Marcus Falck wrote: >

Re: SV: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
PROTECTED] Sent: 18 May 2006 12:08 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On May 18, 2006, at 4:52 AM, Marcus Falck wrote: I have slow subsequent searches. And if I get the cache up and running, is it persisted to disk? No, Lucene's caches are not persist

SV: SV: Sort problematics

2006-05-18 Thread Marcus Falck
Yes Erik, I'm instantiating a new IndexSearcher for every search. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 18 May 2006 12:08 To: java-user@lucene.apache.org Subject: Re: SV: Sort problematics On May 18, 2006, at 4:52 AM, Marcus Falck wrote

Re: SV: Sort problematics

2006-05-18 Thread Erik Hatcher
ou're likely not leveraging any caches at all. Erik /Marcus From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Wed 2006-05-17 16:31 To: java-user@lucene.apache.org Subject: Re: Sort problematics On 5/17/06, Marcus Falck <[EMAIL PROTECTE

SV: Sort problematics

2006-05-18 Thread Marcus Falck
I have slow subsequent searches. And if I get the cache up and running, is it persisted to disk? /Marcus From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Wed 2006-05-17 16:31 To: java-user@lucene.apache.org Subject: Re: Sort problematics On 5/17/06

Re: Sort problematics

2006-05-17 Thread Yonik Seeley
On 5/17/06, Marcus Falck <[EMAIL PROTECTED]> wrote: I noticed something quite interesting: if I search for IndexId:x (IndexId is unique) with a sort, it still takes a very long time, which it doesn't without the sort. This will only be the case the first time you sort on a field because a FieldCache

Re: Sort problematics

2006-05-17 Thread karl wettin
On Wed, 2006-05-17 at 14:23 +0200, Marcus Falck wrote: > > I noticed something quite interesting: if I search for IndexId:x > (IndexId is unique) with a sort, it still takes a very long time, which > it doesn't without the sort. > > Does anybody know why? I mean the resultset contains exactly 1 > doc

Sort problematics

2006-05-17 Thread Marcus Falck
I noticed something quite interesting: if I search for IndexId:x (IndexId is unique) with a sort, it still takes a very long time, which it doesn't without the sort. Does anybody know why? I mean the result set contains exactly 1 document. /Regards Marcus