Re: Relevancy Practices

2010-05-05 Thread Avi Rosenschein
On Wed, May 5, 2010 at 5:08 PM, Grant Ingersoll wrote: > > On May 2, 2010, at 5:50 AM, Avi Rosenschein wrote: > > > On 4/30/10, Grant Ingersoll wrote: > >> > >> On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote: > >>> Also, tuning the algorithms to the users can be very important. For > >>> ins

Re: Relevancy Practices

2010-05-05 Thread Peter Keegan
The feedback came directly from customers and customer facing support folks. Here is an example of a query with keywords: nurse, rn, nursing, hospital. The top 2 hits have scores of 26.86348 and 26.407215. To the customer, both results were equally relevant because all of their keywords were in the

Re: Relevancy Practices

2010-05-05 Thread Grant Ingersoll
Thanks, Peter. Can you share what kind of evaluations you did to determine that the end user believed the results were equally relevant? How formal was that process? -Grant On May 3, 2010, at 11:08 AM, Peter Keegan wrote: > We discovered very soon after going to production that Lucene's score

Re: Relevancy Practices

2010-05-05 Thread Grant Ingersoll
On May 2, 2010, at 5:50 AM, Avi Rosenschein wrote: > On 4/30/10, Grant Ingersoll wrote: >> >> On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote: >>> Also, tuning the algorithms to the users can be very important. For >>> instance, we have found that in a basic search functionality, the default

Re: Relevancy Practices

2010-05-03 Thread Ivan Provalov
ns. 5. Some of the tools we use constantly - Lucene’s query Explanation and Luke. Thanks, Ivan Provalov --- On Thu, 4/29/10, Grant Ingersoll wrote: > From: Grant Ingersoll > Subject: Relevancy Practices > To: java-user@lucene.apache.org > Date: Thursday, April 29, 2010,

Re: Relevancy Practices

2010-05-03 Thread Peter Keegan
We discovered very soon after going to production that Lucene's scores were often 'too precise'. For example, a page of 25 results may have several different score values, and all within 15% of each other, but to the end user all 25 results were equally relevant. Thus we wanted the secondary sort f
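
One way to picture the "too precise" problem is to group scores into coarse bands and let a secondary key decide the order inside each band. The following is only a minimal sketch under stated assumptions, not the approach the poster actually used (the message is cut off before it says). The `Hit` class, the `posted_days_ago` field, the `bucket` and `rerank` helpers, and the reuse of the 15% figure as a band width are all hypothetical. The first two scores are the ones quoted in the 2010-05-05 message; the third is made up.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    score: float
    posted_days_ago: int  # hypothetical secondary sort key


def bucket(score, top_score, tolerance=0.15):
    """Band index: 0 for scores within `tolerance` of the top score, 1 for the next band, etc."""
    if top_score <= 0:
        return 0
    return int((top_score - score) / (top_score * tolerance))


def rerank(hits, tolerance=0.15):
    """Sort by score band first, then by the secondary key within each band."""
    if not hits:
        return []
    top = max(h.score for h in hits)
    return sorted(hits, key=lambda h: (bucket(h.score, top, tolerance), h.posted_days_ago))


hits = [
    Hit("a", 26.86348, 10),   # score quoted in the thread
    Hit("b", 26.407215, 2),   # score quoted in the thread
    Hit("c", 12.0, 1),        # made-up, clearly weaker hit
]
ranked = [h.doc_id for h in rerank(hits)]
print(ranked)
```

Here "a" and "b" fall into the same band, so the fresher "b" moves ahead of "a", while "c" stays last because its score is far outside the band.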

Re: Relevancy Practices

2010-05-03 Thread Uwe Goetzke
2010 16:59 To: java-user@lucene.apache.org Subject: Re: Relevancy Practices Hi Grant, You're welcome to use any of my slides (Dave's got them), with attribution of course. BUT Have you considered a section something like "why the hell do you think Relevancy tweaking is

Re: Relevancy Practices

2010-05-02 Thread Avi Rosenschein
On 4/30/10, Grant Ingersoll wrote: > > On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote: >> Also, tuning the algorithms to the users can be very important. For >> instance, we have found that in a basic search functionality, the default >> query parser operator OR works very well. But on a page

Re: Relevancy Practices

2010-04-30 Thread MitchK
some thoughts. I don't think I'm telling you much that is new; however, if you have any questions or want to know more about this or that, please ask. Unfortunately I can't go to ApacheCon, but I hope this helps you give a good presentation. Kind regards - Mitch

Re: Relevancy Practices

2010-04-30 Thread Grant Ingersoll
On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote: > Also, tuning the algorithms to the users can be very important. For > instance, we have found that in a basic search functionality, the default > query parser operator OR works very well. But on a page for advanced users, > who want to very pre

Re: Relevancy Practices

2010-04-30 Thread Avi Rosenschein
On Thu, Apr 29, 2010 at 5:59 PM, Mark Bennett wrote: > Hi Grant, > > You're welcome to use any of my slides (Dave's got them), with attribution > of course. > > BUT > > Have you considered a section something like "why the hell do you think > Relevancy tweaking is gonna save you!?!?" > Basi

Re: Relevancy Practices

2010-04-29 Thread Mark Bennett
Hi Grant, You're welcome to use any of my slides (Dave's got them), with attribution of course. BUT Have you considered a section something like "why the hell do you think Relevancy tweaking is gonna save you!?!?" Basically that, as a corpus grows exponentially, so do results list sizes, so

RE: Relevancy Practices

2010-04-29 Thread Fornoville, Tom
and the scoring and relevancy in the search engine itself. Cheers, Tom -Original Message- From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll Sent: Thursday, April 29, 2010 16:15 To: java-user@lucene.apache.org Subject: Relevancy Practices I'm putting on a talk

Relevancy Practices

2010-04-29 Thread Grant Ingersoll
I'm putting on a talk at Lucene Eurocon (http://lucene-eurocon.org/sessions-track1-day2.html#1) on "Practical Relevance" and I'm curious as to what people put into practice for testing and improving relevance. I have my own inclinations, but I don't want to muddy the water just yet. So, if you