Re: document diversity

2009-10-06 Thread Simon Willnauer
Michael, this sounds like a pretty good use case for CustomScoreQuery (http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/function/CustomScoreQuery.html). The org.apache.lucene.search.function package provides flexible programmatic control over document scores. You boost up documen

Re: document diversity

2009-10-06 Thread Paul Libbrecht
Just as you can add a query that will boost better things with a higher quality, you can add a query for a higher revenue. Basically, the default operator "should" in boolean-clauses can be used exactly for that: do not force this query to be matched but raise boost if there's something tha

Re: document diversity

2009-10-06 Thread Michael Masters
My initial description may have been a little abstract. Maybe I should explain exactly what I'm trying to do. My company has various revenue channels, one of which is per click. If a user does a search, we would like to show results with the greatest revenue, although we don't want people to be abl

Re: document diversity

2009-10-03 Thread Grant Ingersoll
I'm curious, can you elaborate more on the deeper use case for this? Perhaps just implementing faceting on doc type would be sufficient? That way users can drill in on doc type. Alternatively, I suppose you could implement a hit collector that accesses a field cache on the doc type field
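
The collector idea above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Lucene collector API: the `doc_type_of` dictionary is a hypothetical stand-in for a field cache on the doc type field, `ranked_doc_ids` is assumed to be already sorted by score, and the function name and parameters are invented for the example.

```python
from collections import Counter

def collect_with_type_cap(ranked_doc_ids, doc_type_of, max_per_type, limit):
    """Keep the best-ranked hits while allowing at most max_per_type per doc type."""
    kept = []
    seen = Counter()  # how many hits of each type have been kept so far
    for doc_id in ranked_doc_ids:
        if len(kept) >= limit:
            break
        doc_type = doc_type_of[doc_id]  # stand-in for a field cache lookup
        if seen[doc_type] >= max_per_type:
            continue  # this type already has its share; skip the hit
        seen[doc_type] += 1
        kept.append(doc_id)
    return kept

# Hypothetical data: doc id -> type, and doc ids in descending score order.
doc_type_of = {0: "pdf", 1: "pdf", 2: "pdf", 3: "txt", 4: "html", 5: "txt"}
ranked = [1, 0, 2, 3, 5, 4]
page = collect_with_type_cap(ranked, doc_type_of, max_per_type=2, limit=4)
```

Here the third pdf hit is skipped, so the page holds two pdf and two txt documents. A cap keeps the original ranking within each type but does not guarantee an exact distribution.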

Re: document diversity

2009-10-01 Thread Tricia Williams
Hi Mike, The first thing that comes to mind is to run a query for each document type (assuming that you have a field that stores the type) and qualify the document type: for example type:pdf. Then you would have to write something to combine the query results drawing an equal number of hits
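
A minimal sketch of the combining step, assuming each per-type query (for example type:pdf) has already returned its hits as a ranked list. The function name, the `results_by_type` mapping, and the sample file names are hypothetical; the merge is a simple round-robin, which is one way to draw an equal number of hits from each type.

```python
from itertools import zip_longest

def interleave_by_type(results_by_type, limit):
    """Round-robin merge: take rank 1 of every type, then rank 2, and so on."""
    merged = []
    if limit <= 0:
        return merged
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for rank_slice in zip_longest(*results_by_type.values()):
        for hit in rank_slice:
            if hit is None:
                continue  # this type has run out of hits
            merged.append(hit)
            if len(merged) == limit:
                return merged
    return merged

# Hypothetical per-type results, each already ranked by its own query.
results = {
    "pdf": ["a.pdf", "b.pdf", "c.pdf"],
    "txt": ["a.txt"],
    "html": ["a.html", "b.html"],
}
top = interleave_by_type(results, 5)
```

When one type runs out of hits, the remaining slots are filled from the other types, so the distribution is equal only as far as the data allows.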

Re: document diversity

2009-10-01 Thread Phil Whelan
Hi Mike, I'd simply store a field "doctype" with values "pdf", "txt", "html" and perform a separate search for each type. Although, I'd be interested if anyone has a cooler way of doing this. Cheers, Phil On Thu, Oct 1, 2009 at 9:56 AM, Michael Masters wrote: > I was wondering if there is any w

document diversity

2009-10-01 Thread Michael Masters
I was wondering if there is any way to control what kind of documents are returned from a search. For example, let's say we have an index built from different types of documents (pdf, txt, html, etc.). Is there a way to have the first x results have a specified distribution of document types? It wou