Hi, I'd like to apply a score filter. I realise that filtering on an absolute score threshold (i.e. dropping anything that scores less than x) is pretty meaningless.
In my case I want to filter on relative score, or on some function of score that looks for clustering of documents around certain score values.

Context: I have set up field boosts such that a query hit on one indexed field will, in theory, result in a score one or more orders of magnitude greater than a hit on some other field. So if I have two fields, A and B, and I'm really interested in hits on A, and only interested in hits on B if there were none on A, I boost A by 1000 relative to B. The resulting score should reflect this.

The ability to do this becomes important when we want to re-order the search results by some other field (not score) and are not interested in displaying the least relevant documents.

It is easy to write a basic 'document collector/result filter' that uses relative score information to filter out documents whose score is more than some number of orders of magnitude below the best score, but I'm sure this could be more elegantly generalised into some mathematical "relevance/significance" model or function that determines an optimal cutoff based on the clustering of results around scores.

e.g. if my top 5 documents all score between 0.9 and 0.7 and the remaining 10 score less than 0.01, then we could sensibly take the top 5 docs as most relevant.

Has anyone had experience of doing such a thing?

Regards,
Joel
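
P.S. To make the two ideas concrete, here is a minimal sketch in plain Java. It is only an illustration under some assumptions: it works on an array of scores already sorted in descending order (as a hit list would be), it does not touch any Lucene API, and the class name, method names and the `minFraction` / `minRatio` parameters are all made up for this example. The first method is the basic relative filter; the second is one naive way of looking for a "cluster break", by cutting at the largest ratio between consecutive scores.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene. Scores must be sorted descending.
final class RelativeScoreCutoff {

    // Basic relative filter: keep scores that are at least
    // minFraction * bestScore (e.g. minFraction = 0.001f for three
    // orders of magnitude below the best hit).
    public static float[] filterByRelativeScore(float[] sortedScores, float minFraction) {
        if (sortedScores.length == 0) {
            return new float[0];
        }
        float threshold = sortedScores[0] * minFraction;
        int keep = 0;
        while (keep < sortedScores.length && sortedScores[keep] >= threshold) {
            keep++;
        }
        return Arrays.copyOf(sortedScores, keep);
    }

    // Naive clustering cutoff: find the largest ratio between consecutive
    // scores and return how many documents sit above that gap. If no gap
    // is larger than minRatio, all documents are kept.
    public static int cutoffAtLargestGap(float[] sortedScores, double minRatio) {
        int n = sortedScores.length;
        int cut = n;
        double bestRatio = minRatio;
        for (int i = 0; i < n - 1; i++) {
            // Floating-point division: a zero lower score gives Infinity
            // (or NaN for 0/0, which never compares greater).
            double ratio = (double) sortedScores[i] / sortedScores[i + 1];
            if (ratio > bestRatio) {
                bestRatio = ratio;
                cut = i + 1;
            }
        }
        return cut;
    }

    public static void main(String[] args) {
        // Scores shaped like the example above: a top cluster, then a big drop.
        float[] scores = {0.9f, 0.85f, 0.8f, 0.75f, 0.7f, 0.009f, 0.008f, 0.005f};
        System.out.println(cutoffAtLargestGap(scores, 10.0));             // 5
        System.out.println(filterByRelativeScore(scores, 0.1f).length);   // 5
    }
}
```

The largest-gap rule is just the simplest thing I could think of; it is not meant as the proper "significance" model, which is really what I am asking about.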