On Tue, May 02, 2006 at 01:57:46PM -0700, Chris Hostetter wrote: > As far as I can figure, your best bet really is to use a seperate > field for each title -- but instead of combining the queries into a > BooleanQuery, use a DisjunctionMaxQuery with a tiebreaker value of > 0.0f ... the whole purpose of that query is to score documents based > on the maximum score of a sub query, but you still have to make the > sub queries your self, and there's no easy way to make a query that > only searches the first "chunk" of terms from a field.
I don't understand what you mean with "first 'chunk' of terms from a field". I actually tried to implement this method, but unfortunately I face yet another problem. Document 1: Title-1: Rijksmuseum Title-2: Rijksmuseum Amsterdam Document 2: Title-1: Amsterdam Title-2: NLAMS Document 1 gets the higher score in this case (when I do a search on 'Amsterdam'). The reason is that the term 'Amsterdam' has a higher docFreq in Title-1 than in Title-2 (as Title-2 is only seldom used). I think if I combine your .setOmitNorms(true) suggestion with this second suggestion it may actually work - that will be my next attempt. bye, /gst
signature.asc
Description: Digital signature