: At index time, I used a per-document boost (over all fields) and a
: per-field boost (over all documents). I can certainly factor out the
: first into a query boost, but I was under the impression that if I ever
: wanted to combine fields (e.g. to index all "name", "alias" and "title"
: data in a single "head" field) then I had to pre-boost the data prior
: to combining

whoa, whoa, WHOA! ... not at ALL ...

I'm not sure how you got that impression, but when you combine different
pieces of source data into a single field, Lucene has no idea where those
different pieces came from -- boosting a "title" field has no impact
whatsoever on a "head" field just because you happen to put the same piece
of text in both "title" and "head".

furthermore, field boosts apply to the entire field value: if you are
making a "head" field containing some text you think of as "title" and some
text you think of as "name", you can't set a boost on just the "title" part
of the "head" field.

as i said -- lose those field boosts and you should see a *big*
improvement ...

in general, i would advise against any attempt to combine different ideas
into a single field for the purpose of improving relevancy ... the only
reason i would ever take something like a "title" and an "author" and
combine them into a single field is to make the querying simpler/faster,
not in an attempt to improve relevancy ... query lots of separate fields,
each with its own query-time boost.

: it. I tend to believe that these (short) fields contain more relevant
: information than (long) Wikipedia articles or other documents.
: Should idf and tf take care of that short/long quality distinction? It
: sounds like you feel they should.

tf/idf will take care of recognizing that the word "John" is really
common, so it's not as significant to the query as "Bush" ... the
lengthNorm function of Similarity is what will help score shorter fields
better than longer fields.

: I'll build an index without the per field boost and see if that produces
: improved results.

try the DisjunctionMaxQuery too ... particularly if you have multiword
queries. the DisMaxQueryParser in solr that i mentioned before can be very
handy.

-Hoss
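
To make the lengthNorm and DisjunctionMaxQuery points above concrete, here
is a minimal sketch in plain Python. It is a toy model, not Lucene's API:
the function names, the per-field scores and the boost values are all
made up for illustration. It assumes the classic default length norm of
1/sqrt(number of terms in the field) and the usual dismax combination
(best field score plus a tie-breaker times the sum of the other field
scores), and it ignores every other factor in real Lucene scoring.

```python
import math
from typing import Mapping


def length_norm(num_terms: int) -> float:
    """Toy version of the classic length norm: 1 / sqrt(number of terms).

    Shorter fields get a larger norm, so a match in a 4-term title
    counts for more than the same match in a 400-term article body.
    """
    return 1.0 / math.sqrt(num_terms)


def dismax_score(
    field_scores: Mapping[str, float],
    boosts: Mapping[str, float],
    tie_breaker: float = 0.0,
) -> float:
    """Combine per-field scores the way a disjunction-max query does.

    Each field score is multiplied by its query-time boost (default 1.0).
    The result is the best boosted score plus tie_breaker times the sum
    of the remaining boosted scores.
    """
    boosted = [s * boosts.get(f, 1.0) for f, s in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    return best + tie_breaker * (sum(boosted) - best)


if __name__ == "__main__":
    # Hypothetical raw scores for one query term in three separate fields.
    scores = {"title": 0.5, "alias": 0.4, "body": 0.2}
    # Hypothetical query-time boosts, one per field ("body" defaults to 1.0).
    boosts = {"title": 3.0, "alias": 2.0}

    print("short field norm:", length_norm(4))
    print("long field norm: ", length_norm(400))
    print("pure max:        ", dismax_score(scores, boosts))
    print("with tie-breaker:", dismax_score(scores, boosts, tie_breaker=0.1))
```

The point of the sketch is that the boosts live in the query, one per
field, so they can be tuned without reindexing. That only works while
"title", "alias" and "body" stay separate fields.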