If you are already setting the document boost based on the "date" of the Document, then the next thing you should familiarize yourself with is the Similarity.coord function. Its specific purpose is for dealing with Queries which "aggregate" other queries (like a BooleanQuery does with its clauses), letting you determine how much "penalty" documents receive for only matching on a subset of the clauses.
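
To make the effect of coord concrete, here is a rough stand-alone sketch in plain Java. It does not use any Lucene classes: the Coord interface, the three implementations, and the clause-score numbers are all made up for illustration, and the only thing taken from Lucene is the idea that the summed clause scores of an aggregate query get multiplied by a factor based on overlap (clauses matched) and maxOverlap (clauses in the query).

```java
public class CoordSketch {
    /** Hypothetical stand-in for the coord part of a Similarity. */
    interface Coord {
        float coord(int overlap, int maxOverlap);
    }

    // Ratio of matched clauses to total clauses (overlap/maxOverlap).
    static final Coord RATIO = (overlap, maxOverlap) -> (float) overlap / maxOverlap;

    // Constant factor: matching only a subset of the clauses costs nothing.
    static final Coord NONE = (overlap, maxOverlap) -> 1.0f;

    // One possible "lower impact" function (an arbitrary choice, not from Lucene).
    static final Coord SOFT =
            (overlap, maxOverlap) -> (float) Math.sqrt((double) overlap / maxOverlap);

    /** Summed clause scores scaled by the coord factor. */
    static float score(float sumOfClauseScores, int overlap, int maxOverlap, Coord c) {
        return sumOfClauseScores * c.coord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        // Two-clause OR query. Illustrative numbers only:
        // an older document matches both clauses with a summed score of 0.9,
        // a newer (more boosted) document matches one clause with a score of 1.0.
        String[] names = {"RATIO", "NONE", "SOFT"};
        Coord[] coords = {RATIO, NONE, SOFT};
        for (int i = 0; i < coords.length; i++) {
            float older = score(0.9f, 2, 2, coords[i]);
            float newer = score(1.0f, 1, 2, coords[i]);
            System.out.println(names[i] + ": older=" + older + " newer=" + newer
                    + " -> " + (newer > older ? "newer first" : "older first"));
        }
    }
}
```

With these made-up numbers, only the constant factor lets the newer single-match document outrank the older document that matches both clauses; the ratio and the square-root variant both still favor the document with more matching clauses.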

The conventional wisdom is that a document matching more clauses is "better" than a document matching fewer clauses, even if the aggregate score of the second document is a little higher than the first. If you don't like that conventional wisdom, you can make the coord method of your Similarity return a constant value (like "1") to eliminate its impact completely, or define some function that has a lower impact than the overlap/maxOverlap algorithm used in the DefaultSimilarity.

Since eliminating the impact completely is a common need among "artificially" constructed queries (like Prefix queries and Wildcard queries) there is actually a constructor for BooleanQuery that takes in a boolean to do this.

Lastly: if you want more exact control over the way the dates of your documents influence the score (without mucking with norms to make them floats instead of bytes) consider using FunctionQuery ... searching the archives will pop up several examples of how it can be used, and you can find it in the Solr code base...

http://incubator.apache.org/solr/
http://incubator.apache.org/solr/docs/api/org/apache/solr/search/function/package-summary.html

: Date: Fri, 8 Sep 2006 09:45:13 +0200
: From: Marcus Falck <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Changing the Scoring api for OR parameters
:
: Hi everyone,
:
: I want to override the default scoring when it comes to queries
: containing the OR operator.
:
: For example if I got the following headlines in my index
:
: "Sun sues Microsoft"
: "Microsoft want to buy Tiscali"
: ".NU domain sues Microsoft"
: "The sun is shining"
: "Sun brings antitrust suit against Microsoft"
:
: Those documents have been boosted in desc fashion ("Sun sues Microsoft"
: has higher calculated norm value than "Sun brings antitrust suit against
: Microsoft"),
:
: The similarity class that has been used has made the norm values to be
: exactly as the boost value ( I have even modified the norm to be a float
: so I won't lose precision ).
:
: If I perform a search for: Microsoft OR Sun
:
: The topranked results will almost certainly be:
:
: Sun sues Microsoft
: Sun Brings antitrust suit against Microsoft
: ....
:
: I just want the documents returned like this:
:
: "Sun sues Microsoft"
: "Microsoft want to buy Tiscali"
: ".NU domain sues Microsoft"
: "The sun is shining"
: "Sun brings antitrust suit against Microsoft"
:
: I have to get this to work since I'm indexing news material and the
: customers are only interested in the newest articles ( so the date of
: the article is being used as a boost factor).
:
: Any ideas? My rank changes to lucene works as expected when it comes to
: AND operator and single term queries.
:
: /
: Regards
:
: Marcus Falck

-Hoss