Hi,

I am trying to boost the freshness of some of our documents in the index using 
the most efficient way (i.e. if 2 news stories have the same score based on the 
content then I want to promote the one that was created last)

I have tried several techniques which do not seems to be performing that well 
or do not seem very efficient. 
- range query 
- custom query
- sorting

While looking at the archives I came across this email: 
http://www.gossamer-threads.com/lists/lucene/java-user/43457 where "Andrzej 
Bialecki" proposes the addition of a column related with days (months, etc) and 
add a "1" for each day/month that has passed from the epoch. I tried his 
solution and it does not seem to performing that well. The reason (unless my 
math have completely failed me) is because the boost that this new field 
provides is always the same.

For example i tried the above idea in an index that contained 4 documents. The 
days were set from 1-4 respectively.
The field weight (using the explain from Luke) is calculated as: field weight = 
tf * idf * fieldNorm. The idf is the same for all the documents. tf is defined 
as sqrt (times the term in the doc) and field Norm = field Boost * 1 / sqrt 
(terms in field).

In my example (and I cannot find how it would be different in other scenarios) 
the field weight is the same for all the documents (and equal to idf) because 
the sqrt (times the term in doc) * 1/sqrt (terms in field) = 1.

I would appreciate any feed back on that and also any other way you would 
consider in addressing the issue of boost the freshness of documents.

Thank you in advance,

Yannis.

Reply via email to