Boost more recent document

2011-11-30 Thread Zhang, Lisheng
Hi, We need to boost documents that are more recent (each doc has a timestamp attribute). It seems that we cannot use a doc boost at index time because it will be condensed into one byte (which cannot differentiate 365 days), so we may use a payload (saving the timestamp as the payload) to boost at search time.
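
To make the one-byte concern concrete, here is a minimal sketch of a lossy one-byte float encoding. It is modelled on Lucene's norm encoding as I recall it, so treat the bit layout as an assumption rather than the library's exact code. The class name, the method names, and the "1 + day/365" boost scheme are all hypothetical and chosen only for illustration. In Lucene the index-time boost is also folded into the field norm together with length normalization, which this sketch ignores.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a one-byte float encoding in the style of Lucene's norms.
// The bit layout below is an assumption, not copied from the library.
final class BoostByteDemo {
    // Offset that maps the smallest representable exponent to code 0 (assumed).
    private static final int FZERO = (63 - 15) << 3;

    // Squeeze a float into one byte by keeping only its top bits.
    static byte encodeBoost(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;                        // drop the low 21 mantissa bits
        if (small <= FZERO) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;  // zero/negative, or underflow
        }
        if (small >= FZERO + 0x100) {
            return (byte) -1;                          // overflow: largest code
        }
        return (byte) (small - FZERO);
    }

    // Expand a one-byte code back to the float it stands for.
    static float decodeBoost(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Hypothetical scheme: day d (0 = oldest) gets boost 1 + d/days.
    // Returns how many different byte codes those boosts collapse into.
    static int distinctEncodedBoosts(int days) {
        Set<Byte> codes = new HashSet<>();
        for (int d = 0; d < days; d++) {
            float boost = 1.0f + (float) d / days;
            codes.add(encodeBoost(boost));
        }
        return codes.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct codes for 365 daily boosts: "
                + distinctEncodedBoosts(365));
    }
}
```

Under these assumptions, boosts between 1.0 and 2.0 collapse into a handful of codes, so documents that are days or even weeks apart end up with the same stored boost.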

Re: Boost more recent document

2011-11-30 Thread Simon Willnauer
On Wed, Nov 30, 2011 at 6:59 PM, Zhang, Lisheng wrote:
> Hi,
>
> We need to boost document which is more recent (each doc has time stamp attribute). It seems that we cannot use doc boost at index time because it will be condensed into one byte (cannot differentiate 365 days), so we may u

RE: Boost more recent document

2011-11-30 Thread Zhang, Lisheng
Thanks very much for your help! I got the point; the only problem is that I cannot afford to use FieldCache, because in our app we have many Lucene index data folders. Is there another simple way? Thanks again, Lisheng
-Original Message- From: Simon Willnauer [mailto:simon.willna...@googl
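
A rough back-of-the-envelope sketch of why caching a timestamp per document can get expensive when many indexes are open at once. It assumes one 8-byte long per document per open index; the document and index counts are made-up numbers for illustration, not figures from this thread.

```java
// Illustrative estimate only; the sizes used in main() are hypothetical.
final class FieldCacheEstimate {
    private static final long BYTES_PER_TIMESTAMP = 8L; // one long per document (assumed)

    // Bytes needed to hold one cached timestamp per document in every open index.
    static long estimateBytes(long docsPerIndex, int openIndexes) {
        long perIndex = Math.multiplyExact(docsPerIndex, BYTES_PER_TIMESTAMP);
        return Math.multiplyExact(perIndex, (long) openIndexes);
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(1_000_000L, 200);
        System.out.println("approx. MB: " + bytes / (1024 * 1024));
    }
}
```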

Re: Boost more recent document

2011-11-30 Thread Simon Willnauer
If you use LogMergePolicy, i.e. do merges in order, you could use the absolute docID as a relative age value. Smaller docIDs mean older documents, since docIDs are assigned in the order documents are added. Maybe this works for you? simon
On Wed, Nov 30, 2011 at 9:08 PM, Zhang, Lisheng wrote:
> Thanks very much for your helps! I got the point, only proble
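
A minimal sketch of the docID idea, assuming documents are added in chronological order and that merges preserve that order, so a higher docID means a document added later. The class, method, and the `maxExtra` parameter are hypothetical; this is not a Lucene API, only the arithmetic one might plug into a custom scorer.

```java
// Illustrative only: turns a docID into a recency multiplier.
final class DocIdRecency {
    // Returns a multiplier in [1, 1 + maxExtra): docID 0 (oldest) gets 1.0,
    // and the multiplier grows linearly toward the newest document.
    static float recencyBoost(int docId, int maxDoc, float maxExtra) {
        if (maxDoc <= 0 || docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range");
        }
        float position = (float) docId / maxDoc; // 0 = oldest, close to 1 = newest
        return 1.0f + maxExtra * position;
    }

    public static void main(String[] args) {
        System.out.println(recencyBoost(0, 100, 0.5f));
        System.out.println(recencyBoost(99, 100, 0.5f));
    }
}
```

Note that this only captures the relative order of documents, not how far apart in time they are.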

RE: Boost more recent document

2011-11-30 Thread Zhang, Lisheng
Hi, Thanks for the very interesting idea! Currently we use Lucene 2.3.2 with the default merge policy (at any time we have a few segments, and after some accumulation small segments are merged into big ones). I need to double-check whether docId can reflect doc age. But I have one concern: docI

RE: Boost more recent document

2011-11-30 Thread Zhang, Lisheng
Hi Simon, Sorry, I found that I cannot use payloads for this purpose, because a payload can be accessed only through term positions and we do not use the timestamp in queries. Ideally, it would be great to have some doc-level "payload" accessible through docId. Then your initial suggestion to use Cu
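
What a doc-level value "accessible through docId" could look like, as a hedged sketch: a plain array of timestamps indexed by docId, plus a half-life decay applied at search time. All names here are hypothetical and nothing below is a Lucene class. Keeping such an array in memory is essentially what a field cache does, so the memory concern raised earlier in the thread would still apply.

```java
// Illustrative only: per-document timestamps looked up by docId.
final class DocTimestamps {
    private static final double MILLIS_PER_DAY = 86_400_000.0;
    private final long[] timestampMillis; // index = docId

    DocTimestamps(long[] timestampMillis) {
        this.timestampMillis = timestampMillis.clone();
    }

    // 1.0 for a document stamped "now", halving every halfLifeDays.
    double recencyFactor(int docId, long nowMillis, double halfLifeDays) {
        double ageDays = Math.max(0L, nowMillis - timestampMillis[docId]) / MILLIS_PER_DAY;
        return Math.pow(0.5, ageDays / halfLifeDays);
    }

    // Combines a raw relevance score with the recency factor (one possible choice).
    double boostedScore(float rawScore, int docId, long nowMillis, double halfLifeDays) {
        return rawScore * (1.0 + recencyFactor(docId, nowMillis, halfLifeDays));
    }

    public static void main(String[] args) {
        long day = 86_400_000L;
        long now = 100 * day;
        DocTimestamps ts = new DocTimestamps(new long[] {now, now - 30 * day});
        System.out.println(ts.boostedScore(2.0f, 0, now, 30.0));
        System.out.println(ts.boostedScore(2.0f, 1, now, 30.0));
    }
}
```

Multiplying by `1 + factor` keeps old documents at their raw score instead of pushing them toward zero; other combinations are equally plausible.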