Storing the timestamp as a long and then searching with NumericRangeQuery 
will provide you with exactly what you're looking for. It is an efficient 
way to run range searches over numeric data.
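
As a rough sketch of the value handling only (plain Java, no Lucene on the 
classpath; the class and method names are made up for illustration), this 
is one way to turn two millisecond timestamps into hour-aligned bounds for 
a range search. It assumes the field holds epoch milliseconds and that hour 
boundaries are counted from the epoch.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Lucene: computes hour-aligned bounds
// for a range search over epoch-millisecond timestamps.
final class HourRange {
    static final long HOUR_MS = TimeUnit.HOURS.toMillis(1);

    // Rounds a millisecond timestamp down to the start of its hour.
    // Math.floorDiv keeps the rounding correct for negative values too.
    static long floorToHour(long epochMillis) {
        return Math.floorDiv(epochMillis, HOUR_MS) * HOUR_MS;
    }

    // Returns {lowerInclusive, upperExclusive} covering every full hour
    // touched by the interval [fromMillis, toMillis].
    static long[] hourRange(long fromMillis, long toMillis) {
        long lower = floorToHour(fromMillis);
        long upper = floorToHour(toMillis) + HOUR_MS;
        return new long[] { lower, upper };
    }

    public static void main(String[] args) {
        long[] bounds = hourRange(1316701234567L, 1316708000000L);
        // With Lucene 3.x, these bounds could then be handed to something like
        // NumericRangeQuery.newLongRange("timestamp", lower, upper, true, false),
        // where "timestamp" is an assumed field name.
        System.out.println(bounds[0] + " .. " + bounds[1]);
    }
}
```

The stored value can stay at millisecond resolution; only the query bounds 
are rounded to the hour.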

Calling optimize() will reduce the size of your index and improve search 
time, at the cost of a large burst of overhead while it runs. Unless your 
searches are getting noticeably slower or your index is expanding rapidly, 
you're better off using IndexReader.reopen() to pick up regular updates and 
calling optimize() only occasionally.

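Note that IndexReader.reopen() returns a new reader only if the index has 
changed; otherwise it returns the same instance. When you do get a new 
reader back, close the original IndexReader if it is still open to avoid 
memory leaks.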

Jason

-----Original Message-----
From: Sam Jiang [mailto:sam.ji...@karoshealth.com] 
Sent: Thursday, September 22, 2011 10:18 AM
To: java-user@lucene.apache.org
Subject: searching / sorting on timestamp and update efficiency

Hi all

I have some questions about how I should store timestamps.

From my reading, I can see two ways of indexing timestamps:
DateTools (which uses formatted timestamp strings) and
NumericUtils (which uses a long?).

I'm not sure which one gives better performance in my scenario:
Each of my documents needs an indexed, millisecond-resolution
timestamp. Almost all searches would be invoked with a range filter
(searching at hour resolution is sufficient).
There are usually 2-4 updates to this timestamp field for recently indexed
documents. Afterwards, updates to this field or any other field are
rare.

It would be great if somebody could advise me on which format I should use.
P.S. Should I be calling optimize() often, given my frequent updates?

thanks

-- 
Sam Jiang | karoshealth
(っ゚Д゚;)っ hidden cat here
7 Father David Bauer Drive, Suite 201
Waterloo, ON, N2L 0A2, Canada
www.karoshealth.com
