1. Redefine the archivedate field in YYMMDD format (day resolution).
2. Add a second field holding the timestamp, for sorting.
3. Use a RangeFilter on archivedate to get the results, then sort by the timestamp field.
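To illustrate why the three steps above should help, here is a rough sketch in plain Java. It does not use the Lucene API; the class DateTermSketch, the Mail type, and the sample data are all made up for the example. The assumption behind it is that a range over a date field has to walk every distinct term inside the range, so a day-resolution field leaves far fewer terms to visit than a minute- or second-resolution one, while a separate numeric timestamp still gives the exact sort order.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class DateTermSketch {
    // Hypothetical term formats: fine-grained vs. day resolution (YYMMDD).
    public static final DateTimeFormatter MINUTE = DateTimeFormatter.ofPattern("yyMMddHHmm");
    public static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyMMdd");

    /** Hypothetical stand-in for an indexed email. */
    public static final class Mail {
        public final String archiveDay; // coarse term, used only for the range restriction
        public final long timestamp;    // epoch millis, used only for sorting

        public Mail(LocalDateTime t) {
            this.archiveDay = t.format(DAY);
            this.timestamp = t.toInstant(ZoneOffset.UTC).toEpochMilli();
        }
    }

    /** Number of distinct terms the given timestamps produce at one resolution. */
    public static int distinctTerms(List<LocalDateTime> times, DateTimeFormatter fmt) {
        TreeSet<String> terms = new TreeSet<>();
        for (LocalDateTime t : times) {
            terms.add(t.format(fmt));
        }
        return terms.size();
    }

    /** Keep mails whose day term lies in [fromDay, toDay], then sort by timestamp. */
    public static List<Mail> rangeThenSort(List<Mail> mails, String fromDay, String toDay) {
        List<Mail> hits = new ArrayList<>();
        for (Mail m : mails) {
            if (m.archiveDay.compareTo(fromDay) >= 0 && m.archiveDay.compareTo(toDay) <= 0) {
                hits.add(m);
            }
        }
        hits.sort(Comparator.comparingLong((Mail m) -> m.timestamp));
        return hits;
    }

    /** Made-up data: one message every 7 minutes across the date range in the query. */
    public static List<LocalDateTime> sampleTimes() {
        List<LocalDateTime> times = new ArrayList<>();
        LocalDateTime end = LocalDateTime.of(2008, 2, 28, 23, 59);
        for (LocalDateTime t = LocalDateTime.of(2007, 12, 29, 1, 0); !t.isAfter(end); t = t.plusMinutes(7)) {
            times.add(t);
        }
        return times;
    }

    public static void main(String[] args) {
        List<LocalDateTime> times = sampleTimes();
        System.out.println("minute-resolution terms: " + distinctTerms(times, MINUTE));
        System.out.println("day-resolution terms:    " + distinctTerms(times, DAY));
    }
}
```

With the sample data the day-resolution field has one term per calendar day in the range, whereas the minute-resolution field has one term per message. In a real index the same split would be done with two fields on each document; the field names and the exact resolution are up to you.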

2008/2/27, Jamie <[EMAIL PROTECTED]>:
>
> Hi Michael & Others
>
> Ok. I've gathered some more statistics from a different machine for your
> analysis.
> (I had to switch machines because the original one was in production and
> my tests were interfering).
>
> Here are the statistics from the new machine:
>
> Total Documents:      1.2 million
> Results Returned:     900k
> Store Size:           238G (size of original documents)
> Index Size:           1.2G (lucene index size)
> Index / Store Ratio:  0.5%
>
> The search query is as follows:
>
> archivedate:[d20071229010000 TO d20080228235900]

Why is there an extra 'd' in front of the dates?

> As you can see, I am using a range query to search between specific dates.
> Question: should this query be moved to a filter rather? I did not do
> this as I needed to have the option to sort on date.
>
> There are no other specific filters applied and in this example sorting
> is turned off.
>
> On this particular machine the search time varies between 2.64 seconds
> and about 5 seconds.
>
> The limitations of this machine are that it does uses a normal IDE drive
> to house the index, not a SATA drive
>
> IOStat Statistics
>
> Linux 2.6.20-15-server 27/02/2008
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           20.25    0.00    3.23    0.34    0.00   76.19
>
> Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
> sda        7.12        50.67       186.41   38936841   143240688
>
> See attached for hardware info and the CPU call tree (taken from YourKit).
>
> I would appreciate your recommendations.
>
> Jamie
>
> h t wrote:
> Hi Michael,
> I guess the hotspot of lucene is
> org.apache.lucene.search.IndexSearcher.search()
>
> Hi Jamie,
> What's the original text size of a million emails?
> I estimate the size of an email is around 100k, is this true?
> When you doing search, what kind keywords did you input, words or short
> sentence?
> How many results return?
> Did you use filter to shrink the results size?
>
> 2008/2/27, Michael Stoppelman <[EMAIL PROTECTED]>:
> So you're saying searches are taking 10 seconds on a 5G index? If so
> that
> seems ungodly slow.
> If you're on *nix, have you watched your iostat statistics? Maybe
> something
> is hammering your hds.
> Something seems amiss.
>
> What lucene methods were pointed to as hotspots by YourKit?