I think this'd make a nice contribution -- eg it could be bundled up as a FieldComparator impl, eg LowMemoryStringComparator, that would compute the global ords in multiple passes with limited RAM usage.
It'd give users the space/time tradeoff... Mike On Mon, Dec 14, 2009 at 9:09 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote: > On Fri, 2009-12-11 at 14:53 +0100, Michael McCandless wrote: >> How long does Lucene take to build the ords for the toplevel reader? >> >> You should be able to just time FieldCache.getStringIndex(topLevelReader). >> >> I think your 8.5 seconds for first Lucene search was with the >> StringIndex computed per segment? > > Cold disk-cache (directly after reboot): > [2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title > [2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 9 > seconds, 412 ms > > Warm disk-cache (3 minutes after first test): > [2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title > [2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 8 > seconds, 414 ms > > The response time for the first sorted search was about 8,5 seconds, but > that was after 6 non-sorted searches without the use of explicit field > cache, so some amount of warm-up was performed. > > Caveat: I must stress that this is very much ad hoc testing. > > > ----------------- FieldCache test code > > // Meant for testing > private FieldCache.StringIndex getStringIndex( > IndexReader reader, String field) { > log.info("Requesting StringIndex for field " + field); > Profiler profiler = new Profiler(); > FieldCache.StringIndex stringIndex; > try { > stringIndex = FieldCache.DEFAULT.getStringIndex(reader, > field); > } catch (IOException e) { > log.error("Could not retrieve StringIndex", e); > return null; > } > log.info("Got StringIndex of length " + stringIndex.order.length > + " in " + profiler.getSpendTime()); > return stringIndex; > } > > ----------------- Lucene 2.4 index > > ls -l index/sb/20091201-115941/lucene/ > > -rw-rw-r-- 1 summatst summatst 12840211452 Dec 2 11:21 _0.cfx > -rw-rw-r-- 1 summatst summatst 361027455 Dec 2 11:19 _32.cfs > -rw-rw-r-- 1 summatst summatst 373374178 Dec 2 11:19 _65.cfs > -rw-rw-r-- 1 summatst summatst 438076782 Dec 2 11:21 _98.cfs > -rw-rw-r-- 1 summatst summatst 463141239 Dec 2 11:19 _cb.cfs > -rw-rw-r-- 1 summatst summatst 1862427706 Dec 2 11:19 _rm.cfs > -rw-rw-r-- 1 summatst summatst 203 Dec 2 11:21 segments_3 > -rw-rw-r-- 1 summatst summatst 20 Dec 2 11:18 segments.gen > > ----------------- > > Regards, > Toke Eskildsen > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org