Re: Memory Usage

Marvin Humphrey Tue, 15 Nov 2005 21:15:43 -0800

Good stuff, Daniel...

Thanks for taking the time to tabulate the results and present them. If your results hold, they may have a significant impact on my application. I'm working on a Perl/XS port, and I think a lot of people who want to run it won't be running mod_perl, so startup times are quite important to me. I may end up setting the default IndexInterval considerably higher than 128 as a result of this discussion.

The formatting of the results turned up a little screwy in my email reader, so here's a reformatted version...

Timings for a simple TermQuery on the term "one" (docFreq = 22):

   skip    time range for query (ms)    approx mem usage of JVM (MB)
     1      28 ~  30                     49.2
     2      28 ~  30
     4      28 ~  30
     8      29 ~  31
    16      29 ~  32                     15.9 (!!)
    32      29 ~  33
    64      38 ~  42
   128      59 ~  61
   256      99 ~ 102                     14.1

Timings for a simple TermQuery on the term "test" (docFreq = 31,356):

   skip    time range for query (ms)
     1       6.8 ~  7.6
    16       9.7 ~ 10.2
   256      69   ~ 72

So, more frequent terms get a larger penalty due to this modification,
but the time was relatively fast to start with. Rarer terms get less of
a penalty, perhaps because they already take so much longer to find.

This doesn't sound right to me. The time to locate the term via the TermInfosReader shouldn't have anything to do with the doc_freq, since that's kept as a single number in .tis and .tii. Within the term dictionary, all terms are more or less created equal.
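
To illustrate the point, here is a rough sketch of an interval-indexed term lookup. It is a simplified model and not Lucene's actual TermInfosReader code: the function names and the in-memory lists standing in for .tii and .tis are assumptions made up for the example. The cost of a lookup depends only on the interval (a binary search over the sampled terms, then a scan of at most `interval` entries), never on how many documents contain the term.

```python
import bisect


def build_index(sorted_terms, interval):
    """Keep every `interval`-th term in memory (a stand-in for .tii)."""
    return sorted_terms[::interval]


def lookup(sorted_terms, index_terms, interval, target):
    """Return (position, entries_scanned); position is None if absent.

    `sorted_terms` stands in for the full on-disk dictionary (.tis).
    """
    # Binary search the sampled index for the last indexed term <= target.
    slot = bisect.bisect_right(index_terms, target) - 1
    if slot < 0:
        return None, 0  # target sorts before every term in the dictionary

    start = slot * interval
    end = min(start + interval, len(sorted_terms))
    scanned = 0
    # Linear scan of the full dictionary: at most `interval` entries.
    for pos in range(start, end):
        scanned += 1
        if sorted_terms[pos] == target:
            return pos, scanned
        if sorted_terms[pos] > target:
            break  # passed the spot where target would be
    return None, scanned


if __name__ == "__main__":
    terms = [f"term{i:05d}" for i in range(1000)]
    for interval in (1, 16, 256):
        index = build_index(terms, interval)
        pos, scanned = lookup(terms, index, interval, "term00300")
        print(interval, len(index), pos, scanned)
```

A larger interval shrinks the in-memory index but lengthens the scan, which is the trade-off the tables above are measuring. Nothing in this model knows about doc_freq, so it can't explain a per-term difference in penalty.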

I'm only passingly familiar with the org.apache.lucene.search package, so I'm not sure what could account for this; I would normally expect a more common term to take longer, as there are more docs to score. Anybody got an explanation handy?


Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

