Re: FST performance difference b/w build() & load()

Dawid Weiss Wed, 03 Jul 2013 13:13:10 -0700

An isolated test case would be great.
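
As a starting point, such a test could look roughly like the harness below. This is only a sketch: `LookupBench`, `prefixMatches` and `medianNanos` are hypothetical names, and the TreeSet-based prefix lookup is a stand-in so the code runs without Lucene on the classpath. In a real test the two functions would wrap one WFSTCompletionLookup populated via build() and one restored from the saved file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Function;

public class LookupBench {

    /** Stand-in prefix lookup (an assumption, not Lucene code). */
    public static List<String> prefixMatches(TreeSet<String> terms, String prefix, int max) {
        List<String> out = new ArrayList<>();
        // Terms sharing the prefix are contiguous in sorted order, starting at the prefix.
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix) || out.size() >= max) break;
            out.add(t);
        }
        return out;
    }

    /** Runs all queries once per round and returns the median round time in nanoseconds. */
    public static long medianNanos(Function<String, List<String>> lookup, List<String> queries,
                                   int warmupRounds, int rounds) {
        long sink = 0; // consume results so the JIT cannot drop the calls
        for (int r = 0; r < warmupRounds; r++) {
            for (String q : queries) sink += lookup.apply(q).size();
        }
        long[] times = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (String q : queries) sink += lookup.apply(q).size();
            times[r] = System.nanoTime() - start;
        }
        if (sink < 0) System.out.println(sink); // never true; keeps sink live
        Arrays.sort(times);
        return times[rounds / 2];
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("apple", "apply", "banana", "band"));
        List<String> queries = Arrays.asList("ap", "ban", "x");

        // Placeholders: swap in the built and the loaded lookup here.
        Function<String, List<String>> built = q -> prefixMatches(terms, q, 10);
        Function<String, List<String>> loaded = q -> prefixMatches(terms, q, 10);

        System.out.println("built  median ns: " + medianNanos(built, queries, 1000, 101));
        System.out.println("loaded median ns: " + medianNanos(loaded, queries, 1000, 101));
    }
}
```

Running each variant in its own fresh JVM, as was done in the original observation, keeps JIT and GC state from one case out of the other.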

Dawid


On Wed, Jul 3, 2013 at 10:09 PM, Michael Garski <[email protected]> wrote:
> This is using Solr/Lucene 4.3.0.
>
> The test was done with a fresh JVM each time. I have a search component and 
> query parser that rely heavily on WFSTCompletionLookups, and I observed the 
> difference in query performance between building a new lookup and loading a 
> previously built one at startup. When Solr is started with the lookup loaded 
> from a previously saved one, and I subsequently trigger the component to 
> build a new lookup (done periodically to update the weights), performance 
> returns to the same level as if Solr had started by building a new lookup. 
> This is what leads me to suspect a performance difference between a lookup 
> created with load() and one created with build().
>
> Admittedly this is not a good measure of the lookup performance between the 
> two cases, and I will create isolated tests of performance. I just wanted to 
> find out if this is something I should expect... sounds like it is not 
> expected and I will dig in further.
>
> Thanks,
>
> Michael
>
>
> On Jul 3, 2013, at 12:37 PM, Michael McCandless <[email protected]> 
> wrote:
>
>> Which Lucene version?
>>
>> A "just built" FST stores its bytes in smallish (64 KB I think) pages,
>> while a loaded-from-disk FST uses much larger pages (1 GB I think).
>>
>> But I would expect the loaded FST to be faster, not the other way around.
>>
>> How much of a speed difference are you seeing?  And are you restarting
>> the JVM in between the two runs?  (I.e., a fresh JVM for the "just built"
>> case, and a fresh JVM for the "loaded from disk" case.)
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Wed, Jul 3, 2013 at 3:31 PM, Michael Garski <[email protected]> wrote:
>>> Hello -
>>>
>>> I've observed a noticeable performance difference in looking up completions 
>>> in a WFSTCompletionLookup when it is created using the build() method with 
>>> a TermFreqIterator versus one that is created by loading a previously saved 
>>> instance, with the instance created from the build method being faster. The 
>>> saved lookup is ~275MB on disk. I have not dug in to determine why this is, 
>>> but is this to be expected?
>>>
>>> Thanks,
>>>
>>> Michael
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
