[
https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430373#comment-13430373
]
Adrien Grand commented on LUCENE-3892:
--------------------------------------
I backported Mike's changes to the {{BlockPacked}} codec and tried to
understand why it was slower than {{Block}}.
The bottleneck of the decoding step seemed to be the use of {{java.nio.*Buffer}}
({{ByteBuffer.asLongBuffer}} and {{ByteBuffer.getLong}} in particular are _very_
slow), so I switched back to decoding from {{long[]}} instead of {{LongBuffer}}
and added direct decoding from {{byte[]}}, which avoids converting the bytes to
longs before decoding.
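To illustrate what "direct decoding from byte[]" can look like, here is a minimal sketch. It is not the code from the patch: the class name {{PackedBytesDecoder}}, the method {{decode}}, and the MSB-first bit layout are all assumptions made for the example. It reads fixed-width values straight out of a {{byte[]}} with shifts and masks, without wrapping the array in a {{ByteBuffer}} or building an intermediate {{long[]}}.

```java
/**
 * Hypothetical illustration only: decodes fixed-width values that are
 * packed back to back, most significant bit first, in a byte[].
 */
final class PackedBytesDecoder {
    private PackedBytesDecoder() {}

    static int[] decode(byte[] blocks, int bitsPerValue, int count) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 32]");
        }
        if ((long) bitsPerValue * count > (long) blocks.length * 8) {
            throw new IllegalArgumentException("not enough input bytes");
        }
        final long mask = (1L << bitsPerValue) - 1;
        final int[] values = new int[count];
        long acc = 0;       // bit accumulator; only its low bitsInAcc bits are pending
        int bitsInAcc = 0;  // number of not-yet-consumed bits in acc
        int pos = 0;        // next byte to read from blocks
        for (int i = 0; i < count; i++) {
            // Refill one byte at a time until a whole value is available.
            while (bitsInAcc < bitsPerValue) {
                acc = (acc << 8) | (blocks[pos++] & 0xFFL);
                bitsInAcc += 8;
            }
            bitsInAcc -= bitsPerValue;
            values[i] = (int) ((acc >>> bitsInAcc) & mask);
        }
        return values;
    }
}
```

For example, the two bytes {{0xB1 0x80}} hold the bit string 101 100 011 000 ..., so {{PackedBytesDecoder.decode(new byte[] {(byte) 0xB1, (byte) 0x80}, 3, 4)}} yields 5, 4, 3, 0. A real decoder would likely be specialized and unrolled per bit width; this generic loop only shows the idea of skipping the bytes-to-longs conversion.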
Tests passed with {{-Dtests.postingsformat=BlockPacked}}. Here are the benchmark
results (unfortunately, the run started before Mike committed r1370179):
{noformat}
Task              QPS 3892  StdDev 3892  QPS 3892-packed  StdDev 3892-packed     Pct diff
PKLookup            259.41         9.06           255.77                8.89    -8% -  5%
AndHighLow         1656.30        50.44          1653.85               55.05    -6% -  6%
AndHighHigh          82.90         1.82            83.47                2.52    -4% -  6%
AndHighMed          274.76        11.11           278.51               13.42    -7% - 10%
Prefix3             285.41         4.82           289.60                6.31    -2% -  5%
HighTerm            230.78        14.33           235.16               20.61   -12% - 18%
IntNRQ               55.91         1.03            57.13                2.73    -4% -  9%
LowTerm            1720.10        47.06          1759.16               55.47    -3% -  8%
Wildcard            290.54         3.82           297.39                5.42     0% -  5%
MedTerm             733.01        35.38           750.46               50.37    -8% - 14%
HighSpanNear          6.93         0.23             7.12                0.39    -6% - 11%
HighPhrase            6.46         0.22             6.65                0.46    -7% - 14%
Respell              96.11         2.84            99.00                3.98    -3% - 10%
OrHighHigh           38.07         2.53            39.23                3.06   -10% - 19%
Fuzzy2               50.29         1.70            51.87                2.25    -4% - 11%
MedPhrase            26.20         0.94            27.03                1.07    -4% - 11%
OrHighMed           138.83         7.76           143.54                9.79    -8% - 16%
Fuzzy1              100.58         2.15           104.21                3.99    -2% -  9%
HighSloppyPhrase      5.26         0.11             5.45                0.24    -3% - 10%
OrHighLow            78.43         5.55            81.80                6.89   -10% - 21%
MedSpanNear          32.75         1.13            34.28                1.73    -3% - 13%
LowPhrase            90.27         3.20            95.06                3.58    -2% - 13%
LowSpanNear          46.40         1.95            48.89                2.40    -3% - 15%
MedSloppyPhrase      36.29         1.00            38.59                1.46     0% - 13%
LowSloppyPhrase      37.41         1.11            40.48                1.39     1% - 15%
{noformat}
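The comment does not say how the Pct diff range is computed. The numbers in the table are consistent with a worst case / best case comparison that moves each QPS by one standard deviation and truncates the result to an integer. The sketch below reproduces that reading; the class name {{PctDiffRange}} and the formula itself are inferred from the table, not taken from the benchmark tool.

```java
/**
 * Hypothetical helper: reproduces the "Pct diff" range from the four
 * numeric columns, under the assumption described above.
 */
final class PctDiffRange {
    private PctDiffRange() {}

    /** Returns {worst, best} as integer percentages, truncated toward zero. */
    static int[] range(double baseQps, double baseStdDev,
                       double cmpQps, double cmpStdDev) {
        double baseLow = baseQps - baseStdDev;
        double baseHigh = baseQps + baseStdDev;
        // Worst case: candidate one stddev slower, baseline one stddev faster.
        int worst = (int) (100.0 * ((cmpQps - cmpStdDev) - baseHigh) / baseHigh);
        // Best case: candidate one stddev faster, baseline one stddev slower.
        int best = (int) (100.0 * ((cmpQps + cmpStdDev) - baseLow) / baseLow);
        return new int[] {worst, best};
    }
}
```

With the PKLookup row, {{PctDiffRange.range(259.41, 9.06, 255.77, 8.89)}} gives -8 and 5, matching the table. Read this way, a range that straddles 0% means the difference is within the run-to-run noise.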
Mike, Billy, could you check that {{BlockPacked}} is at least as fast as
{{Block}} on your computers too?
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch,
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.