[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428936#comment-13428936 ]
Michael McCandless commented on LUCENE-3892:
--------------------------------------------
I just committed an optimization to BlockPF's DocsEnum.advance that
inlines the scanning step (I still have to do the same for D&PEnum and
EverythingEnum):
{noformat}
Task              QPS base  StdDev base  QPS for  StdDev for  Pct diff
IntNRQ               12.46         1.45    11.60        0.04  -16% -  5%
Wildcard             54.36         2.75    52.72        0.38   -8% -  2%
Prefix3              85.43         4.97    83.08        0.47   -8% -  3%
Fuzzy2               63.86         2.13    62.44        1.79   -8% -  4%
Respell              62.75         1.52    61.42        2.02   -7% -  3%
Fuzzy1               75.68         1.65    74.69        1.44   -5% -  2%
LowSpanNear           9.24         0.20     9.13        0.19   -5% -  3%
PKLookup            192.89         2.91   190.66        2.43   -3% -  1%
HighSpanNear          1.71         0.05     1.69        0.05   -6% -  4%
MedSpanNear           4.80         0.11     4.76        0.12   -5% -  4%
MedPhrase            12.57         0.27    12.56        0.21   -3% -  3%
MedSloppyPhrase       6.57         0.11     6.56        0.11   -3% -  3%
LowPhrase            21.55         0.35    21.55        0.28   -2% -  2%
LowSloppyPhrase       7.25         0.16     7.28        0.12   -3% -  4%
HighPhrase            1.81         0.11     1.82        0.10  -10% - 13%
HighSloppyPhrase      1.94         0.10     1.96        0.05   -6% -  9%
LowTerm             512.53         5.66   518.31        2.30    0% -  2%
MedTerm             196.09         4.68   198.76        0.30   -1% -  3%
HighTerm             35.53         0.95    36.11        0.03   -1% -  4%
OrHighMed            23.34         0.83    23.85        0.70   -4% -  9%
OrHighLow            26.91         0.98    27.53        0.82   -4% -  9%
OrHighHigh           11.27         0.41    11.53        0.34   -4% -  9%
AndHighHigh          21.24         0.05    23.79        0.13   11% - 12%
AndHighLow          553.19         8.47   621.35        4.01    9% - 14%
AndHighMed           57.45         0.13    67.78        0.70   16% - 19%
{noformat}
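
For readers unfamiliar with the change, here is a rough sketch of what "inlining the scanning step" in advance can look like. It is not the committed BlockPF code: the class BlockDocsSketch, its fields, the tiny BLOCK_SIZE, and the use of a plain sorted int[] in place of encoded blocks are all assumptions made for illustration, and skip data is left out entirely. The idea is that advanceInlined walks the decoded block buffer in a tight loop instead of paying a nextDoc() call per document, as advanceViaNextDoc does.

```java
/** Hypothetical block-based doc ID iterator; illustrative only, not Lucene's code. */
final class BlockDocsSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int BLOCK_SIZE = 4; // assumed; kept tiny so block boundaries are easy to hit

    private final int[] docs;                    // sorted doc IDs, stand-in for encoded blocks
    private final int[] buffer = new int[BLOCK_SIZE];
    private int blockStart = 0;                  // index in docs of the current block's first entry
    private int blockLen = 0;                    // number of valid entries in buffer
    private int pos = 0;                         // next unread slot in buffer

    BlockDocsSketch(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    /** "Decodes" the next block into buffer; returns false once the postings are exhausted. */
    private boolean refill() {
        blockStart += blockLen;
        blockLen = Math.max(0, Math.min(BLOCK_SIZE, docs.length - blockStart));
        pos = 0;
        if (blockLen == 0) {
            return false;
        }
        System.arraycopy(docs, blockStart, buffer, 0, blockLen);
        return true;
    }

    int nextDoc() {
        if (pos == blockLen && !refill()) {
            return NO_MORE_DOCS;
        }
        return buffer[pos++];
    }

    /** Baseline: advance implemented as repeated nextDoc() calls. */
    int advanceViaNextDoc(int target) {
        int d;
        do {
            d = nextDoc();
        } while (d < target);
        return d;
    }

    /** Inlined variant: the scan runs directly over the decoded buffer. */
    int advanceInlined(int target) {
        while (true) {
            if (pos == blockLen && !refill()) {
                return NO_MORE_DOCS;
            }
            // Tight scan over the current block, with no per-document method call.
            while (pos < blockLen) {
                int d = buffer[pos++];
                if (d >= target) {
                    return d;
                }
            }
        }
    }
}
```

Both methods return the first doc ID at or beyond target, so they are interchangeable from the caller's side; only the amount of per-document overhead differs, which is the kind of difference that would mainly show up in the And* tasks above.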
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch,
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.