[
https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430423#comment-13430423
]
Han Jiang commented on LUCENE-3892:
-----------------------------------
Thanks Adrien! Your code is really clean!
At first glance, I think we should still support the all-values-the-same case.
For some applications (like an index with payloads), that might be helpful.
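To make the case concrete, here is a minimal sketch of what an
all-values-the-same shortcut could look like. It is not the actual BlockPF
code: the names encode_block and decode_block are hypothetical, and the payload
is left as a plain list instead of being bit-packed.

```python
def encode_block(values):
    """Encode a block of non-negative ints as (bits_per_value, payload).

    Hypothetical helper for illustration; real bit-packing is omitted.
    """
    if not values:
        raise ValueError("block must not be empty")
    first = values[0]
    if all(v == first for v in values):
        # All values equal: use 0 bits per value and store the value once.
        return 0, [first]
    # Otherwise every value needs as many bits as the largest one.
    bits = max(v.bit_length() for v in values)
    return bits, list(values)


def decode_block(bits, payload, block_size):
    """Inverse of encode_block for a block of block_size values."""
    if bits == 0:
        # Special case: expand the single stored value to a full block.
        return [payload[0]] * block_size
    return list(payload)
```

With a shortcut like this, a block of identical values (for example, fixed-size
payload lengths) costs one stored value rather than block_size packed values.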
Also, I'm a little confused about your performance test. Did you use BlockPF
from before r1370179 as the baseline, and compare it with your latest commit?
Here, I tested these two PFs at the latest revision (r1370345).
{noformat}
Task              QPS base  StdDev base  QPS comp  StdDev comp  Pct diff
AndHighHigh         124.53         9.36    100.46         3.31  -27% -  -9%
AndHighLow         2141.08        63.93   1922.73        36.32  -14% -  -5%
AndHighMed          281.48        36.49    218.68        13.10  -35% -  -5%
Fuzzy1               84.33         2.56     83.94         1.67   -5% -   4%
Fuzzy2               30.49         1.13     30.48         0.71   -5% -   6%
HighPhrase            9.08         0.28      7.56         0.20  -21% - -11%
HighSloppyPhrase      5.46         0.21      4.88         0.23  -17% -  -2%
HighSpanNear         10.12         0.21      9.21         0.30  -13% -  -3%
HighTerm            176.52         6.13    146.13         5.43  -22% - -11%
IntNRQ               59.56         1.98     51.05         1.33  -19% -  -9%
LowPhrase            40.02         1.03     32.75         0.37  -21% - -15%
LowSloppyPhrase      59.59         2.85     51.49         1.33  -19% -  -6%
LowSpanNear          73.86         3.17     61.98         1.45  -21% - -10%
LowTerm            1755.38        15.56   1622.61        26.87   -9% -  -5%
MedPhrase            25.99         0.47     21.01         0.17  -21% - -16%
MedSloppyPhrase      30.52         0.89     24.77         0.55  -22% - -14%
MedSpanNear          22.26         0.43     18.73         0.47  -19% - -12%
MedTerm             651.90        18.97    573.34        19.25  -17% -  -6%
OrHighHigh           26.75         0.33     23.53         0.50  -14% -  -9%
OrHighLow           151.69         2.13    134.17         3.19  -14% -  -8%
OrHighMed           102.48         1.48     90.73         2.01  -14% -  -8%
PKLookup            216.59         5.70    215.99         2.99   -4% -   3%
Prefix3             166.00         0.78    145.25         1.29  -13% - -11%
Respell              82.01         3.01     82.80         1.66   -4% -   6%
Wildcard            151.66         2.22    141.14         1.57   -9% -  -4%
{noformat}
Strange that it isn't working well on my computer. The results are similar when
I change MMapDirectory to NIOFSDirectory.
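For anyone reading the "Pct diff" column: the ranges in the table appear
consistent with a worst-case/best-case comparison that shifts each QPS by one
standard deviation and truncates the result to a whole percent. The sketch
below is an assumption reverse-engineered from the numbers above, not taken
from the benchmark tool itself, and the function name pct_diff_range is
hypothetical.

```python
import math


def pct_diff_range(qps_base, sd_base, qps_comp, sd_comp):
    """Return (low, high) percentage change of comp relative to base.

    Assumed formula, inferred from the table in this comment:
      low  = (comp - sd_comp) / (base + sd_base) - 1   (worst case for comp)
      high = (comp + sd_comp) / (base - sd_base) - 1   (best case for comp)
    Both ends are truncated toward zero to a whole percent.
    Requires sd_base < qps_base so the best-case denominator stays positive.
    """
    low = (qps_comp - sd_comp) / (qps_base + sd_base) - 1.0
    high = (qps_comp + sd_comp) / (qps_base - sd_base) - 1.0
    return math.trunc(low * 100), math.trunc(high * 100)


# AndHighHigh row: base 124.53 +/- 9.36, comp 100.46 +/- 3.31
print(pct_diff_range(124.53, 9.36, 100.46, 3.31))  # (-27, -9)
```

Under this reading, a range that lies entirely below zero (as in most rows
here) means comp is slower than base even after allowing for the measured
noise, while a range that straddles zero (Fuzzy1, Fuzzy2, PKLookup, Respell)
is inconclusive.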
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch,
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.