Typically the vast majority of terms will in fact have docFreq < 128,
but a few very high-frequency terms may have many 128-doc blocks, and it's
those "costly" terms for which you want decoding to be fast.
We encode that last partial block as vInt because we don't want to
fill 0s into the unoccupied part of the block.
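
To make the split concrete, here is a minimal illustrative sketch (not
Lucene's actual code; it only assumes the Lucene41 block size of 128 docs):

    // Purely illustrative: how a term's postings are split between
    // full 128-doc FOR blocks and the vInt-encoded tail.
    static void describePostings(int docFreq) {
        final int blockSize = 128;                 // Lucene41 BLOCK_SIZE
        int fullBlocks = docFreq / blockSize;      // bulk FOR-decoded
        int vIntTail = docFreq % blockSize;        // decoded doc-by-doc as vInt
        System.out.println("docFreq=" + docFreq + " -> " + fullBlocks
            + " FOR block(s) + " + vIntTail + " vInt-encoded doc(s)");
    }

So a term with docFreq = 300 gets two FOR blocks plus a 44-doc vInt tail,
while a term with docFreq = 90 is encoded entirely as vInt.
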
We have identified the reason for the slowness...
Lucene41PostingsWriter encodes a postings list as VInt when the block size
is < 128 and takes a FOR coding approach otherwise...
Most of our terms fall under VInt, and that's why decompression during
merge reads was eating up a lot of CPU cycles...
We switched ...
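
For contrast, this is roughly what per-document vInt delta decoding looks
like (a simplified sketch assuming the standard Lucene vInt layout of 7
data bits per byte with the high bit as a continuation flag; it is not the
actual Lucene41 reader code):

    // Each posting costs a data-dependent loop and a branch per byte;
    // this is the per-document overhead that bulk FOR block decoding avoids.
    static int[] decodeVIntDeltas(byte[] buf, int count) {
        int[] docIDs = new int[count];
        int pos = 0, doc = 0;
        for (int i = 0; i < count; i++) {
            int b = buf[pos++] & 0xFF;
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = buf[pos++] & 0xFF;
                delta |= (b & 0x7F) << shift;
            }
            doc += delta;        // doc IDs are stored as deltas
            docIDs[i] = doc;
        }
        return docIDs;
    }

A full 128-doc FOR block, by comparison, is unpacked in one tight
fixed-width loop, so the per-document branching above disappears.
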
Hi,
I am finding that Lucene is slowing down a lot when bigger and bigger
doc/pos files are merged... While that is normally expected, the worrying
part is that all my data is in RAM. The version is 4.6.1.
Below are some sample statistics taken after instrumenting the
SortingAtomicReader code, since we use a SortingMergePolicy...
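
For reference, this is roughly how the sorted merging is set up (a sketch
assuming the 4.6-era org.apache.lucene.index.sorter.SortingMergePolicy API;
the "timestamp" sort field is just a placeholder for whatever field the
index is actually sorted on):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.TieredMergePolicy;
    import org.apache.lucene.index.sorter.SortingMergePolicy;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.util.Version;

    // Sketch: wrap the regular merge policy so merged segments come out
    // sorted by the (placeholder) "timestamp" field.
    static IndexWriterConfig sortedMergeConfig() {
        Sort sort = new Sort(new SortField("timestamp", SortField.Type.LONG));
        IndexWriterConfig iwc = new IndexWriterConfig(
            Version.LUCENE_46, new StandardAnalyzer(Version.LUCENE_46));
        iwc.setMergePolicy(new SortingMergePolicy(new TieredMergePolicy(), sort));
        return iwc;
    }
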