On Tue, Apr 9, 2013 at 9:11 AM, Adrien Grand <jpou...@gmail.com> wrote:

> The default codec stores numeric doc values in blocks of 4096 values,
> each with its own number of bits per value. If you end up having
> most of these blocks empty, doc values will require little space, but
> in a worst-case scenario where each block contains a single value, it
> is true that memory and disk usage will be very inefficient.
>

Also, the default codec has a performance hack (depending on
acceptableOverHead) that optimizes the single-byte case (e.g. norms or
another SmallFloat scoring factor). In this case it doesn't use
BlockPackedWriter at all.

That's why I recommended the diskdv codec instead... the concepts are the same,
but it's not yet "optimized", so it's easier to understand what's going on :)
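
To make the block behavior concrete, here is a rough Python sketch of the space estimate. It is only a simplified model, not Lucene's actual on-disk format: it assumes each block of 4096 values is packed with a fixed width equal to the bits needed for (max - min) in that block, that documents without a value are stored as 0, and it ignores per-block headers. The names estimate_packed_bits and bits_required are made up for this illustration.

```python
BLOCK_SIZE = 4096  # block size quoted earlier in the thread


def bits_required(span):
    """Bits needed to represent any integer in [0, span]."""
    return span.bit_length()


def estimate_packed_bits(values, block_size=BLOCK_SIZE):
    """Estimate payload bits when every block is packed independently.

    Assumption: each block stores (value - block minimum) at a fixed
    width chosen for that block; header overhead is ignored.
    """
    total = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        width = bits_required(max(block) - min(block))
        total += width * len(block)
    return total


num_docs = 4 * BLOCK_SIZE

# Four values that all fall into the first block; the other blocks stay empty.
clustered = [0] * num_docs
for i in range(4):
    clustered[i] = 1000

# The same four values, but one per block (the worst case described above).
scattered = [0] * num_docs
for b in range(4):
    scattered[b * BLOCK_SIZE] = 1000

print(estimate_packed_bits(clustered))  # 40960
print(estimate_packed_bits(scattered))  # 163840
```

Under these assumptions the same four values cost four times as much when they are spread one per block, because every block then has to pay 10 bits for each of its 4096 slots.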
