[ 
https://issues.apache.org/jira/browse/LUCENE-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5731:
--------------------------------

    Attachment: LUCENE-5731.patch

Attached is a patch:
* Added new DirectWriter and DirectReader. They support more than 2B values 
and drop concepts like 'acceptableOverhead'; instead the design is simple and 
ensures every bpv is fast.
* Added a RandomAccessInput API (default implementation is seek+read), with an 
optimized implementation for mmap.
* Added 3 bytes of padding to the end of every DirectWriter stream, so decoding 
any value is a single I/O operation.
* DirectReader enforces that the padding is present.
* Added a new Lucene49DocValuesFormat that uses all of the above.
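
To illustrate why the trailing padding lets every value be decoded with one 
read, here is a minimal sketch. It is not the code in the patch: 
RandomAccessSketch and DirectPackedSketch are hypothetical names, the 
big-endian bit layout is an assumption, and the sketch only handles 1 to 24 
bits per value. A value can start up to 7 bits into a byte, so a 4-byte read 
at its starting byte always covers it, and the 3 padding bytes keep that read 
in bounds for the last value.

```java
import java.nio.ByteBuffer;

/** Hypothetical positional-read API for this sketch; not the interface from the patch. */
interface RandomAccessSketch {
  /** Reads a big-endian int starting at the given absolute byte position. */
  int readInt(long pos);
}

/** Illustrative only: fixed-width values packed MSB-first, followed by 3 padding bytes. */
final class DirectPackedSketch {
  static final int PADDING_BYTES = 3;

  /** Packs each value into bitsPerValue bits (1..24 here) and appends the padding. */
  static byte[] write(long[] values, int bitsPerValue) {
    if (bitsPerValue < 1 || bitsPerValue > 24) {
      throw new IllegalArgumentException("sketch supports 1..24 bits per value");
    }
    long totalBits = (long) values.length * bitsPerValue;
    int dataBytes = (int) ((totalBits + 7) >>> 3);
    byte[] out = new byte[dataBytes + PADDING_BYTES];
    long bitPos = 0;
    for (long v : values) {
      if (v < 0 || v >= (1L << bitsPerValue)) {
        throw new IllegalArgumentException("value does not fit: " + v);
      }
      // Copy the value bit by bit, most significant bit first.
      for (int b = bitsPerValue - 1; b >= 0; b--, bitPos++) {
        if (((v >>> b) & 1L) != 0) {
          out[(int) (bitPos >>> 3)] |= (byte) (0x80 >>> (int) (bitPos & 7));
        }
      }
    }
    return out;
  }

  /** Wraps a byte array; an mmap-backed buffer could serve absolute reads the same way. */
  static RandomAccessSketch wrap(byte[] bytes) {
    ByteBuffer buffer = ByteBuffer.wrap(bytes); // big-endian by default
    return pos -> buffer.getInt((int) pos);
  }

  /** Decodes the value at index with exactly one positional read. */
  static long get(RandomAccessSketch in, int bitsPerValue, long index) {
    long bitPos = index * bitsPerValue;
    int word = in.readInt(bitPos >>> 3); // the padding keeps this read in bounds
    int shift = 32 - (int) (bitPos & 7) - bitsPerValue; // at least 1 for bpv <= 24
    return (word >>> shift) & ((1L << bitsPerValue) - 1);
  }
}
```

Without the padding, the reader would need a bounds check or a byte-by-byte 
fallback near the end of the stream; with it, the same unconditional read 
works for every index.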

Across every bitsPerValue I see consistent performance gains, usually 50-75% 
compared to trunk today.

> split direct packed ints from in-ram ones
> -----------------------------------------
>
>                 Key: LUCENE-5731
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5731
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-5731.patch
>
>
> Currently there is an oversharing problem in packed ints that imposes too many 
> requirements on any attempt to improve it:
> * every packed ints implementation must be loadable directly or in RAM, or 
> usable for iteration.
> * things like file pointers are expected to be adjusted (this is especially 
> stupid) in all cases
> * lots of unnecessary abstractions
> * versioning etc. is complex
> None of this flexibility is needed or buys us anything, and it prevents 
> performance improvements (e.g. I just want to add 3 bytes at the end of 
> on-disk streams to reduce the number of ByteBuffer calls, and that's seriously 
> impossible with the current situation).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
