You can't make documents more likely to end up in the same segment. However, I think you could use index sorting to keep documents from the same cluster close to each other within each segment.
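To make the locality argument concrete, here is a minimal sketch in plain Java rather than actual Lucene code. It models a segment as a list of cluster ids in doc-id order and counts how many contiguous doc-id ranges have to be read to fetch one cluster, before and after sorting. The class and method names (`ClusterLocality`, `sortByCluster`, `countRuns`) are made up for illustration. In Lucene itself the sort would be configured on the writer with `IndexWriterConfig.setIndexSort(Sort)`, using a sort on a doc-values field that holds the cluster value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of one segment: each entry is the cluster id of the doc at that position. */
final class ClusterLocality {

    /** Returns a copy of the segment with docs ordered by cluster id, as an index sort on a cluster field would order them. */
    static List<String> sortByCluster(List<String> segment) {
        List<String> sorted = new ArrayList<>(segment);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    /** Counts the contiguous doc-id ranges that must be read to fetch every doc of the given cluster. */
    static int countRuns(List<String> segment, String cluster) {
        int runs = 0;
        boolean inRun = false;
        for (String c : segment) {
            boolean match = c.equals(cluster);
            if (match && !inRun) {
                runs++; // a new contiguous range of matching docs starts here
            }
            inRun = match;
        }
        return runs;
    }
}
```

With an unsorted segment such as `[a, b, a, c, b, a]`, cluster `a` is spread over three separate ranges. After sorting, each cluster occupies a single contiguous range, which is the per-segment locality that index sorting gives you even though it does not control which segment a document lands in.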

On Thu, May 18, 2017 at 11:04, Tommaso Teofili <[email protected]> wrote:

> Hi all,
>
> I am working on a use case where my Lucene index stores documents composed
> by (relatively short) text and binary values, at retrieval time I need to
> retrieve documents that belong to a set of cluster values (e.g. facets).
> In that context I was wondering if and how it'd be possible to make it
> more probable that documents (and associated docValues) that belong to a
> same cluster fall into the same segment.
> That would allow to have a higher storage locality [1] and presumably a
> better performance (given docs belonging to the same clusters get retrieved
> together most of the times in my use case).
> At first I had looked into extending the DV format but that's segment
> agnostic therefore I am thinking of coming up with a merge policy which
> produces segments whose docs belong to the same cluster with a high
> probability.
> Any other ideas / suggestions ?
>
> Regards,
> Tommaso
>
> [1] : https://en.wikipedia.org/wiki/Locality_of_reference
