Re: Mapping doc values back to doc ID (in decent time)

2015-08-09 Thread András Péteri
If I understand it correctly, the Zoie library [1][2] implements the "sledgehammer" approach by collecting docValues for all documents when a segment reader is opened. If you have some RAM to throw at the problem, this could indeed bring you an acceptable level of performance. [1] http://senseidb.
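The "sledgehammer" idea can be sketched without any Lucene or Zoie code. This is only an illustration under stated assumptions, not how Zoie actually does it: it assumes the per-document ID values for a segment have already been read into a `long[]` indexed by segment-local doc ID, and the class name `IdToDocMap` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Lucene or Zoie.
final class IdToDocMap {
    private final Map<Long, Integer> docById = new HashMap<>();

    // idPerDoc[docId] is the application ID stored for that document
    // (assumed to have been loaded from doc values when the segment opened).
    IdToDocMap(long[] idPerDoc) {
        for (int docId = 0; docId < idPerDoc.length; docId++) {
            docById.put(idPerDoc[docId], docId);
        }
    }

    // Returns the segment-local doc ID, or -1 if the ID is not in this segment.
    int docFor(long id) {
        Integer doc = docById.get(id);
        return doc == null ? -1 : doc;
    }
}
```

The map is built once per segment, so the cost is paid in RAM and in reader-open time rather than on every lookup.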

Re: Mapping doc values back to doc ID (in decent time)

2015-08-09 Thread Trejkaz
On Fri, Aug 7, 2015 at 5:34 PM, Adrien Grand wrote:
> Does your application actually iterate in order over dense ids, or is
> it just for benchmarking purposes? Because if it does, you probably
> don't actually need seeking, you could just see what the current ID in
> the terms enum is.

Both dens
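The "no seeking" suggestion quoted above amounts to a forward-only merge walk. A rough sketch, with the terms enum modelled as a plain ascending `long[]` rather than the real Lucene API (the class `SequentialLookup` and its parameters are hypothetical):

```java
// Hypothetical illustration; the arrays stand in for a terms enum and its postings.
final class SequentialLookup {
    // sortedIds:  IDs in ascending order, as an ordered enumeration would yield them.
    // docForTerm: docForTerm[i] is the doc ID holding sortedIds[i].
    // wanted:     ascending IDs to resolve.
    // Returns one doc ID per wanted ID, or -1 where the ID is absent.
    static int[] resolve(long[] sortedIds, int[] docForTerm, long[] wanted) {
        int[] result = new int[wanted.length];
        int cursor = 0; // only ever moves forward, never seeks back
        for (int i = 0; i < wanted.length; i++) {
            while (cursor < sortedIds.length && sortedIds[cursor] < wanted[i]) {
                cursor++;
            }
            boolean found = cursor < sortedIds.length && sortedIds[cursor] == wanted[i];
            result[i] = found ? docForTerm[cursor] : -1;
        }
        return result;
    }
}
```

This only helps when the IDs really are requested in ascending order; for random access it degenerates into a scan.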

Re: Mapping doc values back to doc ID (in decent time)

2015-08-07 Thread Adrien Grand
On Fri, Aug 7, 2015 at 8:30 AM, Trejkaz wrote:
> for (int ourId = 0; ourId < count; ourId++)
> {
>     builder.clear();
>     NumericUtils.longToPrefixCoded(ourId, 0, builder);
>     termsEnum.seekExact(builder.get());
>     postingsEnum = termsEnum.postings(null, postingsE

Mapping doc values back to doc ID (in decent time)

2015-08-06 Thread Trejkaz
Hi all. It's that time again. I'm trying to kill off our long-standing reliance on stable doc IDs. To that end, I am adding an additional field which contains the ID. But we use these IDs a lot and for all kinds of purposes, and in some of these purposes, many lookups are done at once, so perform