For example, see FreqProxTermsWriterPerField.FreqProxPostingsArray, which stores 5 parallel arrays indexed by that counter (sometimes called "term id" in the code) to hold metadata about each term until we can write it to the index.
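To make the parallel-array idea concrete, here is a minimal, self-contained sketch. It is not Lucene's code: TinyBytesHash, its fields, and countTerms are hypothetical names made up for illustration, a single growable byte[] stands in for ByteBlockPool, and the length prefix is one byte. What it does keep is the two-level layout discussed in this thread: ids[hash slot] ---> dense term id, and bytesStart[term id] ---> offset of the bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified stand-in for the two-level layout; not Lucene's implementation. */
class TinyBytesHash {
    private byte[] pool = new byte[64];     // stand-in for ByteBlockPool: [length byte][term bytes]...
    private int poolUsed = 0;
    private int[] ids = new int[16];        // hash slot -> term id, -1 when the slot is empty
    private int[] bytesStart = new int[8];  // term id -> offset of the term in the pool
    private int count = 0;                  // number of terms so far; also the next term id

    TinyBytesHash() {
        Arrays.fill(ids, -1);
    }

    /** Returns the dense id for these bytes, assigning the next id if they are new. */
    int add(byte[] term) {
        if (term.length > 127) {
            throw new IllegalArgumentException("this sketch stores the length in one byte");
        }
        int mask = ids.length - 1;
        int slot = Arrays.hashCode(term) & mask;
        while (ids[slot] != -1) {           // linear probing
            if (equalsAt(bytesStart[ids[slot]], term)) {
                return ids[slot];           // already present: same id as before
            }
            slot = (slot + 1) & mask;
        }
        int needed = poolUsed + 1 + term.length;
        if (needed > pool.length) {
            pool = Arrays.copyOf(pool, Math.max(pool.length * 2, needed));
        }
        if (count == bytesStart.length) {
            bytesStart = Arrays.copyOf(bytesStart, count * 2);
        }
        int id = count++;
        bytesStart[id] = poolUsed;          // second level: id -> offset
        pool[poolUsed++] = (byte) term.length;
        System.arraycopy(term, 0, pool, poolUsed, term.length);
        poolUsed += term.length;
        ids[slot] = id;                     // first level: hash slot -> id
        if (count * 2 > ids.length) {
            rehash();
        }
        return id;
    }

    /** Copies out the bytes stored for a term id. */
    byte[] get(int id) {
        int start = bytesStart[id];
        return Arrays.copyOfRange(pool, start + 1, start + 1 + pool[start]);
    }

    int size() {
        return count;
    }

    private boolean equalsAt(int start, byte[] term) {
        if (pool[start] != term.length) {
            return false;
        }
        for (int i = 0; i < term.length; i++) {
            if (pool[start + 1 + i] != term[i]) {
                return false;
            }
        }
        return true;
    }

    /** Doubles the slot table; ids and bytesStart entries are unchanged. */
    private void rehash() {
        int[] newIds = new int[ids.length * 2];
        Arrays.fill(newIds, -1);
        int mask = newIds.length - 1;
        for (int id = 0; id < count; id++) {
            int slot = Arrays.hashCode(get(id)) & mask;
            while (newIds[slot] != -1) {
                slot = (slot + 1) & mask;
            }
            newIds[slot] = id;
        }
        ids = newIds;
    }

    /** Parallel-array usage: freq[id] is the number of times the term with that id was seen. */
    static int[] countTerms(TinyBytesHash hash, String[] tokens) {
        int[] freq = new int[4];
        for (String token : tokens) {
            int id = hash.add(token.getBytes(StandardCharsets.UTF_8));
            if (id >= freq.length) {
                freq = Arrays.copyOf(freq, Math.max(freq.length * 2, id + 1));
            }
            freq[id]++;
        }
        return Arrays.copyOf(freq, hash.size());
    }
}
```

Because the ids are dense (0, 1, 2, ...), freq can be a plain int[] that grows along with the number of terms, and more per-term arrays can be added alongside it without touching the hash table. If ids[] pointed straight at pool offsets, those offsets would be sparse and could not be used as array indices.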
Mike McCandless

http://blog.mikemccandless.com

On Sun, May 8, 2016 at 10:20 PM, shanghaihyj <shanghai...@163.com> wrote:

> I see.
> Yes, if a logical mapping of "byte[] ---> (offset and arbitrary data)" is
> required, this indirection is necessary.
>
> Thanks.
> Yijian Huang
>
>
> At 2016-05-08 23:06:14, "Adrien Grand" <jpou...@gmail.com> wrote:
> >That would work if you are only interested in using BytesRefHash as a hash
> >set for byte[]. However, these incremental ids are useful if you want to
> >associate data with each byte[]: you can create parallel arrays and use the
> >ids returned by the BytesRefHash as indices into these arrays.
> >
> >On Sun, May 8, 2016 at 14:45, shanghaihyj <shanghai...@163.com> wrote:
> >
> >> I'm studying the BytesRefHash class, a mapping from bytes to a generated
> >> ID for the bytes.
> >>
> >> In the BytesRefHash class, there are two levels of reference:
> >> (1) ids[bytes' hash code] ---> count, where count is the self-incrementing
> >> size of this hash map.
> >> (2) bytesStart[count] ---> offset in the ByteBlockPool, where the original
> >> bytes are stored.
> >>
> >>
> >> My question is, can the above two references be collapsed into one, as
> >> follows?
> >> ids[bytes' hash code] ---> offset in the ByteBlockPool.
> >>
> >>
> >> I've searched the code and cannot work out what the benefit is of having
> >> another indirection via bytesStart.
> >>
> >>
> >> P.S. Regarding such questions about Lucene source code, should I ask on
> >> d...@lucene.apache.org instead? These questions may be too easy and thus
> >> a bother to the developers...