Note the comments in the source:

  /** Length of used bytes. */
  public int length;

length is not the size of the internal buffer. It is the number of used bytes, i.e. the length of the "logical" value, as you call it.
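
To make that concrete, here is a minimal sketch. It does not use Lucene itself: Slice is a hypothetical stand-in with the same three fields (bytes, offset, length), and its equals() and hashCode() are my own simplified versions that only look at the used range, which is the behaviour described above. Assuming that behaviour, two values with different offsets and different backing buffers still compare equal, so checking length first is safe.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class SliceDemo {

    /** Hypothetical stand-in for a BytesRef-like value; NOT Lucene's class. */
    static final class Slice {
        final byte[] bytes;  // backing buffer, may be larger than the value
        final int offset;    // index of the first used byte
        final int length;    // number of used bytes (the logical length)

        Slice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Slice)) {
                return false;
            }
            Slice other = (Slice) o;
            // Logical lengths must match; buffer sizes and offsets may differ.
            if (length != other.length) {
                return false;
            }
            for (int i = 0; i < length; i++) {
                if (bytes[offset + i] != other.bytes[other.offset + i]) {
                    return false;
                }
            }
            return true;
        }

        @Override
        public int hashCode() {
            // Hash only the used range so that equal slices hash alike.
            int h = 0;
            for (int i = offset; i < offset + length; i++) {
                h = 31 * h + bytes[i];
            }
            return h;
        }
    }

    public static void main(String[] args) {
        byte[] big = "xxabcyy".getBytes(StandardCharsets.US_ASCII);
        byte[] exact = "abc".getBytes(StandardCharsets.US_ASCII);

        Slice a = new Slice(big, 2, 3);    // "abc" inside a larger buffer
        Slice b = new Slice(exact, 0, 3);  // "abc" filling its buffer exactly

        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true

        Map<Slice, String> map = new HashMap<>();
        map.put(a, "found");
        System.out.println(map.get(b));                    // found
    }
}
```

If a map keyed on such values still gives inconsistent results, one possible cause (an assumption on my part, not something shown in this thread) is that the key's bytes were changed after insertion, for example when a buffer is reused. That would fit with copying the bytes fixing the problem.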

-Mike

On 1/21/2014 10:32 AM, Yann-Erwan Perio wrote:
Hello,

I have been working a bit with BytesRef recently, and I wonder whether
the content of the equals() method, and more specifically the content
of the bytesEquals(BytesRef other) method, is the intended one.

Here is my use case. I work with Lucene 4.6.0. During indexing, using
a custom tokenizer, I have added some payloads onto some tokens. Using
an extension of the Default Similarity, I was then able to retrieve
these payloads, passing them to a collector of mine, so as to perform
aggregation calculations. I noticed that the BytesRef instances
retrieved were not exactly the same as the indexed ones: their real
content was the same, but their offsets would differ.

I was made aware of this because I used a Map<BytesRef, ...> in the
collector, and the map would sometimes give inconsistent results.
Checking out the source code, the hashCode() method looks valid to me,
but the bytesEquals() method looks strange - because prior to
comparing the real value of the BytesRef, it checks their lengths -
and AIUI these may differ, even though the BytesRef are logically equal.

I am not familiar at all with the internals of Lucene (this includes
the BytesRef mechanics), so I may be completely wrong here. FWIW, I
solved my problem by creating fresh BytesRef from the ones sent by the
similarity, using the copyBytes method. I could also have used the
string representation of the BytesRef, but this appears to be slower
than copying the bytes, by a factor of about 2.5.

Kind regards.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


