I agree that comparing the BytesRef lengths in an equals() method seems counter to the purpose of having a BytesRef class.
I'd recommend taking a look at the BytesRefHash, which maps BytesRef objects to unique ids, as it 'may' be more efficient than converting to Strings.

Stuart

-----Original Message-----
From: Yann-Erwan Perio [mailto:ye.pe...@gmail.com]
Sent: Tuesday, January 21, 2014 7:33 AM
To: java-user@lucene.apache.org
Subject: BytesRef equals() method

Hello,

I have been working a bit with BytesRef recently, and I wonder whether the content of the equals() method, and more specifically the content of the bytesEquals(BytesRef other) method, is the intended one.

Here is my use case. I work with Lucene 4.6.0. During indexing, using a custom tokenizer, I have added some payloads onto some tokens. Using an extension of the DefaultSimilarity, I was then able to retrieve these payloads, passing them to a collector of mine, so as to perform aggregation calculations.

It occurred to me that the BytesRef instances retrieved were not exactly the same as the indexed ones - namely their real content was the same, but their offsets would differ. I was made aware of this because I used a Map<BytesRef, ...> in the collector, and the map would sometimes give inconsistent results.

Checking out the source code, the hashCode() method looks valid to me, but the bytesEquals() method looks strange - because prior to comparing the real value of the BytesRef, it checks their lengths - and AIUI these may differ, even though the BytesRef instances are logically equal. I am not familiar at all with the internals of Lucene (this includes the BytesRef mechanics), so I may be completely wrong here.

FWIW, I solved my problem by creating fresh BytesRef instances from the ones sent by the similarity, using the copyBytes method. I could also have used the string representation of the BytesRef, but this appears to be slower than copying the bytes, by a factor of about 2.5.

Kind regards.
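
For anyone reading this thread later, here is a minimal sketch of one way a Map keyed on byte slices can give inconsistent results, and why copying the bytes first avoids it. It rests on assumptions: ByteSlice and ByteSliceDemo are hypothetical stand-ins written for illustration, not Lucene's BytesRef, and the sketch assumes that the retrieved instance is a view over a buffer which the producer later overwrites. In this sketch, equals() compares the logical lengths and then the bytes starting at each slice's own offset, so two slices with different offsets but the same content still compare equal.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a (bytes, offset, length) reference.
// This is NOT Lucene's BytesRef; it only mimics the general idea.
final class ByteSlice {
    final byte[] bytes;
    final int offset;
    final int length;

    ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Returns a slice backed by its own private copy of the referenced bytes.
    ByteSlice deepCopy() {
        return new ByteSlice(Arrays.copyOfRange(bytes, offset, offset + length), 0, length);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ByteSlice)) return false;
        ByteSlice other = (ByteSlice) o;
        // Logical lengths must match; offsets are allowed to differ.
        if (length != other.length) return false;
        for (int i = 0; i < length; i++) {
            if (bytes[offset + i] != other.bytes[other.offset + i]) return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        // Hash only the referenced bytes, so the offset does not matter.
        int h = 1;
        for (int i = offset; i < offset + length; i++) {
            h = 31 * h + bytes[i];
        }
        return h;
    }
}

final class ByteSliceDemo {
    // Same content at different offsets: equal, with equal hash codes.
    static boolean equalDespiteOffsets() {
        byte[] shared = {9, 9, 1, 2, 3};
        ByteSlice a = new ByteSlice(shared, 2, 3);
        ByteSlice b = new ByteSlice(new byte[] {1, 2, 3}, 0, 3);
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Stores a key, lets the "producer" overwrite its buffer, then looks the
    // original content up again. Returns whether the lookup still succeeds.
    static boolean lookupSurvivesBufferReuse(boolean copyKey) {
        byte[] buffer = {1, 2, 3};
        ByteSlice view = new ByteSlice(buffer, 0, 3);
        Map<ByteSlice, Integer> counts = new HashMap<>();
        counts.put(copyKey ? view.deepCopy() : view, 1);
        buffer[0] = 42; // assumed behaviour: the producer reuses the buffer
        return counts.containsKey(new ByteSlice(new byte[] {1, 2, 3}, 0, 3));
    }

    public static void main(String[] args) {
        System.out.println("equal despite offsets: " + equalDespiteOffsets());
        System.out.println("lookup with copied key: " + lookupSurvivesBufferReuse(true));
        System.out.println("lookup with shared key: " + lookupSurvivesBufferReuse(false));
    }
}
```

If the real instances are indeed reused in this way, that would explain why copying the bytes before putting them in the Map made the results consistent, without the length check in equals() being at fault. Whether that is what happens in the poster's setup would need checking against the Lucene source.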
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org