zhaih commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1412533096
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -32,17 +32,20 @@
* @lucene.internal
*/
public class NeighborArray {
+ private static final int INITIAL_CAPACITY = 10;
private final boolean scoresDescOrder;
+ private final int maxSize;
private int size;
float[] score;
int[] node;
private int sortedNodeSize;
public final ReadWriteLock rwlock = new ReentrantReadWriteLock(true);
public NeighborArray(int maxSize, boolean descOrder) {
- node = new int[maxSize];
- score = new float[maxSize];
+ node = new int[INITIAL_CAPACITY];
Review Comment:
I think it's just due to consuming more memory -> more GC cycles
needed -> higher latency.
I rechecked the code: when we insert a node, we first collect
`beamWidth` candidates, and then try to add those candidates to the
`NeighborArray` while keeping the neighbors diverse. So I think:
* If `beamWidth > maxSize`, we can just initialize this with `maxSize` and
be done: in a larger graph the first fill will likely fill the
`NeighborArray` completely, so there is no point in starting with a smaller
size and resizing.
* If `beamWidth < maxSize`, we can initialize this with `beamWidth`, so
that the first fill likely leaves the array nearly full?
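
To make the suggestion concrete, here is a minimal sketch of the sizing rule,
not the actual Lucene code. `NeighborArraySketch`, the `beamWidth` constructor
parameter, `initialCapacity`, and the doubling growth policy are all
assumptions for illustration (the real constructor only takes `maxSize` and
`descOrder`), and both sizes are assumed to be positive:

```java
import java.util.Arrays;

/** Hypothetical stand-in for NeighborArray; only the sizing logic is shown. */
final class NeighborArraySketch {
  private final int maxSize;
  private int size;
  private int[] node;
  private float[] score;

  // beamWidth is an assumed extra parameter; the real constructor does not take it.
  NeighborArraySketch(int maxSize, int beamWidth) {
    this.maxSize = maxSize;
    int initialCapacity = initialCapacity(maxSize, beamWidth);
    node = new int[initialCapacity];
    score = new float[initialCapacity];
  }

  /** Both cases from the comment collapse into a single min(). */
  static int initialCapacity(int maxSize, int beamWidth) {
    return Math.min(maxSize, beamWidth);
  }

  void add(int newNode, float newScore) {
    if (size == maxSize) {
      throw new IllegalStateException("NeighborArray is full");
    }
    if (size == node.length) {
      // Assumed growth policy: double the arrays, but never exceed maxSize.
      int newCapacity = Math.min(maxSize, Math.max(size + 1, size * 2));
      node = Arrays.copyOf(node, newCapacity);
      score = Arrays.copyOf(score, newCapacity);
    }
    node[size] = newNode;
    score[size] = newScore;
    size++;
  }

  int capacity() {
    return node.length;
  }

  int size() {
    return size;
  }
}
```

With `maxSize = 32`, a `beamWidth` of 100 would allocate the full 32 slots up
front and never resize, while a `beamWidth` of 8 would start at 8 slots and
only grow if later inserts need more room.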
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]