Hi Shiva,

This proposal looks good.
I have two questions (sorry if I missed this):

How are we going to handle distributed execution when dealing with top-K ANN?

What does the memory component look like in terms of configurable size? My naive understanding is that a vector index is itself memory-hungry.

Best,
Taewoo

On Tue, Jan 13, 2026 at 2:04 PM Shiva Jahangiri <[email protected]> wrote:

> Hi all,
>
> Initiating discussion to add vector index in AsterixDB to support
> approximate nearest neighbor (ANN) queries.
>
> Feature: Adding Vector Index
>
> Details: Currently AsterixDB does not support approximate nearest queries
> and similarity search on vector embeddings. This proposal suggests the
> first design of a tree-based vector indexing supporting top-k ANN queries
> which is fully compatible with LSM structure of AsterixDB's storage. As
> part of this proposal we provide support for :
>
> * Adding vector distance functions to support K-Nearest Neighbor (KNN)
> queries
> * Adding vector index to support ANN queries
> * Adding support for INCLUDE fields in vector index to better support
> filtered similarity search.
>
> APE:
>
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/ASTERIXDB/APE*31*3A*Vector*Index__;KyUrKw!!MLMg-p0Z!D1Zmu-mN_byA8sV_P_p7aLlbYJIg0b19njsPeaMkVTovtWeW0IsD-CIgOo0MJ7_7t3pZsk63GqP6lfK1$
>
> Thanks,
> Shiva
>
> --
> Shiva Jahangiri
> Assistant Professor in Computer Science and Engineering Department
> Santa Clara University
>
