Hi Shiva,

This proposal looks good.

I have two questions (sorry if I missed the answers in the proposal):

How are we going to handle distributed execution when dealing with top-K
ANN queries?

What does the memory component look like in terms of configurable size? My
naive understanding is that a vector index itself is memory-hungry.

Best,
Taewoo


On Tue, Jan 13, 2026 at 2:04 PM Shiva Jahangiri <[email protected]> wrote:

> Hi all,
>
> Initiating discussion to add vector index in AsterixDB to support
> approximate nearest neighbor (ANN) queries.
>
>  Feature: Adding Vector Index
>
> Details: Currently AsterixDB does not support approximate nearest neighbor
> queries and similarity search on vector embeddings. This proposal suggests
> the first design of a tree-based vector index supporting top-k ANN queries,
> which is fully compatible with the LSM structure of AsterixDB's storage. As
> part of this proposal we provide support for:
>
> * Adding vector distance functions to support K-Nearest Neighbor (KNN)
> queries
> * Adding vector index to support ANN queries
> * Adding support for INCLUDE fields in vector index to better support
> filtered similarity search.
>
> APE:
>
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/ASTERIXDB/APE*31*3A*Vector*Index__;KyUrKw!!MLMg-p0Z!D1Zmu-mN_byA8sV_P_p7aLlbYJIg0b19njsPeaMkVTovtWeW0IsD-CIgOo0MJ7_7t3pZsk63GqP6lfK1$
>
> Thanks,
> Shiva
>
> --
> Shiva Jahangiri
> Assistant Professor in Computer Science and Engineering Department
> Santa Clara University
>
