Hi everyone,
We are trying to perform a vector search with additional conditions and have 
run into several issues. Could you please clarify whether this behavior of 
Cassandra is intended by design, whether we are doing something wrong, or 
whether this might be a bug?

Our case:
We need to get a result based on two conditions: rank rows by similarity to a 
query vector, and keep only the rows whose id2 matches one of a given list of 
IDs (the list cannot be determined in advance).

Our table and indexes:
create table chunks
(
    id             int primary key,
    comment_vector vector<float, 1024>,
    id2            int
);

create custom index chunks_comment_vector_index
    on chunks (comment_vector)
    using 'sai';

create custom index chunks_id2_index
    on chunks (id2)
    using 'sai';

Our query:
SELECT id
FROM chunks where id2 in (33697, 24070, 46119, 21013, 44744, 47011, 98128, 
64183, 57494, 12937) -- There can be any number of IDs
order by comment_vector ANN OF %s
LIMIT 1000 ALLOW FILTERING;

Issue 1:
This query requires "ALLOW FILTERING", but with it, "LIMIT 1000" is applied 
first and the "WHERE" condition only afterwards. Is this the intended behavior, 
or is it a gap? Is there any way to apply WHERE first and LIMIT at the end?
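
To make the difference concrete, here is a minimal Python sketch of the two evaluation orders. It is only a toy model of what we think we observe, not Cassandra's actual implementation; the function names, the cosine similarity measure, and the sample rows are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_by_similarity(rows, query):
    """Return rows sorted from most to least similar to the query vector."""
    return sorted(
        rows,
        key=lambda row: cosine_similarity(row["vec"], query),
        reverse=True,
    )


def limit_then_filter(rows, query, allowed_ids, limit):
    """Take the top `limit` rows first, then drop rows that fail the filter."""
    top = rank_by_similarity(rows, query)[:limit]
    return [row["id"] for row in top if row["id2"] in allowed_ids]


def filter_then_limit(rows, query, allowed_ids, limit):
    """Apply the filter first, then take the top `limit` of what remains."""
    matching = [row for row in rows if row["id2"] in allowed_ids]
    return [row["id"] for row in rank_by_similarity(matching, query)[:limit]]


# Hypothetical toy data: 2-dimensional vectors instead of 1024-dimensional.
rows = [
    {"id": 1, "id2": 10, "vec": [1.0, 0.0]},
    {"id": 2, "id2": 20, "vec": [0.9, 0.1]},
    {"id": 3, "id2": 30, "vec": [0.0, 1.0]},
    {"id": 4, "id2": 30, "vec": [0.5, 0.5]},
]
query = [1.0, 0.0]
allowed = {30}

print(limit_then_filter(rows, query, allowed, limit=2))
print(filter_then_limit(rows, query, allowed, limit=2))
```

In this toy data the two closest rows do not match the filter, so the first order returns nothing even though matching rows exist, while the second order returns them. The second order is the behavior we would like to get.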

Issue 2:
The IN condition requires "ALLOW FILTERING". If the IN condition is replaced 
with "WHERE id2 < 50000", then "ALLOW FILTERING" is not required and the query 
returns a result. Why is that?


Thanks and best regards,
_____________________________________
Artsiom Tarasevich



