Hi All - I have a table in Cassandra 5.0.2 that has several regular columns and a vector column.

I'm trying to run a hybrid query that combines a restriction on a regular column with ANN ordering on the vector column, such as:

select textdata from doc.google_gtr_t5_large where type='type1' order by embeddings ANN of [-0.005542, 0.000996, 0.039524, -0.004628, -0.017905, -0.002265, -0.119871...] limit 10;

This results in:
InvalidRequest: Error from server: code=2200 [Invalid query] message="ANN ordering by vector requires all restricted column(s) to be indexed"

Both the embeddings column and the type column have an SAI index.
Queries without ANN ordering, such as:

select textdata from doc.google_gtr_t5_large where type='type1' and source ='somesource';
work fine.

How can I create a table where I can combine a vector search with a column or columns such as a text or timestamp column?

Full table definition:

CREATE TABLE doc.google_gtr_t5_large (
    uuid text,
    type text,
    fieldname text,
    offset int,
    textdata text,
    creationdate timestamp,
    embeddings vector<float, 768>,
    metadata boolean,
    source text,
    sourceurl text,
    PRIMARY KEY ((uuid, type), fieldname, offset, textdata)
) WITH CLUSTERING ORDER BY (fieldname ASC, offset ASC, textdata ASC)
    AND additional_write_policy = '99p'
    AND allow_auto_snapshot = true
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND memtable = 'default'
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND incremental_backups = true
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

CREATE CUSTOM INDEX ann_index ON doc.google_gtr_t5_large (embeddings) USING 'sai';

CREATE CUSTOM INDEX creationidx ON doc.google_gtr_t5_large (creationdate) USING 'sai';

CREATE CUSTOM INDEX sourceidx ON doc.google_gtr_t5_large (source) USING 'sai';

CREATE CUSTOM INDEX typeidx ON doc.google_gtr_t5_large (type) USING 'sai';

Thank you!

-Joe

