Implementing a secondary index

Claude Warren Wed, 17 Nov 2021 01:17:14 -0800

Greetings,

I am looking to implement a Multidimensional Bloom filter index [1] [2] on
a Cassandra table.  OK, I know that is a lot to take in.  What I need is
any documentation that explains the architecture of the index options, or
someone I can ask questions of -- a mentor if you will.


I have a proof of concept for the index that works from the client side
[3].  What I want to do is move some of that processing to the server
side.

Basically, I think I need to do the following:

   1. On each partition, create an SST to store the index data.  This table
   comprises two integer data points and the primary key for the data table.
   2. When the index cell gets updated in the original table (there will
   only be one column), update one or more rows in the SST table.
   3. When querying, perform multiple queries against the index data, and
   return the primary key values (or the data associated with the primary keys
   -- I am unclear on this bit).
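
A minimal sketch of the three steps above, in Python, with an in-memory dictionary standing in for the index SST. It is an illustration under assumptions, not the server-side design: the message does not say what the two integers are, so the sketch assumes they are a byte position within the Bloom filter and the byte value at that position. The names `BloomIndexSketch`, `update`, `remove`, and `query` are hypothetical, and the query returns primary keys only.

```python
from collections import defaultdict


class BloomIndexSketch:
    """In-memory stand-in for the per-partition index table (hypothetical)."""

    def __init__(self):
        # Assumed index row layout: (byte position, byte value) -> primary keys.
        self.rows = defaultdict(set)
        # Primary key -> Bloom filter bytes currently indexed for that key.
        self.current = {}

    def remove(self, key):
        """Delete every index row that refers to this primary key."""
        old = self.current.pop(key, b"")
        for pos, val in enumerate(old):
            if val:
                self.rows[(pos, val)].discard(key)

    def update(self, key, bloom):
        """Step 2: when the indexed cell changes, rewrite its index rows."""
        self.remove(key)
        for pos, val in enumerate(bloom):
            if val:  # all-zero bytes carry no bits, so no row is stored
                self.rows[(pos, val)].add(key)
        self.current[key] = bytes(bloom)

    def query(self, target):
        """Step 3: one lookup per non-zero target byte, then intersect.

        A stored filter matches when it contains every bit set in target.
        """
        result = None
        for pos, want in enumerate(target):
            if want == 0:
                continue
            matches = set()
            # Any stored byte that is a bitwise superset of the target byte.
            for val in range(256):
                if val & want == want:
                    matches |= self.rows.get((pos, val), set())
            result = matches if result is None else result & matches
            if not result:
                return set()
        # An all-zero target places no constraint, so every key matches.
        return set(self.current) if result is None else result
```

Calling `update(key, bloom_bytes)` on each write and `query(target_bytes)` on each read gives the candidate primary keys. As with any Bloom filter, the candidates can include false positives, so the caller still has to check the actual data.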

Any help or guidance would be appreciated,
Claude

[1] https://archive.org/details/arxiv-1501.01941/mode/2up
[2] https://archive.fosdem.org/2020/schedule/event/bloom_filters/
[3] https://github.com/Claude-at-Instaclustr/blooming_cassandra




-- 


*Claude Warren*

Principal Software Engineer

Instaclustr
