Dear Cassandra users,

We have the following structure in MySQL:
documents -> [doc_id (primary key), title, description]
elements  -> [element_id (primary key), doc_id (index), title, description]

Notation: table_name -> [column1 (key or index), column2, ...]

We want to move this data to Cassandra. Each document can contain a large number of elements (anywhere between 1 and 100k+).

We have two requirements:
a) Load all elements for a given doc_id quickly.
b) Update the value of one individual element quickly.

We were considering the following Cassandra models:

Option A

documents -> [doc_id (primary key), title, description, elements]
elements  -> [element_id (primary key), title, description]

The elements column on documents would hold the element ids; it could be a SET or a TEXT column, and each time new elements are added (they are never removed) we would append their ids to it.

Loading a document:
a) Load the document with the given doc_id and get all of its element ids:
   SELECT * FROM documents WHERE doc_id = 'id';
b) Load all elements with those ids:
   SELECT * FROM elements WHERE element_id IN (ids loaded in query a);

Option B

documents -> [doc_id (primary key), title, description]
elements  -> [element_id (primary key), doc_id (secondary index), title, description]

Loading a document:
a) SELECT * FROM elements WHERE doc_id = 'id';

Neither solution seems good: in Option A, even though we only ever query by primary keys, the second query can end up with 100k+ element ids in its IN clause, and Option B relies on a secondary index, which looks like an anti-pattern in Cassandra.

Could anyone advise us on how to model this use case?

Thank you in advance,
Zoltan
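P.S. In case it helps, here is a rough CQL sketch of the two options as we currently picture them (only one of the two would actually exist, and the text column types are just assumptions, we have not fixed the exact types yet):

   -- Option A: element ids are collected on the document row
   -- (set<text> is an assumption; it could also be a single text column we append to)
   CREATE TABLE documents (
       doc_id      text PRIMARY KEY,
       title       text,
       description text,
       elements    set<text>
   );

   CREATE TABLE elements (
       element_id  text PRIMARY KEY,
       title       text,
       description text
   );

   -- Option B: each element carries its doc_id, with a secondary index on it
   CREATE TABLE documents (
       doc_id      text PRIMARY KEY,
       title       text,
       description text
   );

   CREATE TABLE elements (
       element_id  text PRIMARY KEY,
       doc_id      text,
       title       text,
       description text
   );
   CREATE INDEX ON elements (doc_id);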