I would definitely use dynamic columns for this instead of modifying the schema 
dynamically. Changing the schema at runtime sounds like an anti-pattern to me.



From: toddf...@gmail.com [mailto:toddf...@gmail.com] On Behalf Of Todd Fast
Sent: Monday, October 06, 2014 11:57 PM
To: Cassandra Users
Subject: Dynamic schema modification an anti-pattern?

There is a team at my work building an entity-attribute-value (EAV) store using 
Cassandra. There is a column family, called Entity, where the partition key is 
the UUID of the entity, and the columns are the attribute names with their 
values. Each entity will contain hundreds to thousands of attributes, out of a 
list of up to potentially ten thousand known attribute names.

However, instead of using wide rows with dynamic columns (and serializing type 
info with the value), they are trying to use a static column family and 
modify the schema dynamically as new named attributes are created.
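
To make the wide-row alternative concrete, here is a minimal sketch in Python of 
"serializing type info with the value". It assumes a hypothetical one-byte type 
tag in front of each payload; the tag values and the encode/decode helper names 
are illustrative only and are not part of Cassandra or any driver. Each encoded 
value would be stored as an opaque blob under its attribute name.

```python
import json
import struct

# Hypothetical one-byte type tags (an assumption for this sketch).
TAG_INT, TAG_FLOAT, TAG_TEXT, TAG_LIST = b"i", b"f", b"s", b"l"


def encode(value):
    """Prefix the serialized value with a tag so its type survives in a blob."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to keep the sketch unambiguous.
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        return TAG_INT + struct.pack(">q", value)      # 64-bit signed, big-endian
    if isinstance(value, float):
        return TAG_FLOAT + struct.pack(">d", value)    # IEEE 754 double
    if isinstance(value, str):
        return TAG_TEXT + value.encode("utf-8")
    if isinstance(value, list):
        # Stand-in for a collection column: the list is stored as JSON text.
        return TAG_LIST + json.dumps(value).encode("utf-8")
    raise TypeError("unsupported type: " + type(value).__name__)


def decode(blob):
    """Read the tag, then deserialize the remaining bytes accordingly."""
    tag, payload = blob[:1], blob[1:]
    if tag == TAG_INT:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_FLOAT:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_TEXT:
        return payload.decode("utf-8")
    if tag == TAG_LIST:
        return json.loads(payload.decode("utf-8"))
    raise ValueError("unknown type tag: %r" % tag)


# One entity's row, modeled as attribute name -> tagged blob.
row = {name: encode(v) for name, v in
       {"height": 42, "color": "red", "tags": ["a", "b"]}.items()}
```

The trade-off in this sketch is that the database sees only blobs, so type 
checking and collection-style updates move into the application, which may be 
part of why the team prefers real schema columns.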

(I believe one of the main drivers of this approach is to use collection 
columns for certain attributes, and perhaps to preserve type metadata for a 
given attribute.)

This approach goes against everything I've seen and done in Cassandra, and is 
generally an anti-pattern for most persistence stores, but I want to gather 
feedback before taking the next step with the team.

Do others consider this approach an anti-pattern, and if so, what are the 
practical downsides?

For one, this means that the Entity schema would contain the superset of all 
columns for all rows. What is the impact of having thousands of column names 
in the schema? And what are the implications of modifying the schema 
dynamically on a decent-sized cluster (5 nodes now, growing to 10s later) under 
load?

Thanks,
Todd
