Anti-pattern.  Dynamically altering the schema won't scale and is bad ju ju.
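
For what it's worth, here is a minimal sketch of the alternative described in the quoted message below: one wide row per entity, with type info serialized alongside each value, so new attributes never touch the schema. It is an in-memory stand-in, not driver code. The tag names, the JSON encoding, the EntityStore class, and the table layout in the comment are all assumptions for illustration.

```python
import json
import uuid

# Assumed (hypothetical) table layout this sketch stands in for:
#   CREATE TABLE entity (id uuid, attr text, value text, PRIMARY KEY (id, attr));
# Each attribute is a clustering-key row, so adding a new attribute name
# is an ordinary write and needs no schema change.


def encode_attr(value):
    """Serialize a value together with a type tag (tags are made up here)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        tag = "bool"
    elif isinstance(value, int):
        tag = "int"
    elif isinstance(value, float):
        tag = "float"
    elif isinstance(value, str):
        tag = "text"
    elif isinstance(value, (list, tuple)):
        tag = "list"
        value = list(value)
    else:
        raise TypeError(f"unsupported attribute type: {type(value).__name__}")
    return json.dumps({"t": tag, "v": value})


def decode_attr(blob):
    """Return (type_tag, value) from a string produced by encode_attr."""
    doc = json.loads(blob)
    return doc["t"], doc["v"]


class EntityStore:
    """In-memory stand-in: partition key -> {attribute name -> encoded value}."""

    def __init__(self):
        self._rows = {}

    def put(self, entity_id, name, value):
        self._rows.setdefault(entity_id, {})[name] = encode_attr(value)

    def get(self, entity_id, name):
        return decode_attr(self._rows[entity_id][name])

    def attributes(self, entity_id):
        # Sorted, mirroring how clustering columns come back in order.
        return sorted(self._rows.get(entity_id, {}))


store = EntityStore()
eid = uuid.uuid4()
store.put(eid, "height", 1.8)
store.put(eid, "tags", ["a", "b"])  # list value, no collection column needed
print(store.attributes(eid))
print(store.get(eid, "tags"))
```

The trade-off is that the store itself no longer knows the type of each attribute; the application has to check the tag on every read.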

--
Colin Clark 
+1-320-221-9531
 

> On Oct 6, 2014, at 10:56 PM, Todd Fast <t...@toddfast.com> wrote:
> 
> There is a team at my work building an entity-attribute-value (EAV) store 
> using Cassandra. There is a column family, called Entity, where the partition 
> key is the UUID of the entity, and the columns are the attribute names with 
> their values. Each entity will contain hundreds to thousands of attributes, 
> out of a list of up to potentially ten thousand known attribute names.
> 
> However, instead of using wide rows with dynamic columns (and serializing 
> type info with the value), they are trying to use a static column family and 
> modify the schema dynamically as new named attributes are created.
> 
> (I believe one of the main drivers of this approach is to use collection 
> columns for certain attributes, and perhaps to preserve type metadata for a 
> given attribute.)
> 
> This approach goes against everything I've seen and done in Cassandra, and is 
> generally an anti-pattern for most persistence stores, but I want to gather 
> feedback before taking the next step with the team.
> 
> Do others consider this approach an anti-pattern, and if so, what are the 
> practical downsides?
> 
> For one, this means that the Entity schema would contain the superset of all 
> columns for all rows. What is the impact of having thousands of column names 
> in the schema? And what are the implications of modifying the schema 
> dynamically on a decent-sized cluster (5 nodes now, growing to 10s later) 
> under load?
> 
> Thanks,
> Todd
