Anti-pattern. Dynamically altering the schema won't scale and is bad juju.
--
Colin Clark
+1-320-221-9531

> On Oct 6, 2014, at 10:56 PM, Todd Fast <t...@toddfast.com> wrote:
>
> There is a team at my work building an entity-attribute-value (EAV) store
> using Cassandra. There is a column family, called Entity, where the partition
> key is the UUID of the entity, and the columns are the attribute names with
> their values. Each entity will contain hundreds to thousands of attributes,
> out of a list of up to potentially ten thousand known attribute names.
>
> However, instead of using wide rows with dynamic columns (and serializing
> type info with the value), they are trying to use a static column family and
> modifying the schema dynamically as new named attributes are created.
>
> (I believe one of the main drivers of this approach is to use collection
> columns for certain attributes, and perhaps to preserve type metadata for a
> given attribute.)
>
> This approach goes against everything I've seen and done in Cassandra, and is
> generally an anti-pattern for most persistence stores, but I want to gather
> feedback before taking the next step with the team.
>
> Do others consider this approach an anti-pattern, and if so, what are the
> practical downsides?
>
> For one, this means that the Entity schema would contain the superset of all
> columns for all rows. What is the impact of having thousands of column names
> in the schema? And what are the implications of modifying the schema
> dynamically on a decent-sized cluster (5 nodes now, growing to 10s later)
> under load?
>
> Thanks,
> Todd
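
For reference, here is a rough sketch of the wide-row alternative described in the quoted message: one fixed table whose clustering column is the attribute name, with the type information serialized alongside each value so that new attributes never require a schema change. Everything named here is an assumption made for illustration and does not come from the thread: the table layout in the comment, the JSON-based encoding, the type tags, and the in-memory EntityStore class that stands in for Cassandra so the example runs without a cluster.

```python
import json
import uuid

# Assumed CQL layout for the wide-row variant (names are hypothetical):
#   CREATE TABLE entity (
#       entity_id  uuid,
#       attr_name  text,
#       attr_value blob,
#       PRIMARY KEY (entity_id, attr_name)
#   );
# The schema stays fixed; each new attribute is just another clustered row.


def encode_value(value):
    """Serialize a value together with a type tag (assumed JSON encoding)."""
    if isinstance(value, bool):  # bool must be checked before int
        tag = "bool"
    elif isinstance(value, int):
        tag = "int"
    elif isinstance(value, float):
        tag = "float"
    elif isinstance(value, str):
        tag = "text"
    elif isinstance(value, (list, tuple)):
        tag, value = "list", list(value)
    elif isinstance(value, (set, frozenset)):
        tag, value = "set", sorted(value)  # JSON has no set type
    else:
        raise TypeError(f"unsupported attribute type: {type(value).__name__}")
    return json.dumps({"t": tag, "v": value}).encode("utf-8")


def decode_value(blob):
    """Rebuild the original value from its tagged serialized form."""
    doc = json.loads(blob.decode("utf-8"))
    if doc["t"] == "set":
        return set(doc["v"])
    return doc["v"]


class EntityStore:
    """In-memory stand-in for the table above: partition -> {attr_name: blob}."""

    def __init__(self):
        self._rows = {}

    def put(self, entity_id, attr_name, value):
        # Corresponds to an INSERT of one (entity_id, attr_name, attr_value) row.
        self._rows.setdefault(entity_id, {})[attr_name] = encode_value(value)

    def get(self, entity_id):
        # Corresponds to reading the whole partition, ordered by attr_name.
        row = self._rows.get(entity_id, {})
        return {name: decode_value(blob) for name, blob in sorted(row.items())}


if __name__ == "__main__":
    store = EntityStore()
    eid = uuid.uuid4()
    store.put(eid, "age", 42)
    store.put(eid, "tags", {"a", "b"})  # collection-like attribute, no ALTER TABLE
    print(store.get(eid))
```

The trade-off in this sketch is that collection-valued attributes are read and written as a whole blob rather than updated element by element, which is presumably part of what the collection-column approach was trying to avoid.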