There is a known issue with concurrent schema migrations:
https://issues.apache.org/jira/browse/CASSANDRA-1391
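
Until that is fixed, one workaround is to serialize schema changes on the client side and wait for the cluster to agree on a schema version before issuing the next change. Below is a minimal Python sketch of that idea, not a definitive implementation. The `change` and `describe_versions` callables are hypothetical hooks you would wire to your own client. It assumes `describe_versions` returns a map of schema version to hosts (the shape of the Thrift `describe_schema_versions` call) and that unreachable nodes are reported under an `"UNREACHABLE"` key.

```python
import threading
import time

# Process-wide lock so only one schema change is in flight at a time.
_schema_lock = threading.Lock()


def schema_agreed(versions):
    """Return True when all reachable nodes report a single schema version.

    `versions` maps schema version -> list of hosts (assumed shape).
    The "UNREACHABLE" key is an assumption about how down nodes are reported.
    """
    live = [v for v in versions if v != "UNREACHABLE"]
    return len(live) == 1


def apply_schema_change(change, describe_versions, timeout=30.0, poll=0.5):
    """Run change() under the lock, then block until the cluster agrees.

    Both callables are hypothetical hooks supplied by the caller.
    """
    with _schema_lock:
        change()
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if schema_agreed(describe_versions()):
                return
            time.sleep(poll)
        raise TimeoutError("schema versions did not converge")
```

Note that the lock only serializes changes within one process. If several clients create column families, they would need some external coordination, or all schema changes could be routed through a single process.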

Once they diverge, I think you can delete the schema by removing the 
necessary system files, leaving the data files in place, and then re-creating 
the schema. 

And yes, you should not be creating lots of column families; they are not the 
same as tables. 
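
To illustrate the alternative data model, here is a rough Python sketch in which plain dicts stand in for Cassandra. It keeps every user's documents in one column family and maintains a per-user index, so no schema change is needed when a new user shows up. The names (`documents`, `docs_by_user`, the `user:doc` row key format) are made up for the example and are not part of any Cassandra API.

```python
from collections import defaultdict

# Stand-in for ONE column family shared by all users: row key -> columns.
documents = {}
# Stand-in for the inverted index: user id -> set of document row keys.
docs_by_user = defaultdict(set)


def add_document(user_id, doc_id, body):
    """Store a document and record it in the per-user index."""
    row_key = f"{user_id}:{doc_id}"
    documents[row_key] = {"user": user_id, "body": body}
    docs_by_user[user_id].add(row_key)
    return row_key


def documents_for(user_id):
    """Look up a user's documents through the index, in row key order."""
    return [documents[k] for k in sorted(docs_by_user.get(user_id, ()))]
```

Adding a document for a new user is then just two writes, which is safe to do from many clients in parallel.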

Aaron

On 16 Apr 2011, at 09:13, Alejandro Perez wrote:

> Thanks for the quick response! I will reconsider the schema.
> 
> However, the problem troubles me somehow. How are schema changes supposed to 
> be done? Should I serialize them, should I halt other cluster operations 
> while I do the schema change? Is this a known problem with Cassandra?
> 
> The other question, and I think the more important one for me now: how do I 
> repair the cluster without losing data once the schemas diverge? Right now 
> the only way I have is to erase all data and have the cluster start empty. 
> Should this problem ever happen in production, it's important there's a way 
> to recover the data.
> 
> On Fri, Apr 15, 2011 at 1:57 PM, Dan Hendry <dan.hendry.j...@gmail.com> wrote:
> Uh... don’t create a column family per user. Column families are meant to be 
> fairly static; conceptually equivalent to a table in a relational database. 
> Why do you need (or even want) a CF per user? Reconsider your data model; a 
> single column family with an inverted index for a ‘user’ column is probably 
> more what you are looking for. Operationally, the fewer CFs the better.
> 
>  
> Dan
> 
>  
> From: Alejandro Perez [mailto:sp...@indextank.com] 
> Sent: April-15-11 16:39
> To: user@cassandra.apache.org
> Cc: Support
> Subject: Schemas diverging while dynamically creating CF.
> 
>  
> Hello,
> 
>  
> We're testing Cassandra for integration with IndexTank. In this first try, 
> we're creating one column family for each user. In practice, on the first run 
> and for the first few hundred documents, a new CF is created and a 
> document is immediately added to it. A few (up to 50) requests of this type 
> are issued in parallel (for different column families).
> 
>  
> The end result, which is quite repeatable, is that the cluster splits into 
> different schema versions that never agree.
> 
>  
> Any thoughts?
> 
>  
>  
> Thanks,
> 
>  
> Spike.
> 
> 
> --
> 
> Alejandro Perez
> IndexTank
> 
> follow us @indextank | read our blog | subscribe our user mailing list