Hi Aaron Morton and R. Verlangen,

Thanks for the quick answers. It's good to know about Thrift's limit on
the amount of data it will accept / send.
I know the hard limit is 2 billion columns per row. My question is at
what size a row starts to slow down read/write performance and
maintenance. The blog I referenced says the row size should be kept
under 10MB.

It would be better if Cassandra could transparently shard/split a wide
row and distribute the pieces across many nodes, to help with load
balancing. Are there any other ways to model historical (time-series)
data in Cassandra besides wide-row column slicing? (A sketch of the
manual time-bucket sharding I have in mind is at the end of this
message, below the quoted thread.)

Thanks,
Charlie | Data Solution Architect & Developer
http://mujiang.blogspot.com

On Thu, Feb 16, 2012 at 12:38 AM, aaron morton <aa...@thelastpickle.com> wrote:

> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> I've not read that one but it sounds right. Matt Dennis knows his stuff
> http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
>
> > There is a limit on how big a row can be before update and query
> > performance degrades; that limit is 10MB or less.
> There is no hard limit. Wide rows won't upset writes too much. Some
> read queries can avoid problems, but most will not.
>
> Wide rows are a pain when it comes to maintenance. They take longer to
> compact and repair.
>
> > Is this still true in the latest version of Cassandra? Or in what
> > release will Cassandra remove this limit?
> There is a limit of 2 billion columns per row. There is not a limit of
> 10MB per row. I've seen some rows in the 100s of MB and they are
> always a pain.
>
> > Manually sharding the wide row will increase application complexity;
> > it would be better if Cassandra could handle it transparently.
> It's not that hard :)
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/02/2012, at 7:40 AM, Data Craftsman wrote:
>
> > Hello experts,
> >
> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> >
> > "This (wide row column slicing) works well enough for a while, but
> > over time, this row will get very large. If you are storing sensor
> > data that updates hundreds of times per second, that row will quickly
> > become gigantic and unusable. The answer to that is to shard the data
> > up in some way."
> >
> > There is a limit on how big a row can be before update and query
> > performance degrades; that limit is 10MB or less.
> >
> > Is this still true in the latest version of Cassandra? Or in what
> > release will Cassandra remove this limit?
> >
> > Manually sharding the wide row will increase application complexity;
> > it would be better if Cassandra could handle it transparently.
> >
> > Thanks,
> > Charlie | DBA & Developer
> >
> > p.s. Quora link,
> > http://www.quora.com/Cassandra-database/What-are-good-ways-to-design-data-model-in-Cassandra-for-historical-data
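P.S. To make the manual sharding concrete, here is a rough Python
sketch of the time-bucket approach the rubyscale post hints at ("shard
the data up in some way"). It is only an illustration under assumed
names: "sensor-42", the one-day bucket width, and the plain dict
standing in for a column family are all made up, and the pycassa calls
mentioned in the comments are just the approximate client-side
equivalents, not code from the blog.

import time
from collections import defaultdict

# Width of each time bucket in seconds. One row per sensor per day is
# an assumed starting point; pick a width that keeps each row well
# below the sizes the thread calls painful (100s of MB).
BUCKET_SECONDS = 86400

def bucket_start(ts):
    """Start of the time bucket containing timestamp ts."""
    return int(ts) - (int(ts) % BUCKET_SECONDS)

def row_key(sensor_id, ts):
    """Shard key: sensor id plus bucket start, e.g. 'sensor-42:1329350400'.
    One logical series becomes many physical rows, so no single row
    grows without bound and the buckets spread across the cluster."""
    return "%s:%d" % (sensor_id, bucket_start(ts))

# Stand-in for a Cassandra column family: row key -> {timestamp: value}.
# With a 2012-era client such as pycassa, the rough equivalents would be
# cf.insert(key, {ts: value}) and cf.get(key, column_start=s, column_finish=f).
store = defaultdict(dict)

def write_point(sensor_id, ts, value):
    """Each data point lands in the row for its sensor and time bucket."""
    store[row_key(sensor_id, ts)][ts] = value

def read_range(sensor_id, start_ts, end_ts):
    """Fan out over every bucket the range touches, slicing columns
    within each row (a dict filter here, a column slice in Cassandra)."""
    points = []
    bucket = bucket_start(start_ts)
    while bucket <= end_ts:
        row = store.get("%s:%d" % (sensor_id, bucket), {})
        points.extend((ts, v) for ts, v in sorted(row.items())
                      if start_ts <= ts <= end_ts)
        bucket += BUCKET_SECONDS
    return points

if __name__ == "__main__":
    now = int(time.time())
    for i in range(10):
        write_point("sensor-42", now + i, i * 1.5)
    print(read_range("sensor-42", now, now + 4))

The trade-off is the fan-out on reads: a range query touches one row
per bucket it crosses, so the bucket width is a balance between row
size and the number of rows a typical query has to hit.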