> I think the limit of the size per row in cassandra is 2G?
That was a pre-0.7 restriction: http://wiki.apache.org/cassandra/CassandraLimitations
> and I insert 10000 columns into a row, each column has a 1MB data.
So a single row with 10GB of data. That's what we call a big one.

> /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/socket.rb:109:in
> `read': CassandraThrift::Cassandra::Client::TransportException
> from
> /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/base_transport.rb:87:in
> `read_all'
I would expect to see a string description there as well. Check the server-side error logs.

> this script crashed again, same error message. And cassandra process remain
> in 100% cpu usage.
Counting columns involves reading them, so you are asking Cassandra to read 10GB of data. This will take a while (a sketch of doing the count in pages is at the end of this mail).

It's probably the size of the row that is causing problems. You can easily have rows with millions of columns (here is an experiment that uses 10MM cols in a row: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/).

In general you will want to avoid rows with more than, say, 32 or 64 MB of data. It's not a hard restriction, but big rows cause issues and it's often easier to avoid them.
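To make that concrete, here is a minimal sketch of bucketing the write from the script quoted below so that no single row grows past roughly 32MB. It reuses the Cas_client / TestColumnFamily / TestRow names from that script; the 32-columns-per-row bucket size and the "#{TestRow}_#{bucket}" row-key scheme are illustrative choices only, not anything Cassandra requires.

    # Sketch only: spread the 10000 x ~1MB columns over many rows
    # so no single row grows past ~32MB.
    COLUMNS_PER_ROW = 32   # ~32 x 1MB per row

    10000.times do |i|
      bucket  = i / COLUMNS_PER_ROW            # which sub-row this column lands in
      row_key = "#{TestRow}_#{bucket}"         # e.g. "TestRow_0", "TestRow_1", ...
      key     = rand(36**8).to_s(36)
      value   = rand(36**1024).to_s(36) * 1024 # ~1MB value, as in the original script
      Cas_client.insert(TestColumnFamily, row_key, { key => value })
    end

Reads then address one bucket at a time (or loop over the buckets and combine the results), so each request only has to touch a few tens of MB.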
Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/08/2012, at 8:15 PM, Chuan-Heng Hsiao <hsiao.chuanh...@gmail.com> wrote:

> I think the limit of the size per row in cassandra is 2G?
>
> 10000 x 1M = 10G.
>
> Hsiao
>
> On Mon, Aug 20, 2012 at 1:07 PM, oupfevph <oupfe...@yahoo.com> wrote:
> I setup cassandra with default configuration in clean AWS instance, and I
> insert 10000 columns into a row, each column has a 1MB data. I use this
> ruby (version 1.9.3) script:
>
> 10000.times do
>   key = rand(36**8).to_s(36)
>   value = rand(36**1024).to_s(36) * 1024
>   Cas_client.insert(TestColumnFamily,TestRow,{key=>value})
> end
>
> every time I run this script, it will crash:
>
> /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/socket.rb:109:in `read': CassandraThrift::Cassandra::Client::TransportException
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/base_transport.rb:87:in `read_all'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:104:in `read_frame'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:69:in `read_into_buffer'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `read_message_begin'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `receive_message'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:251:in `recv_batch_mutate'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:243:in `batch_mutate'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:150:in `handled_proxy'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:60:in `batch_mutate'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/protocol.rb:7:in `_mutate'
>   from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/cassandra.rb:463:in `insert'
>   from a.rb:6:in `block in <main>'
>   from a.rb:3:in `times'
>   from a.rb:3:in `<main>'
>
> yet cassandra performs normally, then I run another ruby script to get how
> many columns I have inserted:
>
> p cas_client.count_columns(TestColumnFamily,TestRow)
>
> this script crashed again, same error message. And cassandra process remain
> in 100% cpu usage.
>
> AWS m1.xlarge type instance (15GB mem, 800GB harddisk, 4 core cpu)
> cassandra-1.1.2
> ruby-1.9.3-p194
> jdk-7u6-linux-x64
> ruby gems:
>   cassandra (0.15.0)
>   thrift (0.8.0)
>   thrift_client (0.8.1)
>
> What is the problem?
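For the counting step discussed above, here is a hedged sketch of counting the row in pages rather than with a single count_columns call over the whole row. It still has to read every column (that cost is unavoidable), but each request stays small. It assumes the cassandra gem's get accepts :start and :count options, as the 0.1x versions do; treat the exact option names as something to verify against the gem version in use.

    # Sketch only: count columns in a wide row page by page.
    PAGE  = 10   # with ~1MB columns, a small page keeps each response modest
    total = 0
    start = ''

    loop do
      page  = cas_client.get(TestColumnFamily, TestRow, :start => start, :count => PAGE)
      names = page.keys
      # After the first page, the first column returned repeats the previous
      # page's last column, so drop it before counting.
      names.shift if start != '' && names.first == start
      break if names.empty?
      total += names.length
      start = page.keys.last
      break if page.length < PAGE
    end

    p total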