Hi Aaron,

1. 10ms is the maximum timeout value we can accept.
2. It is random key access, not a range scan.
3. We have only one column family in that keyspace, and we select columns by name.
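On point 1: a quick way to quantify how many local reads blow that 10ms budget is to sum the cfhistograms latency buckets Aaron mentioned. The helper and sample bucket counts below are made up for illustration, not output from our cluster:

```python
# Hypothetical helper for interpreting `nodetool cfhistograms` output:
# given (latency_offset_in_microseconds, read_count) pairs from the
# Read Latency column, report the fraction of local reads whose latency
# bucket lies above the client timeout budget.

def fraction_over_budget(buckets, budget_us=10_000):
    """buckets: iterable of (offset_us, count) pairs.
    Returns the fraction of reads slower than budget_us."""
    total = sum(count for _, count in buckets)
    slow = sum(count for offset, count in buckets if offset > budget_us)
    return slow / total if total else 0.0

# Sample counts are invented; the offsets follow Cassandra's standard
# histogram bucket boundaries (microseconds).
sample = [(103, 500), (1109, 300), (9887, 150), (11864, 40), (17084, 10)]
print(fraction_over_budget(sample))  # 0.05 -> 5% of reads exceed 10ms
```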
Thanks.

Best wishes,
Stanley Xu

On Fri, Apr 19, 2013 at 2:22 AM, aaron morton <aa...@thelastpickle.com> wrote:

> > Is it possible that we could make some configuration so there would be
> > something like a memtable queue in memory, like there are 4 memtables
> > in memory, mem1, mem2, mem3, mem4 based on time series, and Cassandra
> > will flush mem1, and once a mem5 is full, it will flush mem2. Is that
> > possible?
>
> No.
>
> > We were using Cassandra for this with 40 QPS of read before, but once
> > the QPS of reads increased, it looks like the IO_WAIT of the system
> > increased heavily and we got a lot of timeouts on queries (we set 10ms
> > as the timeout).
>
> Look at the cfhistograms for the CF. In the Read Latency column, the
> number on the left is microseconds and the number in the column is how
> many local reads took that long. Also look at the SSTables column,
> which is the number of SSTables involved in each read.
>
> Consider increasing the rpc_timeout to reduce the timeout errors until
> you reduce the read latency.
>
> Is the read a range scan or a select by row key?
> When you do the read, do you select all columns in the row or do you
> select columns by name? The latter is more performant.
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/04/2013, at 12:22 AM, Stanley Xu <wenhao...@gmail.com> wrote:
>
> > Dear buddies,
> >
> > We are using Cassandra to handle a scenario like the following:
> >
> > 1. A table using a Long as the key, with one and only one Integer
> > column, and 2 hours as the TTL.
> > 2. The wps (writes per second) is 45,000; the qps (reads per second)
> > is about 30-200.
> > 3. There isn't a "hot zone" for reads (each query is for a different
> > key), but most reads hit writes from the last 30 minutes.
> > 4. All writes are new keys with new values, no overwrites.
> >
> > We were using Cassandra for this with 40 QPS of read before, but once
> > the QPS of reads increased, it looks like the IO_WAIT of the system
> > increased heavily and we got a lot of timeouts on queries (we set 10ms
> > as the timeout).
> >
> > Per my understanding, the main reason is that most of the queries hit
> > the disk with our configuration.
> >
> > I am wondering if the following would help us handle the load.
> >
> > 1. Increase the size of the memtable, so most reads will be served
> > from the memtable, and since the memtable hasn't been flushed to disk
> > yet, a query against the sstables will be filtered out by the bloom
> > filter, so no disk seek will happen.
> >
> > But our major concern is that once a large memtable is flushed to
> > disk, the new incoming queries will all go to disk and the timeout
> > crash will still happen.
> >
> > Is it possible that we could make some configuration so there would be
> > something like a memtable queue in memory, like there are 4 memtables
> > in memory, mem1, mem2, mem3, mem4 based on time series, and Cassandra
> > will flush mem1, and once a mem5 is full, it will flush mem2. Is that
> > possible?
> >
> > Best wishes,
> > Stanley Xu
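Since the thread keeps coming back to whether the 30-minute hot set can be kept memory-resident, here is a back-of-envelope sizing sketch from the numbers above. The 64-byte per-row overhead is my assumption (memtable and index bookkeeping varies by Cassandra version), not a measured figure:

```python
# Rough sizing of the hot set that would need to stay in memory
# (memtables plus page cache) for reads to avoid disk.
# Numbers from the thread: 45,000 writes/sec, reads mostly hit the
# last 30 minutes, each row is a Long key with one Integer column.

WRITES_PER_SEC = 45_000
HOT_WINDOW_SEC = 30 * 60      # reads mostly hit the last 30 minutes
KEY_BYTES = 8                 # Long row key
VALUE_BYTES = 4               # single Integer column
OVERHEAD_BYTES = 64           # assumed per-row bookkeeping cost

hot_rows = WRITES_PER_SEC * HOT_WINDOW_SEC
hot_bytes = hot_rows * (KEY_BYTES + VALUE_BYTES + OVERHEAD_BYTES)

print(f"hot rows: {hot_rows:,}")               # hot rows: 81,000,000
print(f"hot set: ~{hot_bytes / 2**30:.1f} GiB")  # hot set: ~5.7 GiB
```

Even under these assumptions the hot set is several GiB, which is why a single larger memtable only delays the problem: the moment it flushes, the whole window of recent writes moves to disk at once.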