Re: Heavy writes ok for single node, but failed for cluster

2011-04-28 Thread Sheng Chen
bytes) 2011/4/28 Jonathan Ellis > This means a node was too busy with something else to send out its > heartbeat. Sometimes this is STW GC. Other times it is a bug (one was > fixed for 0.7.6 in > https://issues.apache.org/jira/browse/CASSANDRA-2554). > > On Thu, Apr 28, 2011 at 3

Re: Heavy writes ok for single node, but failed for cluster

2011-04-28 Thread Sheng Chen
at 10:32 AM, Sheng Chen > wrote: > > I succeeded to insert 1 billion records into a single node cassandra, > >>> bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 > > Inserts finished in about 14 hours at a speed of 20k/sec. > > But when I

Heavy writes ok for single node, but failed for cluster

2011-04-27 Thread Sheng Chen
I succeeded to insert 1 billion records into a single node cassandra, >> bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 Inserts finished in about 14 hours at a speed of 20k/sec. But when I added another node, tests always failed with UnavailableException in an hour. >> bin/stress

Re: Stress tests failed with secondary index

2011-04-06 Thread Sheng Chen
happens with > secondary indexes. Consider things like > - reducing the throughput > - reducing the number of clients > - ensuring the clients are connecting to all nodes in the cluster. > > You will probably find some logs about dropped messages on some nodes. > Aaron > > On

Re: Compaction threshold does not save with nodetool

2011-04-06 Thread Sheng Chen
ou have to use the cli and the > ‘update column family X with min_compaction_threshold=Y and > max_compaction_threshold=X’ command. > > > > Dan > > > > *From:* Sheng Chen [mailto:chensheng2...@gmail.com] > *Sent:* April-06-11 1:42 > *To:* user@cassandra.apache

Re: Test idea on cassandra

2011-04-06 Thread Sheng Chen
Stress tools in contrib directory use multiple threads/processes. 2011/4/7 Mengchen Yu > I'm trying to simulate a multi-user scenario. The reason why I > want to use MPJ is to create different processes act like individual > users. Do any one have idea how to do this clearly? > Sorry for duplica

Stress tests failed with secondary index

2011-04-06 Thread Sheng Chen
I used py_stress module to insert 10m test data with a secondary index. I got the following exceptions. # python stress.py -d xxx -o insert -n 1000 -c 5 -s 34 -C 5 -x keys total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time 265322,26532,26541,0.00186140829433,10 630300,36497,3650

Compaction threshold does not save with nodetool

2011-04-05 Thread Sheng Chen
Cassandra 0.7.4 # nodetool -h localhost getcompactionthreshold Keyspace1 Standard1 min=4 max=32 # nodetool -h localhost setcompactionthreshold Keyspace1 Standard1 0 0 # nodetool -h localhost getcompactionthreshold Keyspace1 Standard1 min=0 max=0 Now the thresholds have changed on the JMX panel,

Re: Endless minor compactions after heavy inserts

2011-04-03 Thread Sheng Chen
on > > > > On 2 Apr 2011, at 12:45, Sheng Chen wrote: > > Thank you very much. > > The major compaction will merge everything into one big file, which would > be very large. > Is there any way to control the number or size of files created by major > compaction?

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread Sheng Chen
et the best out of your 10 disks will be to use a > dedicated mirror for the commit log and a stripe set for the data. > > Hope that helps. > Aaron > > On 1 Apr 2011, at 14:52, Sheng Chen wrote: > > > I've got a single node of cassandra 0.7.4, and I used the java str

Endless minor compactions after heavy inserts

2011-03-31 Thread Sheng Chen
I've got a single node of cassandra 0.7.4, and I used the java stress tool to insert about 100 million records. The inserts took about 6 hours (45k inserts/sec) but the following minor compactions last for 2 days and the pending compaction jobs are still increasing. From jconsole I can read the M

Re: newbie question: how do I know the total number of rows of a cf?

2011-03-30 Thread Sheng Chen
I just found an estimateKeys() method of the ColumnFamilyStoreMBean. Is there any indication about how it works? Sheng 2011/3/28 Sheng Chen > Hi all, > I want to know how many records I am holding in Cassandra, just like > count(*) in sql. > What can I do ? Thank you. > > Sheng > > >

Re: Compaction doubles disk space

2011-03-30 Thread Sheng Chen
But right so long as unused space is freed when > needed it's working as designed AFAIK. > > That's my understanding, hope it helps explain why it works that way. > Aaron > > On 30 Mar 2011, at 13:32, Sheng Chen wrote: > > Yes. > I think at least we can remove the tombstones

Re: Compaction doubles disk space

2011-03-29 Thread Sheng Chen
Yes. I think at least we can remove the tombstones for each sstable first, and then do the merge. 2011/3/29 Karl Hiramoto > Would it be possible to improve the current compaction disk space issue by > compacting only a few SSTables at a time then immediately deleting the > old one? Looking

Re: Compaction doubles disk space

2011-03-29 Thread Sheng Chen
From a previous thread of the same topic, I used a force GC and the extra spaces are released. What about my second question? 2011/3/29 Sheng Chen > I use 'nodetool compact' command to start a compaction. > I can understand that extra disk spaces are required during th

Compaction doubles disk space

2011-03-29 Thread Sheng Chen
I use 'nodetool compact' command to start a compaction. I can understand that extra disk spaces are required during the compaction, but after the compaction, the extra spaces are not released. Before compaction: SSTable count: 10 space used (live): 19G space used (total): 21G After compaction: ss

Re: newbie question: how do I know the total number of rows of a cf?

2011-03-29 Thread Sheng Chen
>> hold the answer for you. > >> > >> - Stephen > >> > >> --- > >> Sent from my Android phone, so random spelling mistakes, random nonsense > >> words and other nonsense are a direct result of using swype to type on > the > >> screen > >> > >> On 28 Mar 2011 07:40, "Sheng Chen" wrote: > >>> Hi all, > >>> I want to know how many records I am holding in Cassandra, just like > >>> count(*) in sql. > >>> What can I do ? Thank you. > >>> > >>> Sheng > >> > > > > > > > > -- > > http://twitter.com/jpartogi > > >

newbie question: how do I know the total number of rows of a cf?

2011-03-27 Thread Sheng Chen
Hi all, I want to know how many records I am holding in Cassandra, just like count(*) in sql. What can I do ? Thank you. Sheng

Re: stress.py bug?

2011-03-22 Thread Sheng Chen
I am just wondering, why the stress test tools (python, java) need more threads ? Is the bottleneck of a single thread in the client, or in the server? Thanks. Sean 2011/3/22 Ryan King > On Mon, Mar 21, 2011 at 4:02 AM, pob wrote: > > Hi, > > I'm inserting data from client node with stress.py