Reg CassandraBulkuploader

2011-08-31 Thread Thamizh
Hi All, I am using Cassandra-0.7.8 and trying to run "CassandraBulkLoader.java" at . But it ended up with below error. I have kept cassandra.yaml file on HDFS at /data/conf/cassandra.yaml. FYR, attached the CassandraBulkUploader.java. So, I made below Hadoop specific changes on DistributedCa

Cassandra BETWEEN & ORDER BY operations

2011-08-31 Thread Thamizh
Hi All, I want to perform SQL-style operations such as BETWEEN and ORDER BY with ASC/DESC order on Cassandra-0.7.8. As far as I know, Cassandra-0.7.8 does not have direct support for these operations. Kindly let me know if there is a way to accomplish these by tweaking the secondary index. Below is my Data model

Re: Updates lost

2011-08-31 Thread Jim Ancona
You could also look at Hector's approach in: https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java It works well and I believe there was some performance testing done on it as well. Jim On Tue, Aug 30, 2011 at
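
A minimal sketch of the general idea behind a synchronized microsecond clock, for readers who do not want to open the link. It is not a port of the Hector class; the class and method names are hypothetical. It only shows one way to hand out strictly increasing microsecond timestamps so that two writes issued back to back never share a timestamp.

```python
import threading
import time


class MicrosecondsSyncClock:
    """Hypothetical helper: strictly increasing microsecond timestamps."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def timestamp(self):
        # Wall-clock time in microseconds since the epoch.
        now = int(time.time() * 1_000_000)
        with self._lock:
            # If the clock did not advance (or went backwards), bump the
            # previous value by one so every caller gets a unique, larger value.
            if now <= self._last:
                now = self._last + 1
            self._last = now
            return now
```

Each client process would keep one such object and pass `clock.timestamp()` as the column timestamp. This only orders writes within a single process; clocks on different machines still need to be kept in sync separately.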

Re: Updates lost

2011-08-31 Thread Jiang Chen
Cheers. That would be another solution. On Wed, Aug 31, 2011 at 10:42 AM, Jim Ancona wrote: > You could also look at Hector's approach in: > https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java > > It works wel

Re: hw requirements

2011-08-31 Thread Maxim Potekhin
Plenty of comments in this thread already, and I agree with those saying "it depends". From my experience, a cluster with 18 spindles total could not match the performance and throughput of our primary Oracle server which had 108 spindles. After we upgraded to SSD, things have definitely changed f

Re: hw requirements

2011-08-31 Thread Terje Marthinussen
SSDs definitely make life simpler as you will get a lot less trouble with impact from things like compactions. Just beware that Cassandra expands data a lot due to storage overhead (for small columns), replication and needed space for compactions and repairs. It is well worth doing some rea

Trying to understand QUORUM and Strategies

2011-08-31 Thread Anthony Ikeda
Okay, we are looking at setting up a production environment which means getting our quorum settings and strategies correct. However, we need to really understand the approach taken to get this right. So far we have been working with co-located nodes and our prod environment is going to be distribut

Re: hw requirements

2011-08-31 Thread Anthony Ikeda
Sorry to fork this topic, but in "composite indexes" do you mean as strings or as "Composite()". I only ask cause we have started using the Composite as rowkeys and column names to replace the use of concatenated strings mainly for lookup purposes. Anthony On Wed, Aug 31, 2011 at 10:27 AM, Maxim

nodetool tpstats feature request

2011-08-31 Thread David Hawthorne
It would be very useful to be able to get refreshing statistics from tpstats, a la top. nodetool -h localhost tpstats [n] refresh every second, show me the Active and Pending and Blocked columns as they currently exist, but under Completed show me a per-second rate based on the delta from the

Re: nodetool tpstats feature request

2011-08-31 Thread Peter Sanford
I use `watch` to do this: watch -n 5 nodetool -h localhost tpstats -psanford On Wed, Aug 31, 2011 at 1:59 PM, David Hawthorne wrote: > It would be very useful to be able to get refreshing statistics from tpstats, > a la top. > > nodetool -h localhost tpstats [n] > > refresh every second, show

Re: Trying to understand QUORUM and Strategies

2011-08-31 Thread Evgeniy Ryabitskiy
Hi, actually you can use the LOCAL_QUORUM and EACH_QUORUM policies everywhere on DEV/QA/Prod. It would even be better for integration tests to use the same consistency level as in production. For production with multiple DCs you usually need to choose between 2 common solutions: Geographical Distribution or D
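
As a rough illustration of what these consistency levels require, the sketch below computes how many replicas must acknowledge a request. A quorum is floor(RF / 2) + 1. The data center names and replication factors are assumptions chosen for the example, not values from this thread.

```python
def quorum(replication_factor):
    """Replicas needed for a majority of `replication_factor` copies."""
    return replication_factor // 2 + 1


def required_acks(rf_by_dc, local_dc):
    """Replica acknowledgements needed per consistency level.

    rf_by_dc maps a data center name to its replication factor.
    """
    return {
        # QUORUM: majority of all replicas, regardless of data center.
        "QUORUM": quorum(sum(rf_by_dc.values())),
        # LOCAL_QUORUM: majority of the replicas in the coordinator's DC.
        "LOCAL_QUORUM": quorum(rf_by_dc[local_dc]),
        # EACH_QUORUM: a majority in every data center.
        "EACH_QUORUM": {dc: quorum(rf) for dc, rf in rf_by_dc.items()},
    }


# Assumed layout: two data centers with three replicas each.
example = required_acks({"DC1": 3, "DC2": 3}, local_dc="DC1")
```

With this assumed layout, LOCAL_QUORUM waits for 2 replicas in DC1 only, while QUORUM needs 4 of the 6 replicas and so may have to wait on the remote data center.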

Re: nodetool tpstats feature request

2011-08-31 Thread David Hawthorne
watch doesn't calculate diffs On Aug 31, 2011, at 2:29 PM, Peter Sanford wrote: > I use `watch` to do this: > > watch -n 5 nodetool -h localhost tpstats > > -psanford > > On Wed, Aug 31, 2011 at 1:59 PM, David Hawthorne > wrote: >> It would be very useful to be able to get refreshing statist
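
Until something like this exists in nodetool, the diff can be computed outside it. The following is a sketch only: it assumes each tpstats data row is a pool name followed by integer columns in the order Active, Pending, Completed, which may not match every Cassandra version, and the function names are made up for this example.

```python
import subprocess
import time


def parse_tpstats(text):
    """Return {pool_name: (active, pending, completed)} from tpstats text.

    Assumes a data row is a pool name followed by at least three integer
    columns (Active, Pending, Completed); header and other rows are skipped.
    """
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and all(p.isdigit() for p in parts[1:4]):
            pools[parts[0]] = tuple(int(p) for p in parts[1:4])
    return pools


def completed_rates(previous, current, interval_seconds):
    """Per-second Completed rate for every pool present in both samples."""
    return {
        name: (current[name][2] - previous[name][2]) / interval_seconds
        for name in current
        if name in previous
    }


def watch_tpstats(host="localhost", interval_seconds=5):
    """Print Active, Pending and Completed/s in a loop; stop with Ctrl-C.

    The rate uses the nominal sleep interval, so it is approximate.
    """
    previous = None
    while True:
        output = subprocess.run(
            ["nodetool", "-h", host, "tpstats"],
            capture_output=True, text=True, check=True,
        ).stdout
        current = parse_tpstats(output)
        if previous is not None:
            rates = completed_rates(previous, current, interval_seconds)
            for name, rate in sorted(rates.items()):
                active, pending, _ = current[name]
                print(f"{name:<28}{active:>8}{pending:>9}{rate:>12.1f}/s")
            print()
        previous = current
        time.sleep(interval_seconds)
```

Calling `watch_tpstats("localhost", 1)` would refresh roughly every second. The parsing and rate functions are kept separate from the loop so they can be checked against saved tpstats output without a running node.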

Cluster properties via Thrift?

2011-08-31 Thread Stephen McKamey
Is there any way to initialize via thrift cluster configuration properties (e.g., partitioner)? Do you always have to set these values via the cassandra.yaml file? I'm trying to programmatically build out all of my schema so that I can build out a fresh install just by running the app.

Re: Cluster properties via Thrift?

2011-08-31 Thread Eric Evans
On Wed, Aug 31, 2011 at 6:56 PM, Stephen McKamey wrote: > Is there any way to initialize via thrift cluster configuration properties > (e.g., partitioner)? Do you always have to set these values via the > cassandra.yaml file? I'm trying to programmatically build out all of my > schema so that I ca

15 seconds to increment 17k keys?

2011-08-31 Thread Ian Danforth
All, I've got a 4 node cluster (ec2 m1.large instances, replication = 3) that has one primary counter type column family, that has one column in the family. There are millions of rows. Each operation consists of doing a batch_insert through pycassa, which increments ~17k keys. A majority of these

Re: 15 seconds to increment 17k keys?

2011-08-31 Thread Yang
1ms per add operation is the general order of magnitude I have seen with my tests. On Wed, Aug 31, 2011 at 6:04 PM, Ian Danforth wrote: > All, > > I've got a 4 node cluster (ec2 m1.large instances, replication = 3) > that has one primary counter type column family, that has one column > in th
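
A back-of-the-envelope check of the two numbers in this thread, assuming the increments are applied one after another with no parallelism (an assumption, not something either poster stated). The function name and the `parallelism` parameter are hypothetical.

```python
def estimated_batch_seconds(key_count, ms_per_add, parallelism=1):
    """Estimated wall time if each add costs ms_per_add milliseconds."""
    return key_count * ms_per_add / 1000.0 / parallelism


# ~17k keys at ~1 ms each, fully serial.
serial_estimate = estimated_batch_seconds(17000, 1.0)
```

Under that assumption the estimate is 17 seconds, the same order of magnitude as the 15 seconds reported in the subject line.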

randompartitioner cluster unbalanced

2011-08-31 Thread David Hawthorne
$ ./nodetool -h localhost ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               136112946768375385385349842972707284580
10.0.0.57       datacent
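
For reference when reading ring output like this: RandomPartitioner tokens span 0 to 2^127, and a single-data-center ring is balanced when node i of N owns token i * 2^127 / N. The sketch below computes those tokens; the function name is hypothetical and the node count is left as a parameter because the full ring is not shown in the excerpt.

```python
def balanced_tokens(node_count):
    """Evenly spaced RandomPartitioner tokens for node_count nodes."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


# Print one token per node for an assumed 4-node ring.
for index, token in enumerate(balanced_tokens(4)):
    print(index, token)
```

Comparing these values with the Token column shows which nodes sit off their ideal position; a node can then be relocated with `nodetool move <token>`.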

Unable to link C library (for jna.jar) on 0.7.5

2011-08-31 Thread Eric Czech
Hi everybody, I'm running cassandra 0.7.5 on about 20 RHEL 5 (24 GB RAM) machines and I'm having issues with snapshots, json sstable conversions, and various nodetool commands due to memory errors and the lack of the native access C libraries. I tried putting jna.jar on the classpath but I'm stil