How to generate tokens for my two node Cassandra cluster?

2013-11-01 Thread Techy Teck
I am trying to set up a two-node Cassandra cluster on Windows machines. I
have two Windows machines, and I was following this DataStax tutorial (
http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows
)

Whenever I run the command below, from the tutorial, to generate the
tokens -

python -c "num=2; print ""\n"".join([(""token %d: %d""
%(i,(i*(2**127)/num))) for i in range(0,num)])"


I always get this error -

C:\Users\username>python -c "num=2; print ""\n"".join([(""token %d:
%d"" %(i,(i*(2**127)/num))) for i
in range(0,num)])"
  File "", line 1
num=2; print "\n".join([("token %d: %d" %(i,(i*(2**127)/num))) for
i in range(0,num)])
^
SyntaxError: invalid syntax


Not able to form a Cassandra cluster of two nodes in Windows?

2013-11-01 Thread Techy Teck
I am trying to set up a two-node Cassandra cluster on my Windows machines.
Basically, I have two Windows machines, and on both of them I have
installed Cassandra 1.2.11 from DataStax. I was following this
[tutorial](
http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows)
to set up the two-node cluster.

After installing Cassandra on those two machines, I stopped the services
for the Cassandra server, DataStax OpsCenter, and the DataStax OpsCenter
agent on both machines.

Then I started making changes in cassandra.yaml -

My First Node details are -

initial_token: 0
seeds: "10.0.0.4"
listen_address: 10.0.0.4   #IP of Machine - A (Wireless LAN adapter
Wireless Network Connection)
rpc_address: 10.0.0.4

My Second Node details are -

initial_token: 0
seeds: "10.0.0.4"
listen_address: 10.0.0.7   #IP of Machine - B (Wireless LAN adapter
Wireless Network Connection)
rpc_address: 10.0.0.7

Both of my servers start up properly after I start the Cassandra service.
But somehow they are not forming a two-node cluster. Is there anything I
am missing here?

Machine-A Nodetool Information-

Datacenter: datacenter1
==
Replicas: 1

Address   Rack    Status  State   Load      Owns      Token
10.0.0.4  rack1   Up      Normal  212.1 KB  100.00%   5264744098649860606

Machine-B Nodetool Information-

Starting NodeTool

Datacenter: datacenter1
==
Replicas: 1

Address   Rack    Status  State   Load      Owns      Token
10.0.0.7  rack1   Up      Normal  68.46 KB  100.00%   407804996740764696


Check out if Cassandra ready

2013-11-01 Thread Salih Kardan
Hi all,

I am a newbie to Cassandra and I am trying to write test cases for
Cassandra with JUnit.
I use the CassandraDaemon class to start Cassandra in IntelliJ IDEA. I want
to wait until Cassandra is up and running before running the test methods.
How can I wait until Cassandra starts, or is there any way to check whether
Cassandra is running (from Java)?

Thanks.
Salih Kardan


Re: High loads only on one node in the cluster

2013-11-01 Thread Ashish Tyagi
Hi Evan,

The clients connect to all nodes. We tried shutting down the Thrift server
on the affected node. Loads did not come down.



On Fri, Nov 1, 2013 at 12:59 AM, Evan Weaver  wrote:

> Are all your clients only connecting to your first node? I would
> probably strace it and compare the trace to one from a lightly loaded
> node.
>
> On Thu, Oct 31, 2013 at 7:12 PM, Ashish Tyagi 
> wrote:
> > We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in
> the
> > other. All machines are Amazon M1.XLarge configuration.
> >
> > Datacenter: DC1
> > ==
> > Address RackStatus State   LoadOwns
> > Token
> >
> > ip11  1b  Up Normal  76.46 GB16.67%  0
> > ip12  1b  Up Normal  44.66 GB16.67%
> > 28356863910078205288614550619314017621
> > ip13  1c  Up Normal  85.94 GB16.67%
> > 56713727820156410577229101238628035241
> > ip14  1c  Up Normal  17.55 GB16.67%
> > 85070591730234615865843651857942052863
> > ip15  1d  Up Normal  80.74 GB16.67%
> > 113427455640312821154458202477256070484
> > ip16  1d  Up Normal  20.88 GB16.67%
> > 141784319550391026443072753096570088105
> >
> > Datacenter: DC2
> > ==
> > Address RackStatus State   LoadOwns
> > Token
> >
> > ip21  1a  Up Normal  78.32 GB0.00%   1001
> > ip22  1b  Up Normal  71.23 GB0.00%
> > 56713727820156410577229101238628036241
> > ip23  1b  Up Normal  53.49 GB0.00%
> > 113427455640312821154458202477256071484
> >
> > Problem is that node with ip address: ip11 often has 5-10 times more load
> > than any other node. Most of the operations are on counters. The primary
> > column family (which receives most writes) has a replication factor of 2
> in
> > DataCenter DC1 and also in DataCenter DC2. The traffic is write heavy
> (reads
> > are less than 10% of total requests). We are using size-tiered
> compaction.
> > Both writes and reads happen with a consistency factor of LOCAL_QUORUM.
> >
> > More information:
> >
> > 1. cassandra.yaml - http://pastebin.com/u344fA6z
> > 2. Jmap heap when node under high loads - http://pastebin.com/ib3D0Pa
> > 3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
> > 4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
> > 5. GC log lines -  http://pastebin.com/Y0TKphsm
> >
> > Am I doing anything wrong. Any pointers will be appreciated.
> >
> > Thanks in advance,
> > Ashish
>


Re: High loads only on one node in the cluster

2013-11-01 Thread Rakesh Rajan
@Tyler / @Rob,

As Ashish mentioned earlier, we have 9 nodes on AWS - 6 on the East Coast
and 3 in Singapore. All 9 nodes use EC2Snitch. The current ring (across all
nodes in the 2 DCs) looks like this:

ip11 - East Coast - m1.xlarge / us-east-1b - Size: 83 GB - Token: 0
ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Size: 88 GB - Token: 1001
ip12 - East Coast - m1.xlarge / us-east-1b - Size: 45 GB -
Token: 28356863910078205288614550619314017621
ip13 - East Coast - m1.xlarge / us-east-1c - Size: 93 GB -
Token: 56713727820156410577229101238628035241
ip22 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 73 GB -
Token: 56713727820156410577229101238628036241
ip14 - East Coast - m1.xlarge / us-east-1c - Size: 20 GB -
Token: 85070591730234615865843651857942052863
ip15 - East Coast - m1.xlarge / us-east-1d - Size: 89 GB -
Token: 113427455640312821154458202477256070484
ip23 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 56 GB -
Token: 113427455640312821154458202477256071484
ip16 - East Coast - m1.xlarge / us-east-1d - Size: 25 GB -
Token: 141784319550391026443072753096570088105

Regarding the alternating-racks solution, I have the following questions:

1) By alternating racks, do you mean alternating racks between all the
nodes in a single DC vs. across multiple DCs? AWS East Coast has 4 AZs
and Singapore has 2 AZs. So is the final solution something like this:
ip11 - East Coast - m1.xlarge / us-east-1b - Token: 0
ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Token: 1001
ip12 - East Coast - m1.xlarge / us-east-*1c* -
Token: 28356863910078205288614550619314017621
ip13 - East Coast - m1.xlarge / us-east-*1d* -
Token: 56713727820156410577229101238628035241
ip22 - Singapore  - m1.xlarge / ap-southeast-1b -
Token: 56713727820156410577229101238628036241
ip14 - East Coast - m1.xlarge / us-east-*1a* -
Token: 85070591730234615865843651857942052863
ip15 - East Coast - m1.xlarge / us-east-*1b* -
Token: 113427455640312821154458202477256070484
ip23 - Singapore  - m1.xlarge / ap-southeast-*1a* -
Token: 113427455640312821154458202477256071484
ip16 - East Coast - m1.xlarge / us-east-*1c* -
Token: 141784319550391026443072753096570088105

Is this what you had suggested?

2) How does dynamic_snitch_badness_threshold: 0.1 affect the CPU load? On
the node ( ip11 ) which was high CPU ( system load > 30 ), I checked the
attribute score ( via JMX
bean org.apache.cassandra.db:type=DynamicEndpointSnitch ) and saw the
following:
EastCoast:
*ip11 = 1.6813321647677475*
ip12 = 1.0003505696757231
ip13 = 1.1324160525509974
ip14 = 1.000350569675723
ip15 = 1.0007011393514456
ip16 = 1.0005258545135842
Singapore:
ip21 = 1.095880806310253
ip22 = 1.4101
ip23 = 1.0953549517966696

So the ip11 node does indeed have a higher score, but I am not sure why
traffic is still going to that replica as opposed to some other node.

Thanks!



On Fri, Nov 1, 2013 at 3:13 PM, Ashish Tyagi  wrote:

> Hi Evan,
>
> The clients connect to all nodes. We tried shutting the thrift server on
> the affected node. Loads did not come down.
>
>
>
> On Fri, Nov 1, 2013 at 12:59 AM, Evan Weaver  wrote:
>
>> Are all your clients only connecting to your first node? I would
>> probably strace it and compare the trace to one from a lightly loaded
>> node.
>>
>> On Thu, Oct 31, 2013 at 7:12 PM, Ashish Tyagi 
>> wrote:
>> > We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in
>> the
>> > other. All machines are Amazon M1.XLarge configuration.
>> >
>> > Datacenter: DC1
>> > ==
>> > Address RackStatus State   LoadOwns
>> > Token
>> >
>> > ip11  1b  Up Normal  76.46 GB16.67%  0
>> > ip12  1b  Up Normal  44.66 GB16.67%
>> > 28356863910078205288614550619314017621
>> > ip13  1c  Up Normal  85.94 GB16.67%
>> > 56713727820156410577229101238628035241
>> > ip14  1c  Up Normal  17.55 GB16.67%
>> > 85070591730234615865843651857942052863
>> > ip15  1d  Up Normal  80.74 GB16.67%
>> > 113427455640312821154458202477256070484
>> > ip16  1d  Up Normal  20.88 GB16.67%
>> > 141784319550391026443072753096570088105
>> >
>> > Datacenter: DC2
>> > ==
>> > Address RackStatus State   LoadOwns
>> > Token
>> >
>> > ip21  1a  Up Normal  78.32 GB0.00%
>> 1001
>> > ip22  1b  Up Normal  71.23 GB0.00%
>> > 56713727820156410577229101238628036241
>> > ip23  1b  Up Normal  53.49 GB0.00%
>> > 113427455640312821154458202477256071484
>> >
>> > Problem is that node with ip address: ip11 often has 5-10 times more
>> load
>> > than any other node. Most of the operations are on counters. The primary
>> > column family (which receives most writes) has a replication factor of
>> 2 in
>> > DataCenter DC1 and also in DataCenter DC2. T

Re: High loads only on one node in the cluster

2013-11-01 Thread Rakesh Rajan
Forgot to mention: all 9 nodes are on Cassandra 1.2.9. Also, tpstats on the
high-CPU node indicates:


Pool Name                    Active   Pending     Completed   Blocked  All time blocked
ReadStage                        32      6600    3420385815         0                 0
RequestResponseStage              0         0    2094235864         0                 0
MutationStage                     0         0    3102461222         0                 0
ReadRepairStage                   0         0        438089         0                 0
*ReplicateOnWriteStage            0         0     253180440         0          23703996*
GossipStage                       0         0       5917301         0                 0
AntiEntropyStage                  0         0          1486         0                 0
MigrationStage                    0         0           143         0                 0
MemtablePostFlusher               0         0         39070         0                 0
FlushWriter                       0         0          7452         0               927
MiscStage                         0         0           257         0                 0
commitlog_archiver                0         0             0         0                 0
AntiEntropySessions               0         0             1         0                 0
InternalResponseStage             0         0            62         0                 0
HintedHandoff                     0         0          1961         0                 0

Message type      Dropped
RANGE_SLICE          1681
READ_REPAIR          3921
BINARY                  0
READ              4103953
MUTATION          2651071
_TRACE                  0
REQUEST_RESPONSE     3229


On Fri, Nov 1, 2013 at 3:37 PM, Rakesh Rajan  wrote:

> @Tyler / @Rob,
>
> As Ashish mentioned earlier, we have 9 nodes on AWS - 6 on EastCoast and 3
> on Singapore. All 9 nodes uses EC2Snitch. The current ring ( across all
> nodes in 2 DC ) looks like this:
>
> ip11 - East Coast - m1.xlarge / us-east-1b - Size: 83 GB - Token:
> 0
> ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Size: 88 GB - Token:
> 1001
> ip12 - East Coast - m1.xlarge / us-east-1b - Size: 45 GB -
> Token: 28356863910078205288614550619314017621
> ip13 - East Coast - m1.xlarge / us-east-1c - Size: 93 GB -
> Token: 56713727820156410577229101238628035241
> ip22 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 73 GB -
> Token: 56713727820156410577229101238628036241
> ip14 - East Coast - m1.xlarge / us-east-1c - Size: 20 GB -
> Token: 85070591730234615865843651857942052863
> ip15 - East Coast - m1.xlarge / us-east-1d - Size: 89 GB -
> Token: 113427455640312821154458202477256070484
> ip23 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 56 GB -
> Token: 113427455640312821154458202477256071484
> ip16 - East Coast - m1.xlarge / us-east-1d - Size: 25 GB -
> Token: 141784319550391026443072753096570088105
>
> Regarding alternating racks solution, I've the following queries:
>
> 1) By alternating racks, do you mean to alternate racks between all nodes
> in a single DC v/s multiple DCs? AWS EastCoast has 4 AZs
> and Singapore has 2 AZs. So is the final solution something like this:
> ip11 - East Coast - m1.xlarge / us-east-1b - Token: 0
> ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Token: 1001
> ip12 - East Coast - m1.xlarge / us-east-*1c* -
> Token: 28356863910078205288614550619314017621
> ip13 - East Coast - m1.xlarge / us-east-*1d* -
> Token: 56713727820156410577229101238628035241
> ip22 - Singapore  - m1.xlarge / ap-southeast-1b -
> Token: 56713727820156410577229101238628036241
> ip14 - East Coast - m1.xlarge / us-east-*1a* -
> Token: 85070591730234615865843651857942052863
> ip15 - East Coast - m1.xlarge / us-east-*1b* -
> Token: 113427455640312821154458202477256070484
> ip23 - Singapore  - m1.xlarge / ap-southeast-*1a* -
> Token: 113427455640312821154458202477256071484
> ip16 - East Coast - m1.xlarge / us-east-*1c* -
> Token: 141784319550391026443072753096570088105
>
> Is this what you had suggested?
>
>  2) How does dynamic_snitch_badness_threshold: 0.1 effect the CPU load? On
> the node ( ip11 ) which was high CPU ( system load > 30 ), I checked the
> attribute score ( via JMX
> bean org.apache.cassandra.db:type=DynamicEndpointSnitch ) and saw the
> following:
> EastCoast:
> *ip11 = 1.6813321647677475*
> ip12 = 1.0003505696757231
> ip13 = 1.1324160525509974
> ip14 = 1.000350569675723
> ip15 = 1.0007011393514456
> ip16 = 1.0005258545135842
> Singapore:
> ip21 = 1.095880806310253
> ip22 = 1.4101
> ip23 = 1.0953549517966696
>
> So ip11 node is indeed having higher score - but not sure why traffic is
> still going to that replica as opposed to some other node?
>
>

Re: Not able to form a Cassandra cluster of two nodes in Windows?

2013-11-01 Thread Aaron Mintz
One issue I ran into that produced similar symptoms: if you have
internode_compression turned on without the proper snappy library available
for your architecture (I had 64-bit Linux), startup will fail to link the
nodes. The failure is also silent unless you set a certain class's logging
level to DEBUG; it basically presented as if each node formed its own
single-machine ring.
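To rule that in or out as a diagnostic sketch: in Cassandra 1.2, internode compression is controlled by the internode_compression setting in cassandra.yaml (valid values are all, dc, and none; the default compresses all internode traffic). Temporarily disabling it on both Windows nodes would show whether the snappy linkage is the problem:

```yaml
# cassandra.yaml (on both nodes) - "all" (the default) requires a
# working native snappy library for your platform; "none" avoids it.
internode_compression: none
```

Restart the Cassandra service on both machines after the change; if the nodes then see each other, the snappy library was the culprit.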


On Fri, Nov 1, 2013 at 3:52 AM, Techy Teck  wrote:

> I am trying to setup two nodes of Cassandra cluster on my windows machine.
> Basically, I have two windows machine. In both of my machine, I have
> installed Cassandra 1.2.11 from Datastax. Now I was following this
> [tutorial](
> http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows)
> to setup two node Cassandra Cluster.
>
> After installing Cassandra into those two machines, I stopped the services
> for the Cassandra server, DataStax OpsCenter, and the DataStax OpsCenter
> agent in those two machines..
>
> And then I started making changes in the yaml file -
>
> My First Node details are -
>
> initial_token: 0
> seeds: "10.0.0.4"
> listen_address: 10.0.0.4   #IP of Machine - A (Wireless LAN adapter
> Wireless Network Connection)
> rpc_address: 10.0.0.4
>
> My Second Node details are -
>
> initial_token: 0
> seeds: "10.0.0.4"
> listen_address: 10.0.0.7   #IP of Machine - B (Wireless LAN adapter
> Wireless Network Connection)
> rpc_address: 10.0.0.7
>
> Both of my serves gets started up properly after I start the services for
> server. But they are not forming a cluster of two nodes somehow? Is there
> anything I am missing here?
>
> Machine-A Nodetool Information-
>
> Datacenter: datacenter1
> ==
> Replicas: 1
>
> Address   RackStatus State   Load
> OwnsToken
>
>
> 10.0.0.4  rack1   Up Normal  212.1 KB
> 100.00% 5264744098649860606
>
> Machine-B Nodetool Information-
>
> Starting NodeTool
>
> Datacenter: datacenter1
> ==
> Replicas: 1
>
> Address   RackStatus State   Load
> OwnsToken
>
>
> 10.0.0.7  rack1   Up Normal  68.46 KB
> 100.00% 407804996740764696
>
>
>


Re: Check out if Cassandra ready

2013-11-01 Thread Tom van den Berge
I recommend using CassandraUnit (https://github.com/jsevellec/cassandra-unit).
It makes using Cassandra in unit tests quite easy.

It allows you to start an embedded Cassandra synchronously with a single
simple method call, optionally load your schema and initial data, and
you're ready to start testing.

I'm using it in many unit tests (although formally it's not a unit test
anymore when relying on a cassandra node). The fantastic performance of
Cassandra even allows me to clear all column families and insert the test
fixture rows for each individual test case.
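If pulling in CassandraUnit is not an option, another common readiness check is simply polling the RPC port until it accepts TCP connections (port 9160, the Thrift default in 1.2, is an assumption here). A minimal sketch in Python; the same loop translates directly to Java's java.net.Socket:

```python
import socket
import time

def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connect to (host, port) succeeds; False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

# e.g. wait_for_port("127.0.0.1", 9160) before running the test methods
```

Note that an open port only proves the server is listening, not that the schema is loaded; CassandraUnit's synchronous start is the more robust option.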

Good luck,
Tom



On Fri, Nov 1, 2013 at 10:00 AM, Salih Kardan  wrote:

> Hi all,
>
> I am a newbie to Cassandra and I am tring to write test cases to cassandra
> with JUnit.
> I use CassandraDaemon class to start cassandra in IntelliJ IDEA. I want to
> wait
> until Cassandra up and running before runnig test methods. How can I wait
> until cassandra starts or
> is there any way to check if cassandra is running (with Java)?
>
> Thanks.
> Salih Kardan
>



-- 

Drillster BV
Middenburcht 136
3452MT Vleuten
Netherlands

+31 30 755 5330

Open your free account at www.drillster.com


Re: How to generate tokens for my two node Cassandra cluster?

2013-11-01 Thread Peter Sanford
I can't tell you why that one-liner isn't working, but you can try
http://www.cassandraring.com for generating balanced tokens.


On Thu, Oct 31, 2013 at 11:59 PM, Techy Teck wrote:

> I am trying to setup two node Cassandra Cluster on windows machine. I have
> basically two windows machine and I was following this datastax tutorial (
> http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows
> )
>
> Whenever I use the below command to get the token number from the above
> tutorial -
>
> python -c "num=2; print ""\n"".join([(""token %d: %d""
> %(i,(i*(2**127)/num))) for i in range(0,num)])"
>
>
> I always get this error -
>
> C:\Users\username>python -c "num=2; print ""\n"".join([(""token %d:
> %d"" %(i,(i*(2**127)/num))) for i
> in range(0,num)])"
>   File "", line 1
> num=2; print "\n".join([("token %d: %d" %(i,(i*(2**127)/num))) for
> i in range(0,num)])
> ^
> SyntaxError: invalid syntax
>
>
>


Re: How to generate tokens for my two node Cassandra cluster?

2013-11-01 Thread Ray Sutton
Your quotes need to be escaped:
python -c "num=2; print \"\n\".join([(\"token %d: %d\"
%(i,(i*(2**127)/num))) for i in range(0,num)])"
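For anyone still fighting cmd.exe quoting, the same computation can be kept in a small .py file instead. This is a sketch equivalent to the tutorial's RandomPartitioner one-liner, not an official DataStax tool:

```python
# generate_tokens.py - evenly spaced initial_token values for
# RandomPartitioner, whose token space is 0 .. 2**127.
# Same arithmetic as the one-liner, but no shell quoting involved.

def generate_tokens(num_nodes):
    """Return one evenly spaced token per node."""
    return [i * 2**127 // num_nodes for i in range(num_nodes)]

if __name__ == "__main__":
    for i, token in enumerate(generate_tokens(2)):
        print("token %d: %d" % (i, token))
```

Run it with `python generate_tokens.py`, then paste each value into the corresponding node's initial_token.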


--
Ray  //o-o\\



On Fri, Nov 1, 2013 at 10:36 AM, Peter Sanford
wrote:

> I can't tell you why that one-liner isn't working, but you can try
> http://www.cassandraring.com for generating balanced tokens.
>
>
> On Thu, Oct 31, 2013 at 11:59 PM, Techy Teck wrote:
>
>> I am trying to setup two node Cassandra Cluster on windows machine. I
>> have basically two windows machine and I was following this datastax
>> tutorial (
>> http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows
>> )
>>
>> Whenever I use the below command to get the token number from the above
>> tutorial -
>>
>> python -c "num=2; print ""\n"".join([(""token %d: %d""
>> %(i,(i*(2**127)/num))) for i in range(0,num)])"
>>
>>
>> I always get this error -
>>
>> C:\Users\username>python -c "num=2; print ""\n"".join([(""token %d:
>> %d"" %(i,(i*(2**127)/num))) for i
>> in range(0,num)])"
>>   File "", line 1
>> num=2; print "\n".join([("token %d: %d" %(i,(i*(2**127)/num)))
>> for i in range(0,num)])
>> ^
>> SyntaxError: invalid syntax
>>
>>
>>
>


Cassandra remove column using thrift

2013-11-01 Thread Suruchi Deodhar
Hello folks,

I have a couple of questions regarding deleting columns from Cassandra
using Thrift.
I am trying to remove a column using the Thrift API call remove(), defined
as below:

void remove(1:required binary key,
  2:required ColumnPath column_path,
  3:required i64 timestamp,
  4:ConsistencyLevel consistency_level=ConsistencyLevel.ONE)
   throws (1:InvalidRequestException ire, 2:UnavailableException ue,
3:TimedOutException te),

I provide the timestamp as the current time and
consistency_level=ConsistencyLevel.ALL.

My questions regarding this are:
1. Is there a log where I can check whether the remove command was
registered successfully on the Cassandra nodes?
(In my case, the Thrift call executed successfully, but my queries still
show data in the deleted column. I don't see anything in system.log.)

2. Given that consistency_level=ConsistencyLevel.ALL, how long does
Cassandra take to commit the delete, and when will a Cassandra client see
the updated data?

Thanks,
Suruchi
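One frequent cause of a remove() that "succeeds" yet leaves the data visible is a delete timestamp lower than the column's write timestamp: Cassandra reconciles writes and tombstones purely by timestamp, and clients conventionally use microseconds since the epoch. A toy model of that rule follows; it is an illustration, not the server's actual code:

```python
import time

def reconcile(cells):
    """Return the surviving cell for one column.

    Each cell is a (timestamp, value) pair; value None marks a tombstone
    (a delete). The highest timestamp wins, mirroring Cassandra's
    last-write-wins reconciliation.
    """
    return max(cells, key=lambda c: c[0])

def microsecond_timestamp():
    """Client-side timestamp at the conventional microsecond resolution."""
    return int(time.time() * 1_000_000)

# A delete issued with a too-low timestamp loses to the earlier write:
write = (1500, "v1")
stale_delete = (1000, None)
fresh_delete = (2000, None)
```

So if one client writes with microsecond timestamps and another deletes with millisecond or second timestamps, the delete silently loses, which matches the symptom described above.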


Re: High loads only on one node in the cluster

2013-11-01 Thread Tyler Hobbs
On Fri, Nov 1, 2013 at 5:07 AM, Rakesh Rajan  wrote:

>
> 1) By alternating racks, do you mean to alternate racks between all nodes
> in a single DC v/s multiple DCs? AWS EastCoast has 4 AZs
> and Singapore has 2 AZs. So is the final solution something like this:
> ip11 - East Coast - m1.xlarge / us-east-1b - Token: 0
> ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Token: 1001
> ip12 - East Coast - m1.xlarge / us-east-*1c* -
> Token: 28356863910078205288614550619314017621
> ip13 - East Coast - m1.xlarge / us-east-*1d* -
> Token: 56713727820156410577229101238628035241
> ip22 - Singapore  - m1.xlarge / ap-southeast-1b -
> Token: 56713727820156410577229101238628036241
> ip14 - East Coast - m1.xlarge / us-east-*1a* -
> Token: 85070591730234615865843651857942052863
> ip15 - East Coast - m1.xlarge / us-east-*1b* -
> Token: 113427455640312821154458202477256070484
> ip23 - Singapore  - m1.xlarge / ap-southeast-*1a* -
> Token: 113427455640312821154458202477256071484
> ip16 - East Coast - m1.xlarge / us-east-*1c* -
> Token: 141784319550391026443072753096570088105
>
> Is this what you had suggested?
>

That would be more balanced than your current setup, but it would still be
unbalanced, especially the ap-southeast DC.  To have a perfectly balanced
cluster with multiple racks, you need to a) have the same number of nodes
on each rack, and b) alternate racks within each DC.  Your new layout would
meet requirement (b), but not (a).  This is why I suggest using the same
rack for all nodes.
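To make the "evenly spaced tokens with a per-DC offset" scheme concrete: with RandomPartitioner, each DC's tokens are spaced evenly over the 2**127 ring, and the second DC shifts every token by a small constant (the ring above uses an offset of about 1000 for Singapore) so no two nodes share a token. A sketch of the arithmetic, assuming the resulting values are set as each node's initial_token:

```python
RING = 2**127  # RandomPartitioner token space

def dc_tokens(num_nodes, dc_offset=0):
    """Evenly spaced tokens for one DC, shifted by a per-DC offset."""
    return [i * RING // num_nodes + dc_offset for i in range(num_nodes)]

# 6-node East Coast DC (offset 0) and 3-node Singapore DC (offset 1001)
east = dc_tokens(6, dc_offset=0)
sg = dc_tokens(3, dc_offset=1001)
```

The rack assignment is a separate concern from the tokens: balance also requires the same node count per rack, as noted above.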


-- 
Tyler Hobbs
DataStax 


Re: Cassandra 1.1.6 - New node bootstrap not completing

2013-11-01 Thread Narendra Sharma
I was successfully able to bootstrap the node. The issue was RF > 2. Thanks
again Robert.


On Wed, Oct 30, 2013 at 10:29 AM, Narendra Sharma  wrote:

> Thanks Robert.
>
> I didn't realize that some of the keyspaces (not all and esp. the biggest
> one I was focusing on) had RF > 2. I wasted 3 days on it. Thanks again for
> the pointers. I will try again and share the results.
>
>
> On Wed, Oct 30, 2013 at 12:28 AM, Robert Coli wrote:
>
>> On Tue, Oct 29, 2013 at 11:45 AM, Narendra Sharma <
>> narendra.sha...@gmail.com> wrote:
>>
>>> We had a cluster of 4 nodes in AWS. The average load on each node was
>>> approx 750GB. We added 4 new nodes. It is now more than 30 hours and the
>>> node is still in JOINING mode.
>>> Specifically I am analyzing the one with IP 10.3.1.29. There is no
>>> compaction or streaming or index building happening.
>>>
>>
>> If your cluster has RF>2, you are bootstrapping two nodes into the same
>> range simultaneously. That is not supported. [1,2] The node you are having
>> the problem with is in the range that is probably overlapping.
>>
>> If I were you I would :
>>
>> 1) stop all "Joining" nodes and wipe their state including system keyspace
>> 2) optionally "removetoken" any nodes which remain in cluster gossip
>> state after stopping
>> 3) re-start/bootstrap them one at a time, waiting for each to complete
>> bootstrapping before starting the next  one
>> 4) (unrelated) Upgrade from 1.1.6 to the head of 1.1.x ASAP.
>>
>> =Rob
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-2434
>> [2]
>> https://issues.apache.org/jira/browse/CASSANDRA-2434?focusedCommentId=13091851&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13091851
>>
>
>
>
> --
> Narendra Sharma
> Software Engineer
> *http://www.aeris.com*
> *http://narendrasharma.blogspot.com/*
>
>


-- 
Narendra Sharma
Software Engineer
*http://www.aeris.com*
*http://narendrasharma.blogspot.com/*


Re: Not able to form a Cassandra cluster of two nodes in Windows?

2013-11-01 Thread Techy Teck
In my case, both of my laptops are running Windows 7 64-bit. Not sure
what the problem is...


On Fri, Nov 1, 2013 at 4:48 AM, Aaron Mintz  wrote:

> One issue I ran into that produced similar symptoms: if you have
> internode_compression turned on without the proper snappy library available
> for your architecture (i had 64-bit linux), starting up will fail to link
> the nodes. It'll also be silent unless you set a certain class logging
> level to DEBUG, but it basically presented as if nodes would each form
> their own single-machine ring
>
>
> On Fri, Nov 1, 2013 at 3:52 AM, Techy Teck wrote:
>
>> I am trying to setup two nodes of Cassandra cluster on my windows
>> machine. Basically, I have two windows machine. In both of my machine, I
>> have installed Cassandra 1.2.11 from Datastax. Now I was following this
>> [tutorial](
>> http://www.datastax.com/2012/01/how-to-setup-and-monitor-a-multi-node-cassandra-cluster-on-windows)
>> to setup two node Cassandra Cluster.
>>
>> After installing Cassandra into those two machines, I stopped the
>> services for the Cassandra server, DataStax OpsCenter, and the DataStax
>> OpsCenter agent in those two machines..
>>
>> And then I started making changes in the yaml file -
>>
>> My First Node details are -
>>
>> initial_token: 0
>> seeds: "10.0.0.4"
>> listen_address: 10.0.0.4   #IP of Machine - A (Wireless LAN adapter
>> Wireless Network Connection)
>> rpc_address: 10.0.0.4
>>
>> My Second Node details are -
>>
>> initial_token: 0
>> seeds: "10.0.0.4"
>> listen_address: 10.0.0.7   #IP of Machine - B (Wireless LAN adapter
>> Wireless Network Connection)
>> rpc_address: 10.0.0.7
>>
>> Both of my serves gets started up properly after I start the services for
>> server. But they are not forming a cluster of two nodes somehow? Is there
>> anything I am missing here?
>>
>> Machine-A Nodetool Information-
>>
>> Datacenter: datacenter1
>> ==
>> Replicas: 1
>>
>> Address   RackStatus State   Load
>> OwnsToken
>>
>>
>> 10.0.0.4  rack1   Up Normal  212.1 KB
>> 100.00% 5264744098649860606
>>
>> Machine-B Nodetool Information-
>>
>> Starting NodeTool
>>
>> Datacenter: datacenter1
>> ==
>> Replicas: 1
>>
>> Address   RackStatus State   Load
>> OwnsToken
>>
>>
>> 10.0.0.7  rack1   Up Normal  68.46 KB
>> 100.00% 407804996740764696
>>
>>
>>
>


Re: High loads only on one node in the cluster

2013-11-01 Thread Rakesh Rajan
Tyler,

Thanks for the explanation. The objective is not to have perfectly
balanced US-East and SG DCs; the SG DC is just a backup cluster and hence
has fewer nodes than the US-East cluster. What we are trying to figure out
is the imbalance between the 6 nodes within US-East itself. I'll try to
move the 6 US-East nodes to proper racks and check.

In addition, as I mentioned earlier, do you see any issues with the dynamic
snitch attribute scores? I see that the node has a high score, but what
value of dynamic_snitch_badness_threshold should I set so that other
replicas get the traffic? (That node has a >50% higher score than all the
other nodes.)


On Fri, Nov 1, 2013 at 10:04 PM, Tyler Hobbs  wrote:

>
> On Fri, Nov 1, 2013 at 5:07 AM, Rakesh Rajan  wrote:
>
>>
>> 1) By alternating racks, do you mean to alternate racks between all nodes
>> in a single DC v/s multiple DCs? AWS EastCoast has 4 AZs
>> and Singapore has 2 AZs. So is the final solution something like this:
>> ip11 - East Coast - m1.xlarge / us-east-1b - Token: 0
>> ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Token: 1001
>> ip12 - East Coast - m1.xlarge / us-east-*1c* -
>> Token: 28356863910078205288614550619314017621
>> ip13 - East Coast - m1.xlarge / us-east-*1d* -
>> Token: 56713727820156410577229101238628035241
>> ip22 - Singapore  - m1.xlarge / ap-southeast-1b -
>> Token: 56713727820156410577229101238628036241
>> ip14 - East Coast - m1.xlarge / us-east-*1a* -
>> Token: 85070591730234615865843651857942052863
>> ip15 - East Coast - m1.xlarge / us-east-*1b* -
>> Token: 113427455640312821154458202477256070484
>> ip23 - Singapore  - m1.xlarge / ap-southeast-*1a* -
>> Token: 113427455640312821154458202477256071484
>> ip16 - East Coast - m1.xlarge / us-east-*1c* -
>> Token: 141784319550391026443072753096570088105
>>
>> Is this what you had suggested?
>>
>
> That would be more balanced than your current setup, but it would still be
> unbalanced, especially the ap-southeast DC.  To have a perfectly balanced
> cluster with multiple racks, you need to a) have the same number of nodes
> on each rack, and b) alternate racks within each DC.  Your new layout would
> meet requirement (b), but not (a).  This is why I suggest using the same
> rack for all nodes.
>
>
> --
> Tyler Hobbs
> DataStax 
>


Re: IllegalStateException when bootstrapping new nodes

2013-11-01 Thread Cyril Scetbon
We now have issues with hinted handoff threads (2 active), which cause high
CPU load and a lot of garbage collection, but only on our new nodes. You
can see that we have a lot of hints on our new nodes (the last 4):
node001
2.2M/cassandra/data/system/hints/
node002
1.6M/cassandra/data/system/hints/
node003
1.8M/cassandra/data/system/hints/
node004
1.9M/cassandra/data/system/hints/
node005
1.9M/cassandra/data/system/hints/
node006
2.0M/cassandra/data/system/hints/
node007
1.9M/cassandra/data/system/hints/
node008
1.8M/cassandra/data/system/hints/
node009
34G /cassandra/data/system/hints/
node011
43G /cassandra/data/system/hints/
node013
43G /cassandra/data/system/hints/
node015
16G /cassandra/data/system/hints/

I tried pausing and resuming hinted handoff, which made the CPU return to a
normal load, but the hints are still there and the CPU seems to get high
again later with the same consequences (a lot of garbage collection), which
again seems to be related to hints.

Any ideas?
-- 
Cyril SCETBON

On 31 Oct 2013, at 23:35, Cyril Scetbon  wrote:

> No we're using vnodes
> -- 
> Cyril SCETBON
> 
> On 30 Oct 2013, at 20:25, Robert Coli  wrote:
> 
>> On Wed, Oct 30, 2013 at 12:22 PM, Cyril Scetbon  
>> wrote:
>> FYI, we should upgrade to the last 1.2 version (1.2.11+) in January 2014. 
>> However, we would like to know if it's a known fixed bug or inform you about 
>> this issue if it's not.
>> 
>> Did you bootstrap multiple nodes into the same token range? That is 
>> unsupported...
>> 
>> What does "nodetool ring" say?
>> 
>> =Rob
>>  
> 



Re: Cassandra 1.1.6 - New node bootstrap not completing

2013-11-01 Thread Robert Coli
On Fri, Nov 1, 2013 at 9:36 AM, Narendra Sharma
wrote:

> I was successfully able to bootstrap the node. The issue was RF > 2.
> Thanks again Robert.
>

For the record, I'm not entirely clear why bootstrapping two nodes into the
same range should have caused your specific bootstrap problem, but I am
glad to hear that bootstrapping one node at a time was a usable workaround.

=Rob


Frustration with "repair" process in 1.1.11

2013-11-01 Thread Oleg Dulin

First I need to vent.


One of my Cassandra clusters is a dual data center setup, with DC1
acting as primary and DC2 acting as a hot backup.


Well, guess what? I am pretty sure that it falls behind on
replication. So I am told I need to run repair.


I ran repair (with -pr) on DC2. The first time I ran it, it got *stuck*
(i.e. frozen) within the first 30 seconds, with no error or message of
any sort. I then ran it again -- and it completed in seconds on each
node, with about 50 gigs of data on each.


That seems suspicious, so I do some research.

I am told on IRC that running repair -pr will only repair "100" tokens 
(the offset from DC1 to DC2)… Seriously???


The repair process is, indeed, a joke: 
https://issues.apache.org/jira/browse/CASSANDRA-5396 . Repair is the 
worst thing you can do to your cluster: it consumes enormous resources 
and can leave your cluster in an inconsistent state. Oh, and by the way, 
you must run it every week… Whoever invented that process must not 
live in the real world, with real applications.



No… let's have a constructive conversation.

How do I know, with certainty, that my DC2 cluster is up to date on 
replication? I have a few options:


1) I set read_repair_chance to 100% on critical column families and 
write a tool to scan every CF, every column of every row. This strikes 
me as very silly. 
Q1: Do I need to read every column, or is reading one column enough 
to trigger a read repair?


2) Can someone explain to me how repair works, so that I can run it 
without totally trashing my cluster or spilling into the work week?
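On option (2), a common way to bound repair's impact is to run it on one node at a time and one column family at a time, rather than cluster-wide; a rough sketch (host, keyspace, and CF names are placeholders, and the exact nodetool flags vary by version):

```shell
# Run on each node in turn, waiting for each repair to finish before the next.
# -pr repairs only the node's primary range, so every node must be covered.
for host in node1 node2 node3; do
  nodetool -h "$host" repair -pr MyKeyspace MyColumnFamily
done
```

Staggering like this keeps validation compaction and streaming load confined to one node's range at a time.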


Is there any improvement and clarity in 1.2? How about 2.0?



--
Regards,
Oleg Dulin
http://www.olegdulin.com

Recompacting all sstables

2013-11-01 Thread Jiri Horky
Hi all

since we upgraded half of our Cassandra cluster to 2.0.0, and we use LCS, 
we hit the CASSANDRA-6284 bug. So basically all data in sstables created 
after the upgrade is wrongly (non-uniformly within compaction levels) 
distributed. This causes a huge overhead when compacting new sstables 
(see the bug for details).

After applying the patch, the distribution of the data within a level is 
supposed to recover itself over time, but we would rather not wait a 
month or so until it gets better.

So, the question: what is the best way to recompact all the sstables so 
that each sstable within a level contains more or less the right portion 
of the data? In other words, keys would be uniformly distributed across 
the sstables within a level (e.g., assuming a total token range for a 
node of 1..10,000, and given that L2 should contain 100 sstables, each 
sstable within L2 should cover a range of ~100 tokens).

Based on the documentation, I can only think of switching to SizeTiered 
compaction, doing a major compaction, and then switching back to LCS.
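A sketch of that round trip for a CQL3 table on 2.0 (keyspace/table names and the LCS sstable size are placeholders; try it on one node first):

```shell
# Switch to STCS, force a major compaction, then switch back to LCS.
cqlsh <<'EOF'
ALTER TABLE myks.mytable
  WITH compaction = {'class': 'SizeTieredCompactionStrategy'};
EOF
nodetool compact myks mytable
cqlsh <<'EOF'
ALTER TABLE myks.mytable
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'sstable_size_in_mb': 160};
EOF
```

The major compaction leaves one big sstable, which LCS then has to re-level; see the caveat about CASSANDRA-6092 in the reply below from the list.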

Thanks in advance
Jiri Horky


Re: Recompacting all sstables

2013-11-01 Thread Robert Coli
On Fri, Nov 1, 2013 at 12:47 PM, Jiri Horky  wrote:

> since we upgraded half of our Cassandra cluster to 2.0.0 and we use LCS,
> we hit CASSANDRA-6284 bug.


1) Why upgrade a cluster to 2.0.0? Hopefully not a production cluster? [1]

2) CASSANDRA-6284 is ouch, thx for filing and patching!

3) What do you mean by "upgraded half of our Cassandra cluster"? That is 
Not Supported and also Not Advised... for example, before the streaming 
change in the 2.x line, a cluster in such a state may be unable to have 
nodes added, removed, or replaced.

So the question. What is the best way to recompact all the sstables so
> the data in one sstables within a level would contain more or less the
> right portion of the data
>
...

> Based on documentation, I can only think of switching to SizeTiered
> compaction, doing major compaction and then switching back to LCS.
>

That will work, though be aware of the implication of CASSANDRA-6092 [2]. 
Briefly, if the CF in question is not receiving write load, you will be 
unable to promote your One Big SSTable from L0 to L1. In that case, you 
might want to consider running sstablesplit (and then restarting the node) 
in order to split your One Big SSTable into two or more smaller ones.
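A hedged sketch of that split (the path and target size are placeholders; sstablesplit is an offline tool, so stop the node first):

```shell
# Split the One Big SSTable into ~256 MB pieces, then restart the node.
# In the binary distribution the tool lives under tools/bin/.
tools/bin/sstablesplit -s 256 /var/lib/cassandra/data/myks/mycf/*-Data.db
```

After restart, the smaller L0 sstables can be promoted into L1 individually.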

=Rob

[1]
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
[2] https://issues.apache.org/jira/browse/CASSANDRA-6092


Re: Cassandra remove column using thrift

2013-11-01 Thread Robert Coli
On Fri, Nov 1, 2013 at 8:15 AM, Suruchi Deodhar <
suruchi.deod...@generalsentiment.com> wrote:

> I provide the timestamp as the current time and
> consistency_level=ConsistencyLevel.ALL.
>
> My questions wrt this are:
> 1. Is there a log where I can check whether the remove command registered
> successfully with the Cassandra nodes?
> (In my case, the thrift call was successfully executed but my queries
> still show data in the deleted column. I don't see logs in the system.log. )
>

This sounds unexpected. Have you tried deleting the same column via
cassandra-cli or cqlsh (as applicable)? If so, does it work?


> 2. Given that consistency_level=ConsistencyLevel.ALL, how much time does
> cassandra take to commit the delete and cassandra client to get updated
> data?
>

In my understanding, zero time. CL.ALL means all replicas have acknowledged
the write, and therefore the write should be available in the memtables of
all nodes.

=Rob


Re: Cassandra remove column using thrift

2013-11-01 Thread Jayadev Jayaraman
Hey guys,

False alarm, sorry about that. Our column-names are byte-concatenations of
short integers and we had been constructing the column names wrongly before
attempting a delete. We fixed the problem and we've been able to delete the
columns without issue.
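For anyone else who hits this: column names built by concatenating short integers must be packed with the same width and byte order on every path (write, read, and delete), or the delete silently targets a column that does not exist. A minimal illustration in Python using struct (the two-short layout here is just an example, not our actual schema):

```python
import struct

def make_column_name(a, b):
    """Pack two short integers into a 4-byte column name.

    Big-endian (network byte order) -- the format string must be
    identical when constructing the name for a write and for a delete.
    """
    return struct.pack(">hh", a, b)

name = make_column_name(7, 42)
assert len(name) == 4
# Unpacking with the same format round-trips the values.
assert struct.unpack(">hh", name) == (7, 42)
```

A mismatch as small as `">hh"` on write versus `"<hh"` (little-endian) on delete produces a different byte string, which is exactly the kind of bug we had.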


On Fri, Nov 1, 2013 at 4:19 PM, Robert Coli  wrote:

> On Fri, Nov 1, 2013 at 8:15 AM, Suruchi Deodhar <
> suruchi.deod...@generalsentiment.com> wrote:
>
>> I provide the timestamp as the current time and
>> consistency_level=ConsistencyLevel.ALL.
>>
>> My questions wrt this are:
>> 1. Is there a log where I can check whether the remove command registered
>> successfully with the Cassandra nodes?
>> (In my case, the thrift call was successfully executed but my queries
>> still show data in the deleted column. I don't see logs in the system.log. )
>>
>
> This sounds unexpected. Have you tried deleting the same column via
> cassandra-cli or cqlsh (as applicable)? If so, does it work?
>
>
>> 2. Given that consistency_level=ConsistencyLevel.ALL, how much time does
>> cassandra take to commit the delete and cassandra client to get updated
>> data?
>>
>
> In my understanding, zero time. CL.ALL means all replicas have
> acknowledged the write, and therefore the write should be available in the
> memtables of all nodes.
>
> =Rob
>


Strange exception when storing heavy data in cassandra 2.0.0...

2013-11-01 Thread Krishna Chaitanya
Hello,
 I am a newbie to the Cassandra world. I am currently using
Cassandra 2.0.0 with Thrift 0.8.0 to store netflow packets, using the
libQtCassandra library. Currently, I generate about 1000 netflows/sec
and store them in the database. The program crashes with an exception
along the lines of "what(): frame size reads negative value". I re-ran
the program a few times and got the same exception after it ran for a
few seconds. But after a few runs, a new default TException occurred,
and Cassandra now fails even to start up. When I try to kill it and
start it up again, the exception says: "exception encountered during
startup: unfinished compactions reference missing sstables. This should
never happen since compactions are marked finished before we start
removing the old sstables".
 Is this a known issue? It did not occur when we were using
Cassandra 1.2.6 and previous versions with the pycassa library to access
the database. I am just running a single Cassandra node with default
settings. How can I avoid this exception, and is there any way to get my
node back to a running state, even if it means reinstalling Cassandra?
Thank you in advance for any help.
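"Frame size" Thrift errors are often a symptom of a single mutation or batch exceeding the framed-transport limit, or of a framed/unframed transport mismatch between client and server. The relevant cassandra.yaml knobs, as a sketch (the default values shown are from the 1.2/2.0 era; verify against your own yaml):

```yaml
# Maximum Thrift frame size; requests larger than this break the transport.
thrift_framed_transport_size_in_mb: 15
# Present in some releases; if set, it must be larger than the frame size.
# thrift_max_message_length_in_mb: 16
```

Also confirm that libQtCassandra is configured for a framed transport, since Cassandra's server side expects one.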

-- 
Regards,
BNSK.