Re: vnodes ready for production ?

2013-06-18 Thread Alain RODRIGUEZ
Any insights on vnodes, one month after my original post?


2013/5/16 Alain RODRIGUEZ 

> Hi,
>
> Adding vnodes is a big improvement to Cassandra, especially for us, because
> the load on our cluster fluctuates from week to week and it is quite
> annoying to add nodes for a week or two, move tokens, and then have to
> remove them and move tokens again. It would be even better if we could
> automate some of the up-scaling via AWS alarms.
>
> We don't use vnodes yet because OpsCenter did not support this feature and
> because we need our production cluster to stay reliable. Now OpsCenter
> handles vnodes.
>
> Are the vnodes feature and the tokens => vnodes transition safe enough to
> go live with vnodes?
>
> What would be the transition process?
>
> Does anyone auto-scale their Cassandra cluster?
>
> Any advice about vnodes?
>


Re: [Cassandra] Expanding a Cassandra cluster

2013-06-18 Thread Richard Low
On 10 June 2013 22:00, Emalayan Vairavanathan  wrote:

b) Will Cassandra automatically take care of removing
> obsolete keys in future ?
>

In a future version Cassandra should automatically clean up for you:

https://issues.apache.org/jira/browse/CASSANDRA-5051

Right now though you have to run cleanup eventually or the space will never
be reclaimed.
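For example (the keyspace name is illustrative; cleanup rewrites sstables and
is I/O-intensive, so run it on one node at a time):

  nodetool -h localhost cleanup              # all keyspaces
  nodetool -h localhost cleanup MyKeyspace   # a single keyspace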

Richard.


What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread Ananth Gundabattula
We are currently running 1.1.10 and planning to migrate to a newer version,
1.2.4.

The question pertains to tweaking all the knobs to reduce GC-related issues
(we have been fighting a lot of really bad GC problems on 1.1.10 and have met
with little success so far).

Given that GC tuning is a black art, I was wondering if we can have some good
effect on GC by tweaking the following settings:

thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb
Our tables have very short rows (both in number of columns and in data size)
but millions to billions of rows in each column family. The typical number of
columns per column family is 4. A typical lookup specifies the row key and
fetches one column most of the time. Writes are similar, except for one
keyspace where the number of columns is 50 but data sizes per column are very
small.

Assuming we can tweak the config values

thrift_framed_transport_size_in_mb &
thrift_max_message_length_in_mb

to lower values in the above context, I was wondering if it helps GC be
invoked less often when the thrift settings reflect our data model's reads
and writes.

For example: what is the impact on GC of reducing the above config values to,
say, 1 MB rather than 15 or 16?

Thanks a lot for your inputs and thoughts.


Regards,
Ananth


rename a cluster in cassandra 1.2.6

2013-06-18 Thread Paco a.k.a. "Francisco Trujillo"
I am using Cassandra 1.2.6 in a cluster with a single node. I am trying to
rename the cluster using the instructions in:

Cassandra clustername mismatch

After doing all the steps indicated, I still get the same error when I start
Cassandra after changing the cassandra.yaml file.

Does anyone know if it is a problem with Cassandra 1.2.6?

Thanks


Re: rename a cluster in cassandra 1.2.6

2013-06-18 Thread aaron morton
The cluster name is read from the yaml file the first time the server starts
and stored in the system tables, specifically in the local CF in the system
KS.

If this is a test system, just blow away the data for that CF or truncate it.
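If you need to keep the data, a commonly used sketch on 1.2 (the new name
below is illustrative) is to rewrite the stored value before changing the
yaml:

  cqlsh> UPDATE system.local SET cluster_name = 'NewName' WHERE key = 'local';
  $ nodetool flush system
  # then set cluster_name: NewName in cassandra.yaml and restart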

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:30 PM, Paco a.k.a. Francisco Trujillo 
 wrote:

> I am using Cassandra 1.2.6 in a cluster with a single node. I am trying to
> rename the cluster using the instructions in:
>
> Cassandra clustername mismatch
>
> After doing all the steps indicated, I still get the same error when I
> start Cassandra after changing the cassandra.yaml file.
>
> Does anyone know if it is a problem with Cassandra 1.2.6?
>
> Thanks



Re: What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread aaron morton
> thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb
These control the max size of a buffer allocated by Thrift when processing
requests/responses. The buffers are not pre-allocated, but once they are
allocated they are not returned. So it's only an issue if you have lots of
clients connecting and reading a lot of data.

> Our tables have very short rows (both in number of columns and in data
> size) but millions to billions of rows in each column family.
If you have over 500 million rows per node you may be running into issues with 
the bloom filters and index samples. 

This typically looks like heap usage that does not reduce after a CMS
collection has completed.

Ensure bloom_filter_fp_chance on the CFs is set to 0.01 for size-tiered
compaction and 0.1 for levelled compaction. If you need to change it, run
nodetool upgradesstables afterwards.
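For example, on 1.1 (the keyspace and CF names are illustrative):

  # in cassandra-cli:
  update column family MyCF with bloom_filter_fp_chance = 0.01;
  # then rewrite the sstables:
  nodetool upgradesstables MyKeyspace MyCF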

Then consider increasing index_interval in the yaml file; see the comments
there.
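A sketch of that knob (the value is illustrative; the default in this series
is 128, and larger values save heap at some cost to read latency):

  # cassandra.yaml
  index_interval: 512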

Note that v 1.2 moves the bloom filters off heap, so if you upgrade to 1.2 it 
will probably resolve your issues. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:30 PM, Ananth Gundabattula  
wrote:

> We are currently running 1.1.10 and planning to migrate to a newer version,
> 1.2.4.
> 
> The question pertains to tweaking all the knobs to reduce GC-related issues
> (we have been fighting a lot of really bad GC problems on 1.1.10 and have
> met with little success so far).
> 
> Given that GC tuning is a black art, I was wondering if we can have some
> good effect on GC by tweaking the following settings:
> 
> thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb
> 
> Our tables have very short rows (both in number of columns and in data
> size) but millions to billions of rows in each column family. The typical
> number of columns per column family is 4. A typical lookup specifies the
> row key and fetches one column most of the time. Writes are similar, except
> for one keyspace where the number of columns is 50 but data sizes per
> column are very small.
> 
> Assuming we can tweak the config values
> 
> thrift_framed_transport_size_in_mb &
> thrift_max_message_length_in_mb
> 
> to lower values in the above context, I was wondering if it helps GC be
> invoked less often when the thrift settings reflect our data model's reads
> and writes.
> 
> For example: what is the impact on GC of reducing the above config values
> to, say, 1 MB rather than 15 or 16?
> 
> Thanks a lot for your inputs and thoughts.
> 
> 
> Regards,
> Ananth



Re: vnodes ready for production ?

2013-06-18 Thread aaron morton
> It would be even better if we could automate some of the up-scaling via AWS
> alarms.
I saw a demo of Priam (https://github.com/Netflix/Priam) doing that at
Netflix in March; not sure if it's public yet.

> Are the vnodes feature and the tokens => vnodes transition safe enough to
> go live with vnodes?
There have been some issues; search the user list for "shuffle", and as
always, test.
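For the transition itself, the usual sketch on 1.2 is to switch each node to
vnodes in cassandra.yaml and then redistribute the ranges with the shuffle
utility (the value below is illustrative):

  # cassandra.yaml, on each existing node, followed by a rolling restart
  num_tokens: 256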

> Any advice about vnodes?

They are in use out there. It's a sizable change, so it would be a good idea
to build a test system for running shuffle and testing your application.
There have been some issues with repair and range scans (including Hadoop
integration).

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:04 PM, Alain RODRIGUEZ  wrote:

> Any insights on vnodes, one month after my original post?
> 
> 
> 2013/5/16 Alain RODRIGUEZ 
> Hi, 
> 
> Adding vnodes is a big improvement to Cassandra, especially for us, because
> the load on our cluster fluctuates from week to week and it is quite
> annoying to add nodes for a week or two, move tokens, and then have to
> remove them and move tokens again. It would be even better if we could
> automate some of the up-scaling via AWS alarms.
> 
> We don't use vnodes yet because OpsCenter did not support this feature and
> because we need our production cluster to stay reliable. Now OpsCenter
> handles vnodes.
> 
> Are the vnodes feature and the tokens => vnodes transition safe enough to
> go live with vnodes?
> 
> What would be the transition process?
> 
> Does anyone auto-scale their Cassandra cluster?
> 
> Any advice about vnodes?
> 



Re: Reduce Cassandra GC

2013-06-18 Thread Joel Samuelsson
Can't find any promotion failure.

In system.log this is what I get:
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 GCInspector.java (line
122) GC for ParNew: 145189 ms for 1 collections, 225905072 used; max is
4114612224
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 StatusLogger.java (line
57) Pool NameActive   Pending   Blocked
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,491 StatusLogger.java (line
72) ReadStage 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
72) RequestResponseStage  0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
72) ReadRepairStage   0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
72) MutationStage 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
72) ReplicateOnWriteStage 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
72) GossipStage   0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
72) AntiEntropyStage  0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
72) MigrationStage0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
72) StreamStage   0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
72) MemtablePostFlusher   0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
72) FlushWriter   0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
72) MiscStage 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
72) commitlog_archiver0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
72) InternalResponseStage 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
72) HintedHandoff 0 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
77) CompactionManager 0 0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
89) MessagingServicen/a   0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
99) Cache Type SizeCapacityKeysToSave  Provider
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,504 StatusLogger.java (line
100) KeyCache  12129   2184533 all

 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
106) RowCache  0   0   all
org.apache.cassandra.cache.SerializingCacheProvider
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
113) ColumnFamilyMemtable ops,data
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
116) system.NodeIdInfo 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
116) system.IndexInfo  0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
116) system.LocationInfo   0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
116) system.Versions 3,103
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
116) system.schema_keyspacees   0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
116) system.Migrations 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
116) system.schema_columnfamilies 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
116) system.schema_columns 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
116) system.HintsColumnFamily  0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
116) system.Schema 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
116) Keyspace.cf01 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
116) Keyspace.cf02 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
116) Keyspace.cf03 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,508 StatusLogger.java (line
116) Keyspace.cf04 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,508 StatusLogger.java (line
116) Keyspace.cf05 0,0
 INFO [ScheduledTasks:1] 2013-06-17 08:13:47,508 Status

Re: Node failing to decomission (vnodes and 1.2.5)

2013-06-18 Thread aaron morton
> I also am not seeing anything in the nodes log files to suggest errors during 
> streaming or leaving.
You should see a log message saying "DECOMMISSIONED" when the process 
completes. 

What does nodetool status say?

> What suggestions does anyone have on getting this node removed from my ring 
> so I can rebuild it with the correct number of tokens, before I end up with a 
> disk space issue from too many vnodes.
If you really want to get the node out of there, shut it down and run
nodetool removenode on one of the remaining nodes.
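For example (the host ID below is a placeholder; take the real one from the
status output):

  nodetool status                # note the Host ID of the stuck node
  nodetool removenode <host-id>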

Cheers
 

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 2:59 PM, David McNelis  wrote:

> I have a node in my ring (1.2.5) that when it was set up, had the wrong 
> number of vnodes assigned (double the amount it should have had).
> 
> As  a result, and because we can't reduce the number of vnodes on a machine 
> (at least at this point), I need to decommission the node.
> 
> The problem is that we've tried running decommission several times.  In each 
> instance we'll have a lot of streams to other nodes for a period, and then 
> eventually, netstats will tell us:
> 
> nodetool -h localhost netstats
> Mode: LEAVING
>  Nothing streaming to /10.x.x.1
>  Nothing streaming to /10.x.x.2
>  Nothing streaming to /10.x.x.3
> Not receiving any streams.
> Pool NameActive   Pending  Completed
> Commandsn/a 0 955991
> Responses   n/a 02947860
> 
> I also am not seeing anything in the nodes log files to suggest errors during 
> streaming or leaving.
> 
> Then the node will stay in this leaving state for... well, we gave up after 
> several days of no more activity and retried several times.  Each time we 
> "gave up" on it, we restarted the service and it was no longer listed as 
> Leaving, just active.  Even when in a "leaving" state, the size of data on 
> the node continued to grow.
> 
> What suggestions does anyone have on getting this node removed from my ring 
> so I can rebuild it with the correct number of tokens, before I end up with a 
> disk space issue from too many vnodes.



Re: What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread Ananth Gundabattula
Thanks Aaron for the insight.

One quick question:

>The buffers are not pre allocated, but once they are allocated they are
>not returned. So it's only an issue if have lots of clients connecting
>and reading a lot of data.
So, to understand you correctly: the buffer is allocated per client
connection, remains for the lifetime of the JVM, and is reused for each
request?
If that is the case, then I presume there is not much gain in playing around
with this config with respect to optimizing for GCs.

>reduce bloom filters, index intervals ...
Well, we have tried all the configs as advised below (and others, like key
cache sizes, etc.) and hit a dead end, and that is the reason for the 1.2.4
move. Thanks for all your thoughts and advice on this.


Regards,
Ananth 



On 6/18/13 5:56 PM, "aaron morton"  wrote:

>> thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb
>These control the max size of a buffer allocated by Thrift when processing
>requests/responses. The buffers are not pre-allocated, but once they are
>allocated they are not returned. So it's only an issue if you have lots
>of clients connecting and reading a lot of data.
>
>> Our tables have very short rows (both in number of columns and in data
>> size) but millions to billions of rows in each column family.
>If you have over 500 million rows per node you may be running into issues
>with the bloom filters and index samples.
>
>This typically looks like heap usage that does not reduce after a CMS
>collection has completed.
>
>Ensure bloom_filter_fp_chance on the CFs is set to 0.01 for size-tiered
>compaction and 0.1 for levelled compaction. If you need to change it, run
>nodetool upgradesstables afterwards.
>
>Then consider increasing the index_interval in the yaml file, see the
>comments. 
>
>Note that v 1.2 moves the bloom filters off heap, so if you upgrade to
>1.2 it will probably resolve your issues.
>
>Cheers
>
>-
>Aaron Morton
>Freelance Cassandra Consultant
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 18/06/2013, at 7:30 PM, Ananth Gundabattula
> wrote:
>
>> We are currently running 1.1.10 and planning to migrate to a newer
>> version, 1.2.4.
>> 
>> The question pertains to tweaking all the knobs to reduce GC-related
>> issues (we have been fighting a lot of really bad GC problems on 1.1.10
>> and have met with little success so far).
>> 
>> Given that GC tuning is a black art, I was wondering if we can have some
>> good effect on GC by tweaking the following settings:
>> 
>> thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb
>> 
>> Our tables have very short rows (both in number of columns and in data
>> size) but millions to billions of rows in each column family. The typical
>> number of columns per column family is 4. A typical lookup specifies the
>> row key and fetches one column most of the time. Writes are similar,
>> except for one keyspace where the number of columns is 50 but data sizes
>> per column are very small.
>> 
>> Assuming we can tweak the config values
>> 
>> thrift_framed_transport_size_in_mb &
>> thrift_max_message_length_in_mb
>> 
>> to lower values in the above context, I was wondering if it helps GC be
>> invoked less often when the thrift settings reflect our data model's
>> reads and writes.
>> 
>> For example: what is the impact on GC of reducing the above config values
>> to, say, 1 MB rather than 15 or 16?
>> 
>> Thanks a lot for your inputs and thoughts.
>> 
>> 
>> Regards,
>> Ananth
>



"SQL" Injection C* (via CQL & Thrift)

2013-06-18 Thread Brian O'Neill
Mostly for fun, I wanted to throw this out there...

We are undergoing a security audit for our platform (C* + Elastic Search +
Storm).  One component of that audit is susceptibility to SQL injection.  I
was wondering if anyone has attempted to construct a SQL injection attack
against Cassandra?  Is it even possible?

I know the code paths fairly well, but...
Does there exist a path in the code whereby user data gets interpreted,
which could be exploited to perform unintended operations?

From the Thrift side of things, I've always felt safe. Data is opaque.
 Serializers are used to convert it to Bytes, and C* doesn't ever really do
anything with the data.

In examining the CQL java-driver, it looks like there might be a bit more
exposure to injection (or even with CQL over Thrift). I haven't dug into the
code yet, but depending on which flavor of the API you are using, you may
be including user data in your statements.

Does anyone know if the CQL java-driver does anything to protect against
injection?  Or is it possible to say that the syntax is strict enough that
any embedded operations in data would not parse?

just some food for thought...
I'll be digging into this over the next couple weeks.  If people are
interested, I can throw a blog post out there with the findings.

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Data not fully replicated with 2 nodes and replication factor 2

2013-06-18 Thread James Lee
Hello,

I'm seeing a strange problem with a 2-node Cassandra test deployment, where it
seems that data isn't being replicated among the nodes as I would expect. I
suspect this may be a configuration issue of some kind, but have been unable
to figure out what I should change.

The setup is as follows:

* Two Cassandra nodes in the cluster (they each have themselves and the 
other node as seeds in cassandra.yaml).

* Create 40 keyspaces, each with simple replication strategy and 
replication factor 2.

* Populate 125,000 rows into each keyspace, using a pycassa client with 
a connection pool pointed at both nodes (I've verified that pycassa does indeed 
send roughly half the writes to each node).  These are populated with writes 
using consistency level of 1.

* Wait 30 minutes (to give replications a chance to complete).

* Do random reads of the rows in the keyspaces, again using a pycassa 
client with a connection pool pointed at both nodes.  These are read using 
consistency level 1.

I'm finding that the vast majority of reads are successful, but a small 
proportion (~0.1%) are returned as Not Found.  If I manually try to look up 
those keys using cassandra-cli, I see that they are returned when querying one 
of the nodes, but not when querying the other.  So it seems like some of the 
rows have simply not been replicated.

I'm not sure how I can monitor the status of ongoing replications, but the 
system has been idle for many 10s of minutes and the total database size is 
only about 5GB, so I don't think there are any further ongoing operations.

Any suggestions?  In case it's relevant, my setup is:

* Cassandra 1.2.2, running on Linux

* Sun Java 1.7.0_10-b18 64-bit

* Java heap settings: -Xms8192M -Xmx8192M -Xmn2048M

Thank you,
James Lee



Re: Reduce Cassandra GC

2013-06-18 Thread Takenori Sato
GC logging is not in system.log. But in the following file.

JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"


At least, no GC logs are shown in your post.


On Tue, Jun 18, 2013 at 5:05 PM, Joel Samuelsson
wrote:

> Can't find any promotion failure.
>
> In system.log this is what I get:
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 GCInspector.java (line
> 122) GC for ParNew: 145189 ms for 1 collections, 225905072 used; max is
> 4114612224
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 StatusLogger.java (line
> 57) Pool NameActive   Pending   Blocked
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,491 StatusLogger.java (line
> 72) ReadStage 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
> 72) RequestResponseStage  0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
> 72) ReadRepairStage   0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
> 72) MutationStage 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
> 72) ReplicateOnWriteStage 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
> 72) GossipStage   0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
> 72) AntiEntropyStage  0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
> 72) MigrationStage0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
> 72) StreamStage   0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
> 72) MemtablePostFlusher   0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
> 72) FlushWriter   0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
> 72) MiscStage 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
> 72) commitlog_archiver0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
> 72) InternalResponseStage 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
> 72) HintedHandoff 0 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
> 77) CompactionManager 0 0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
> 89) MessagingServicen/a   0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
> 99) Cache Type SizeCapacityKeysToSave  Provider
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,504 StatusLogger.java (line
> 100) KeyCache  12129   2184533 all
>
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
> 106) RowCache  0   0   all
> org.apache.cassandra.cache.SerializingCacheProvider
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
> 113) ColumnFamilyMemtable ops,data
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
> 116) system.NodeIdInfo 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
> 116) system.IndexInfo  0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
> 116) system.LocationInfo   0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
> 116) system.Versions 3,103
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
> 116) system.schema_keyspacees   0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
> 116) system.Migrations 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
> 116) system.schema_columnfamilies 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
> 116) system.schema_columns 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
> 116) system.HintsColumnFamily  0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
> 116) system.Schema 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
> 116) Keyspace.cf01 0,0
>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
> 116) Keyspace.cf02 0,0
>  INFO 

Re: "SQL" Injection C* (via CQL & Thrift)

2013-06-18 Thread Sylvain Lebresne
If you're not careful, then "CQL injection" is possible.

Say you naively build your query with
  "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"
then if user_input is "foo' AND col2='bar", your user will have overwritten
a column they shouldn't have been able to. And something equivalent in a
BATCH statement could allow a user to overwrite/delete some random row in
some random table.

Now CQL being much more restricted than SQL (no subqueries, no generic
transaction, ...), the extent of what you can do with a CQL injection is
way smaller than in SQL. But you do have to be careful.

As far as the Datastax java driver is concerned, you can fairly easily
protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is
nothing the user can do (it's "equivalent" to the thrift situation); see the
sketch below.
2) the query builder: it will escape quotes in the strings you provide, thus
avoiding injection.
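A minimal sketch of option 1 with the DataStax Java driver (the keyspace,
table, and column names are illustrative, not from this thread):

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class SafeUpdate {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks");

        // The user input is bound as a value rather than spliced into the
        // statement text, so "foo' AND col2='bar" stays an ordinary string.
        PreparedStatement ps = session.prepare(
                "UPDATE foo SET col = ? WHERE key = ?");
        BoundStatement bound = ps.bind("foo' AND col2='bar", "k");
        session.execute(bound);

        cluster.shutdown();
    }
}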

So I would say that injections are definitely possible if you concatenate
strings too naively, but I don't think preventing them is very hard.

--
Sylvain


On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill wrote:

>
> Mostly for fun, I wanted to throw this out there...
>
> We are undergoing a security audit for our platform (C* + Elastic Search +
> Storm).  One component of that audit is susceptibility to SQL injection.  I
> was wondering if anyone has attempted to construct a SQL injection attack
> against Cassandra?  Is it even possible?
>
> I know the code paths fairly well, but...
> Does there exist a path in the code whereby user data gets interpreted,
> which could be exploited to perform unintended operations?
>
> From the Thrift side of things, I've always felt safe.  Data is opaque.
>  Serializers are used to convert it to Bytes, and C* doesn't ever really do
> anything with the data.
>
> In examining the CQL java-driver, it looks like there might be a bit more
> exposure to injection (or even with CQL over Thrift). I haven't dug into the
> code yet, but depending on which flavor of the API you are using, you may
> be including user data in your statements.
>
> Does anyone know if the CQL java-driver does anything to protect against
> injection?  Or is it possible to say that the syntax is strict enough that
> any embedded operations in data would not parse?
>
> just some food for thought...
> I'll be digging into this over the next couple weeks.  If people are
> interested, I can throw a blog post out there with the findings.
>
> -brian
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://brianoneill.blogspot.com/
> twitter: @boneill42
>


Re: Large number of files for Leveled Compaction

2013-06-18 Thread Franc Carter
On Mon, Jun 17, 2013 at 3:37 PM, Franc Carter wrote:

> On Mon, Jun 17, 2013 at 3:28 PM, Wei Zhu  wrote:
>
>> The default value of 5MB is way too small in practice. Too many files in
>> one directory is not a good thing. It's not clear what a good number should
>> be; I have heard of people using 50MB, 75MB, even 100MB. Do your own tests
>> to find a "right" number.
>>
>
> Interesting - 50MB is the low end of what people are using - 5MB is a lot
> lower. I'll try a 50MB setting.
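For reference, a sketch of the CQL 3 statement for this on 1.2 (the keyspace
and table names are illustrative):

  ALTER TABLE MyKeyspace.MyCF
    WITH compaction = { 'class': 'LeveledCompactionStrategy',
                        'sstable_size_in_mb': 50 };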
>

Oops, forgot to ask - is there a way to get Cassandra to rebuild the
sstables at the bigger size once I have updated the column family definition?

thanks


>
> cheers
>
>
>> -Wei
>>
>> --
>> *From: *"Franc Carter" 
>> *To: *user@cassandra.apache.org
>> *Sent: *Sunday, June 16, 2013 10:15:22 PM
>> *Subject: *Re: Large number of files for Leveled Compaction
>>
>>
>>
>>
>> On Mon, Jun 17, 2013 at 2:59 PM, Manoj Mainali wrote:
>>
>>> Not in the case of LeveledCompaction. Only SizeTieredCompaction merges
>>> smaller sstables into large ones. With the LeveledCompaction, the sstables
>>> are always of fixed size but they are grouped into different levels.
>>>
>>> You can refer to this page
>>> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra on
>>> details of how LeveledCompaction works.
>>>
>>>
>> Yes, but it seems I've misinterpreted that page ;-(
>>
>> I took this paragraph
>>
>> In figure 3, new sstables are added to the first level, L0, and
>>> immediately compacted with the sstables in L1 (blue). When L1 fills up,
>>> extra sstables are promoted to L2 (violet). Subsequent sstables generated
>>> in L1 will be compacted with the sstables in L2 with which they overlap. As
>>> more data is added, leveled compaction results in a situation like the one
>>> shown in figure 4.
>>>
>>
>> to mean that once a level fills up it gets compacted into a higher level
>>
>> cheers
>>
>>
>>> Cheers
>>> Manoj
>>>
>>>
>>> On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter >> > wrote:
>>>
 On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali 
 wrote:

> With LeveledCompaction, each sstable's size is fixed and is defined by
> sstable_size_in_mb in the compaction configuration of the CF definition;
> the default value is 5MB. In your case, you may not have defined your own
> value, which is why each of your sstables is 5MB. And if your dataset is
> huge, you will see a very high sstable count.
>


 Ok, seems like I do have (at least) an incomplete understanding. I
 realise that the minimum size is 5MB, but I thought compaction would merge
 these into a smaller number of larger sstables ?

 thanks


> Cheers
>
> Manoj
>
>
> On Fri, Jun 7, 2013 at 1:44 PM, Franc Carter <
> franc.car...@sirca.org.au> wrote:
>
>>
>> Hi,
>>
>> We are trialling Cassandra-1.2(.4) with Leveled compaction as it
>> looks like it may be a win for us.
>>
>> The first step of testing was to push a fairly large slab of data
>> into the Column Family - we did this much faster (> x100) than we would 
>> in
>> a production environment. This has left the Column Family with about
>> 140,000 files in the Column Family directory which seems way too high. On
>> two of the nodes the CompactionStats show 2 outstanding tasks and on a
>> third node there are over 13,000 outstanding tasks. However from looking 
>> at
>> the log activity it looks like compaction has finished on all nodes.
>>
>> Is this number of files expected/normal ?
>>
>> cheers
>>
>



>>>
>>
>>
>
>


-- 

Franc Carter | Systems architect | Sirca Ltd

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215


RE: What is the effect of reducing the thrift message sizes on GC

2013-06-18 Thread Viktor Jevdokimov
Our experience shows that write load (memtables) impacts ParNew GC the most:
more writes mean more frequent ParNew GCs. The time of a ParNew GC depends on
how many writes were made during the cycle between ParNew GCs and on the size
of the young generation (HEAP_NEWSIZE).

Basically, a ParNew GC takes longer when more objects have to be copied from
young to old space. Reads and compactions mostly create short-lived objects
that are not promoted to old space, so increased reads and compactions under
the same write load will increase GC frequency but decrease GC pause time.
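The young generation size is set in cassandra-env.sh alongside the max heap;
a sketch with illustrative values:

  # cassandra-env.sh
  MAX_HEAP_SIZE="8G"
  HEAP_NEWSIZE="800M"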

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider
Take a ride with Adform's Rich Media Suite


From: Ananth Gundabattula [mailto:agundabatt...@threatmetrix.com]
Sent: Tuesday, June 18, 2013 10:31 AM
To: user@cassandra.apache.org
Subject: What is the effect of reducing the thrift message sizes on GC

We are currently running 1.1.10 and planning to migrate to a newer version,
1.2.4.

The question pertains to tweaking all the knobs to reduce GC-related issues
(we have been fighting a lot of really bad GC problems on 1.1.10 and have met
with little success so far).

Given that GC tuning is a black art, I was wondering if we can have some good
effect on GC by tweaking the following settings:

thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb

Our tables have very short rows (both in number of columns and in data size)
but millions to billions of rows in each column family. The typical number of
columns per column family is 4. A typical lookup specifies the row key and
fetches one column most of the time. Writes are similar, except for one
keyspace where the number of columns is 50 but data sizes per column are very
small.

Assuming we can tweak the config values

thrift_framed_transport_size_in_mb &
thrift_max_message_length_in_mb

to lower values in the above context, I was wondering if it helps GC be
invoked less often when the thrift settings reflect our data model's reads
and writes.

For example: what is the impact on GC of reducing the above config values to,
say, 1 MB rather than 15 or 16?

Thanks a lot for your inputs and thoughts.


Regards,
Ananth

Re: "SQL" Injection C* (via CQL & Thrift)

2013-06-18 Thread Brian O'Neill

Perfect. Thanks Sylvain. That is exactly the input I was looking for, and I
agree completely. (It's easy enough to protect against.)

As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty
way to inject something?

At first glance, it doesn't appear possible, but I'm not 100% confident
making that assertion.

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 • healthmarketscience.com




From:  Sylvain Lebresne 
Reply-To:  
Date:  Tuesday, June 18, 2013 8:51 AM
To:  "user@cassandra.apache.org" 
Subject:  Re: "SQL" Injection C* (via CQL & Thrift)

If you're not careful, then "CQL injection" is possible.

Say you naively build your query with
  "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"
then if user_input is "foo' AND col2='bar", your user will have overwritten
a column they shouldn't have been able to. And something equivalent in a
BATCH statement could allow a user to overwrite/delete some random row in
some random table.

Now CQL being much more restricted than SQL (no subqueries, no generic
transaction, ...), the extent of what you can do with a CQL injection is way
smaller than in SQL. But you do have to be careful.

As far as the Datastax java driver is concerned, you can fairly easily
protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is
nothing the user can do (it's "equivalent" to the thrift situation).
2) the query builder: it will escape quotes in the strings you provide, thus
avoiding injection.

So I would say that injections are definitely possible if you concatenate
strings too naively, but I don't think preventing them is very hard.

--
Sylvain


On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill 
wrote:
> 
> Mostly for fun, I wanted to throw this out there...
> 
> We are undergoing a security audit for our platform (C* + Elastic Search +
> Storm).  One component of that audit is susceptibility to SQL injection.  I
> was wondering if anyone has attempted to construct a SQL injection attack
> against Cassandra?  Is it even possible?
> 
> I know the code paths fairly well, but...
> Does there exist a path in the code whereby user data gets interpreted,
> which could be exploited to perform unintended operations?
> 
> From the Thrift side of things, I've always felt safe.  Data is opaque.
> Serializers are used to convert it to Bytes, and C* doesn't ever really do
> anything with the data.
> 
> In examining the CQL java-driver, it looks like there might be a bit more
> exposure to injection (or even with CQL over Thrift). I haven't dug into
> the code yet, but depending on which flavor of the API you are using, you
> may be including user data in your statements.
> 
> Does anyone know if the CQL java-driver does anything to protect against
> injection?  Or is it possible to say that the syntax is strict enough that any
> embedded operations in data would not parse?
> 
> just some food for thought...
> I'll be digging into this over the next couple weeks.  If people are
> interested, I can throw a blog post out there with the findings.
> 
> -brian
> 
> -- 
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://brianoneill.blogspot.com/
> twitter: @boneill42





Re: Node failing to decomission (vnodes and 1.2.5)

2013-06-18 Thread David McNelis
Never saw "DECOMMISSIONED" in the logs; nodetool status continues to say
"UL" for the node.

Removenode sounds like it's likely to get the job done for us at this point.

Thanks.

David


On Tue, Jun 18, 2013 at 3:10 AM, aaron morton wrote:

> I also am not seeing anything in the nodes log files to suggest errors
> during streaming or leaving.
>
> You should see a log message saying "DECOMMISSIONED" when the process
> completes.
>
> What does nodetool status say?
>
> What suggestions does anyone have on getting this node removed from my
> ring so I can rebuild it with the correct number of tokens, before I end up
> with a disk space issue from too many vnodes.
>
> If you really want to get the node out of there shut it down and run
> nodetool removenode on one of the remaining nodes.
>
> Cheers
>
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/06/2013, at 2:59 PM, David McNelis  wrote:
>
> I have a node in my ring (1.2.5) that when it was set up, had the wrong
> number of vnodes assigned (double the amount it should have had).
>
> As  a result, and because we can't reduce the number of vnodes on a
> machine (at least at this point), I need to decommission the node.
>
> The problem is that we've tried running decommission several times.  In
> each instance we'll have a lot of streams to other nodes for a period, and
> then eventually, netstats will tell us:
>
> nodetool -h localhost netstats
> Mode: LEAVING
>  Nothing streaming to /10.x.x.1
>  Nothing streaming to /10.x.x.2
>  Nothing streaming to /10.x.x.3
> Not receiving any streams.
> Pool NameActive   Pending  Completed
> Commandsn/a 0 955991
> Responses   n/a 02947860
>
> I also am not seeing anything in the nodes log files to suggest errors
> during streaming or leaving.
>
> Then the node will stay in this leaving state for... well, we gave up
> after several days of no more activity and retried several times.  Each
> time we "gave up" on it, we restarted the service and it was no longer
> listed as Leaving, just active.  Even when in a "leaving" state, the size
> of data on the node continued to grow.
>
> What suggestions does anyone have on getting this node removed from my
> ring so I can rebuild it with the correct number of tokens, before I end up
> with a disk space issue from too many vnodes.
>
>
>


Heap is not released and streaming hangs at 0%

2013-06-18 Thread srmore
I see an issue when I run high traffic to the Cassandra nodes: the heap
fills up to about 94% (which is expected), but the thing that confuses me
is that heap usage never goes down after the traffic is stopped (at least,
it appears not to). I kept the nodes up for a day after stopping the
traffic and the logs still tell me:

"Heap is 0.9430032942657169 full.  You may need to reduce memtable and/or
cache sizes.  Cassandra will now flush up to the two largest memtables to
free up memory.  Adjust flush_largest_memtables_at threshold in
cassandra.yaml if you don't want Cassandra to do this automatically"
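That threshold lives in cassandra.yaml; for reference (0.75 is the usual
default in this series):

  flush_largest_memtables_at: 0.75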

Things go back to normal when I restart Cassandra.

nodetool netstats tells me the following:

Mode: Normal
Not sending streams

and a bunch of keyspaces streaming from other nodes sit at 0%, and it stays
this way until I restart Cassandra.

Also I see this at the bottom:

Pool NameActive   Pending  Completed
Commandsn/a 08267930
Responses   n/a 0   15184810

Any ideas as to how I can speed this up and reclaim the heap?

Thanks !


Re: Reduce Cassandra GC

2013-06-18 Thread Joel Samuelsson
Yes, like I said, the only relevant output from that file was:
2013-06-17T08:11:22.300+: 2551.288: [GC 870971K->216494K(4018176K),
145.1887460 secs]


2013/6/18 Takenori Sato 

> GC logging is not in system.log; it goes to the separate file configured
> in cassandra-env.sh:
>
> JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
>
>
> At least, no GC logs are shown in your post.
>
>
> On Tue, Jun 18, 2013 at 5:05 PM, Joel Samuelsson <
> samuelsson.j...@gmail.com> wrote:
>
>> Can't find any promotion failure.
>>
>> In system.log this is what I get:
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 GCInspector.java (line
>> 122) GC for ParNew: 145189 ms for 1 collections, 225905072 used; max is
>> 4114612224
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 StatusLogger.java (line
>> 57) Pool NameActive   Pending   Blocked
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,491 StatusLogger.java (line
>> 72) ReadStage 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>> 72) RequestResponseStage  0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>> 72) ReadRepairStage   0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>> 72) MutationStage 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>> 72) ReplicateOnWriteStage 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>> 72) GossipStage   0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>> 72) AntiEntropyStage  0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>> 72) MigrationStage0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>> 72) StreamStage   0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>> 72) MemtablePostFlusher   0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
>> 72) FlushWriter   0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
>> 72) MiscStage 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>> 72) commitlog_archiver0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>> 72) InternalResponseStage 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>> 72) HintedHandoff 0 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>> 77) CompactionManager 0 0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>> 89) MessagingServicen/a   0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>> 99) Cache Type SizeCapacityKeysToSave  Provider
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,504 StatusLogger.java (line
>> 100) KeyCache  12129   2184533 all
>>
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>> 106) RowCache  0   0   all
>> org.apache.cassandra.cache.SerializingCacheProvider
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>> 113) ColumnFamilyMemtable ops,data
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>> 116) system.NodeIdInfo 0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>> 116) system.IndexInfo  0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>> 116) system.LocationInfo   0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>> 116) system.Versions 3,103
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>> 116) system.schema_keyspacees   0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>> 116) system.Migrations 0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>> 116) system.schema_columnfamilies 0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>> 116) system.schema_columns 0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (line
>> 116) system.HintsColumnFamily  0,0
>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,507 StatusLogger.java (l

Re: Reduce Cassandra GC

2013-06-18 Thread Mohit Anchlia
Is your young generation size set to 4GB? Can you paste the output of ps
-ef|grep cassandra?
On Tue, Jun 18, 2013 at 8:48 AM, Joel Samuelsson
wrote:

> Yes, like I said, the only relevant output from that file was:
>  2013-06-17T08:11:22.300+: 2551.288: [GC 870971K->216494K(4018176K),
> 145.1887460 secs]
>
>
> 2013/6/18 Takenori Sato 
>
>> GC logging is not in system.log; it goes to the separate file configured
>> in cassandra-env.sh:
>>
>> JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
>>
>>
>> At least, no GC logs are shown in your post.
>>
>>
>> On Tue, Jun 18, 2013 at 5:05 PM, Joel Samuelsson <
>> samuelsson.j...@gmail.com> wrote:
>>
>>>  Can't find any promotion failure.
>>>
>>> In system.log this is what I get:
>>>   INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 GCInspector.java
>>> (line 122) GC for ParNew: 145189 ms for 1 collections, 225905072 used; max
>>> is 4114612224
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,490 StatusLogger.java (line
>>> 57) Pool NameActive   Pending   Blocked
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,491 StatusLogger.java (line
>>> 72) ReadStage 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>>> 72) RequestResponseStage  0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>>> 72) ReadRepairStage   0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,492 StatusLogger.java (line
>>> 72) MutationStage 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>>> 72) ReplicateOnWriteStage 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>>> 72) GossipStage   0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,493 StatusLogger.java (line
>>> 72) AntiEntropyStage  0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>>> 72) MigrationStage0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>>> 72) StreamStage   0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,494 StatusLogger.java (line
>>> 72) MemtablePostFlusher   0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
>>> 72) FlushWriter   0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,495 StatusLogger.java (line
>>> 72) MiscStage 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>>> 72) commitlog_archiver0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>>> 72) InternalResponseStage 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,499 StatusLogger.java (line
>>> 72) HintedHandoff 0 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>>> 77) CompactionManager 0 0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>>> 89) MessagingServicen/a   0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,500 StatusLogger.java (line
>>> 99) Cache Type SizeCapacityKeysToSave  Provider
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,504 StatusLogger.java (line
>>> 100) KeyCache  12129   2184533 all
>>>
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>>> 106) RowCache  0   0   all
>>> org.apache.cassandra.cache.SerializingCacheProvider
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>>> 113) ColumnFamilyMemtable ops,data
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>>> 116) system.NodeIdInfo 0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>>> 116) system.IndexInfo  0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,505 StatusLogger.java (line
>>> 116) system.LocationInfo   0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>>> 116) system.Versions 3,103
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>>> 116) system.schema_keyspacees   0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>>> 116) system.Migrations 0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>>> 116) system.schema_columnfamilies 0,0
>>>  INFO [ScheduledTasks:1] 2013-06-17 08:13:47,506 StatusLogger.java (line
>>> 116) s

ANN Introducing Cassaforte, a Clojure client for Cassandra built around CQL 3.0

2013-06-18 Thread Michael Klishin
Cassaforte [1] is a Clojure client for Cassandra built around CQL 3.0 and
focusing
on ease of use. It's built on top of the new DataStax Java driver [2] and
supports
all the major features you'd expect from a data store client:

 * Connection to a single node or a cluster
 * All CQL 3.0 operations
 * Convenient CQL 3.0 queries, including queries with placeholders (?, a la
JDBC)
 * Nice query DSL for Clojure
 * Automatic deserialization of column names and values according to the
schema

and then some.

To learn more, see
http://blog.clojurewerkz.org/blog/2013/06/17/introducing-cassaforte/

1. http://clojurecassandra.info
2. https://github.com/datastax/java-driver
-- 
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin


Dropped mutation messages

2013-06-18 Thread cem
Hi All,

I have a cluster of 5 nodes with C* 1.2.4.

Each node has 4 disks 1 TB each.

I see a lot of dropped messages after it stores 400 GB per disk (1.6 TB
per node).

The recommendation was 500 GB max per node before 1.2.  Datastax says that
we can store terabytes of data per node with 1.2.
http://www.datastax.com/docs/1.2/cluster_architecture/cluster_planning

Do I need to enable anything to take advantage of 1.2? Do you have any other
advice?

What should be the path to investigate this?

Thanks in advance!

Best Regards,
Cem.


Re: Heap is not released and streaming hangs at 0%

2013-06-18 Thread Robert Coli
On Tue, Jun 18, 2013 at 8:25 AM, srmore  wrote:
> I see an issue when I run high traffic to the Cassandra nodes: the heap
> gets full to about 94% (which is expected)

Which is expected to cause GC failure? ;)

But seriously, the reason your node is unable to GC is that you have
filled your heap too fast for it to keep up. The JVM has seized up
like Joe Namath with vapor lock.

> Any ideas as to how I can speed up this up and reclaim the heap ?

Don't exhaust the ability of GC to C G. :)

=Rob
PS - What version of cassandra? If you "nodetool -h localhost flush"
does it help?


Re: rename a cluster in cassandra 1.2.6

2013-06-18 Thread Faraaz Sareshwala
Can you expand on the reasoning behind this? I was bitten by this yesterday when
trying to change the cluster name -- I thought I could just change it in
cassandra.yaml and be done with it, but Cassandra wouldn't start because of this
error.

What's the process when it's not a test system (mine wasn't)? In order to fix my
issue, I deleted the commit logs and the system keyspace directory, restarted the
node, and ran nodetool resetlocalschema and nodetool repair. Is this correct?

Are there any other settings like this that Cassandra caches and that need to be
in sync for a proper startup?

Faraaz

On Tue, Jun 18, 2013 at 12:51:34AM -0700, aaron morton wrote:
> The cluster name is read from the yaml file the first time the server starts
> and is stored in the system tables; these live in the local CF in the system KS.
> 
> If this is a test system, just blow away the data for the CF or truncate it. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 18/06/2013, at 7:30 PM, Paco a.k.a. Francisco Trujillo <
> f.truji...@genetwister.nl> wrote:
> 
> 
> I am using cassandra 1.2.6 in cluster with a single node. I am trying to
> rename the cluster using the instructions in:
>  
> Cassandra clustername mismatch
>  
> After doing all the steps indicated I still get the same error when I
> start Cassandra after changing the cassandra.yaml file
>  
> Does anyone know if it is a problem with Cassandra 1.2.6?
>  
> Thanks
>  


Re: Heap is not released and streaming hangs at 0%

2013-06-18 Thread srmore
Thanks Rob,
But then shouldn't the JVM C G it eventually? I can still see Cassandra alive
and kicking, but it looks like the heap is locked up even after the traffic
has long stopped.

nodetool -h localhost flush didn't do much good.
The version I am running is 1.0.12 (I know it's due for an upgrade but have
got to work with this for now).



On Tue, Jun 18, 2013 at 12:13 PM, Robert Coli  wrote:

> On Tue, Jun 18, 2013 at 8:25 AM, srmore  wrote:
> > I see an issue when I run high traffic to the Cassandra nodes: the heap
> > gets full to about 94% (which is expected)
>
> Which is expected to cause GC failure? ;)
>
> But seriously, the reason your node is unable to GC is that you have
> filled your heap too fast for it to keep up. The JVM has seized up
> like Joe Namath with vapor lock.
>
> > Any ideas as to how I can speed up this up and reclaim the heap ?
>
> Don't exhaust the ability of GC to C G. :)
>
> =Rob
> PS - What version of cassandra? If you "nodetool -h localhost flush"
> does it help?
>


Re: rename a cluster in cassandra 1.2.6

2013-06-18 Thread Robert Coli
On Tue, Jun 18, 2013 at 10:20 AM, Faraaz Sareshwala
 wrote:
> Can you expand on the reasoning behind this?

https://issues.apache.org/jira/browse/CASSANDRA-769

In various versions of Cassandra (including current, IIRC?) you can
change the cluster name via manual edits to the system keyspace.

If you don't "nodetool flush" on all nodes when resetting the cluster,
I believe you might have changes to the system keyspace which are
unflushed...
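
As a rough, untested sketch of that manual edit on a 1.2 cluster -- the node
addresses and new name below are placeholders, so adapt them and verify the
system.local approach against your exact version first:

    import subprocess

    NEW_NAME = 'NewClusterName'        # hypothetical new cluster name
    NODES = ['10.0.0.1', '10.0.0.2']   # placeholder node addresses

    # On 1.2 the cluster name lives in the system.local table.
    stmt = "UPDATE system.local SET cluster_name = '%s' WHERE key = 'local';" % NEW_NAME

    for node in NODES:
        # Feed the statement to cqlsh on stdin.
        p = subprocess.Popen(['cqlsh', node], stdin=subprocess.PIPE)
        p.communicate((stmt + '\n').encode())
        # Flush so the change is on disk before editing cassandra.yaml
        # and restarting the node.
        subprocess.check_call(['nodetool', '-h', node, 'flush', 'system'])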

=Rob


Re: Heap is not released and streaming hangs at 0%

2013-06-18 Thread Robert Coli
On Tue, Jun 18, 2013 at 10:33 AM, srmore  wrote:
> But then shouldn't the JVM C G it eventually? I can still see Cassandra alive
> and kicking, but it looks like the heap is locked up even after the traffic
> has long stopped.

No, when the GC system fails this hard it is often a permanent failure
which requires a restart of the JVM.

> nodetool -h localhost flush didn't do much good.

This adds support to the idea that your heap is too full, and not full
of memtables.

You could try nodetool -h localhost invalidatekeycache, but that
probably will not free enough memory to help you.

=Rob


Re: Data not fully replicated with 2 nodes and replication factor 2

2013-06-18 Thread Wei Zhu
Cassandra doesn't do async replication like HBase does. You can run nodetool
repair to ensure consistency.

Or you can increase your read or write consistency. As long as R + W > RF, you
have strong consistency. In your case, you can use CL.TWO for either read or
write.
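
For example, with pycassa (which you are already using), a minimal sketch --
keyspace, CF, and host names below are placeholders -- that writes at CL.TWO
so that even CL.ONE reads satisfy R + W > RF:

    import pycassa
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('my_keyspace', ['node1:9160', 'node2:9160'])

    # With RF=2, W=2 means a write is acked by both replicas, so any
    # single-replica read (R=1) already sees it: R + W = 3 > RF = 2.
    cf = ColumnFamily(pool, 'my_cf',
                      write_consistency_level=pycassa.ConsistencyLevel.TWO,
                      read_consistency_level=pycassa.ConsistencyLevel.ONE)

    cf.insert('some_key', {'col': 'val'})  # blocks until both replicas ack
    print(cf.get('some_key'))

The trade-off is that writes now fail if either replica is down, since both
must acknowledge.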

-Wei 

- Original Message -

From: "James Lee"  
To: user@cassandra.apache.org 
Sent: Tuesday, June 18, 2013 5:02:53 AM 
Subject: Data not fully replicated with 2 nodes and replication factor 2 



Hello, 

I’m seeing a strange problem with a 2-node Cassandra test deployment, where it 
seems that data isn’t being replicated among the nodes as I would expect. I 
suspect this may be a configuration issue of some kind, but have been unable to 
figure out what I should change. 

The setup is as follows: 
· Two Cassandra nodes in the cluster (they each have themselves and the other 
node as seeds in cassandra.yaml). 
· Create 40 keyspaces, each with simple replication strategy and replication 
factor 2. 
· Populate 125,000 rows into each keyspace, using a pycassa client with a 
connection pool pointed at both nodes (I’ve verified that pycassa does indeed 
send roughly half the writes to each node). These are populated with writes 
using consistency level of 1. 
· Wait 30 minutes (to give replications a chance to complete). 
· Do random reads of the rows in the keyspaces, again using a pycassa client 
with a connection pool pointed at both nodes. These are read using consistency 
level 1. 

I’m finding that the vast majority of reads are successful, but a small 
proportion (~0.1%) are returned as Not Found. If I manually try to look up 
those keys using cassandra-cli, I see that they are returned when querying one 
of the nodes, but not when querying the other. So it seems like some of the 
rows have simply not been replicated. 

I’m not sure how I can monitor the status of ongoing replications, but the 
system has been idle for many 10s of minutes and the total database size is 
only about 5GB, so I don’t think there are any further ongoing operations. 

Any suggestions? In case it’s relevant, my setup is: 
· Cassandra 1.2.2, running on Linux 
· Sun Java 1.7.0_10-b18 64-bit 
· Java heap settings: -Xms8192M -Xmx8192M -Xmn2048M 

Thank you, 
James Lee 



Re: Data not fully replicated with 2 nodes and replication factor 2

2013-06-18 Thread Robert Coli
On Tue, Jun 18, 2013 at 11:36 AM, Wei Zhu  wrote:
> Cassandra doesn't do async replication like HBase does.You can run nodetool
> repair to insure the consistency.

While this answer is true, it is somewhat non-responsive to the OP.

If the OP didn't see a timeout exception, the theoretical worst case is
that he should have hints stored for writes that initially failed to
replicate. His nodes should not be failing GC with a total data size of
5gb on an 8gb heap, so those hints should deliver quite quickly. After
30 minutes those hints should certainly be delivered.

@OP : do you see hints being stored? does nodetool tpstats indicate
dropped messages?

=Rob


Re: [Cassandra] Expanding a Cassandra cluster

2013-06-18 Thread Emalayan Vairavanathan
Thank you all.

I have two more questions.

1) Are there any implications in running nodetool repair immediately after 
bringing a new node up (before the key migration process is completed)?

        Will it cause race conditions? Or will it result in some part of 
the space never being reclaimed?

2) How can I figure out the status of key migration in Cassandra?

Thank you
Emalayan 



 From: Richard Low 
To: user@cassandra.apache.org; Emalayan Vairavanathan  
Sent: Tuesday, 18 June 2013 12:11 AM
Subject: Re: [Cassandra] Expanding a Cassandra cluster
 


On 10 June 2013 22:00, Emalayan Vairavanathan  wrote:


                   b) Will Cassandra automatically take care of removing 
obsolete keys in future ?

In a future version Cassandra should automatically clean up for you:

https://issues.apache.org/jira/browse/CASSANDRA-5051

Right now though you have to run cleanup eventually or the space will never be 
reclaimed.
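
Until then it has to be done by hand (or from a cron job); a trivial sketch in
Python, with placeholder node addresses and assuming nodetool is on the PATH:

    import subprocess

    NODES = ['10.0.0.1', '10.0.0.2', '10.0.0.3']  # placeholder addresses

    # Run cleanup one node at a time: it rewrites sstables, so kicking it
    # off on the whole ring at once can hurt latency cluster-wide.
    for node in NODES:
        subprocess.check_call(['nodetool', '-h', node, 'cleanup'])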

Richard.

Re: Compaction not running

2013-06-18 Thread Franc Carter
On Sat, Jun 15, 2013 at 11:49 AM, Franc Carter wrote:

> On Sat, Jun 15, 2013 at 8:48 AM, Robert Coli  wrote:
>
>> On Wed, Jun 12, 2013 at 3:26 PM, Franc Carter 
>> wrote:
>> > We are running a test system with Leveled compaction on Cassandra-1.2.4.
>> > While doing an initial load of the data one of the nodes ran out of file
>> > descriptors and since then it hasn't been automatically compacting.
>>
>> You have (at least) two options :
>>
>> 1) increase file descriptors available to Cassandra with ulimit, if
>> possible
>> 2) increase the size of your sstables with levelled compaction, such
>> that you have fewer of them
>>
>
> Oops, I wasn't clear enough.
>
> I have increased the number of file descriptors and no longer have a file
> descriptor issue. However the node still doesn't compact automatically. If
> I run a 'nodetool compact' it will do a small amount of compaction and then
> stop. The Column Family is using LCS
>

Any ideas on this - compaction is still not automatically running for one
of my nodes

thanks


>
> cheers
>
>
>>
>> =Rob
>>
>
>


-- 

*Franc Carter* | Systems architect | Sirca Ltd
 

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215


Re: Compaction not running

2013-06-18 Thread Bryan Talbot
Manual compaction for LCS doesn't really do much.  It certainly doesn't
compact all those little files into bigger files.  What makes you think
that compactions are not occurring?

-Bryan



On Tue, Jun 18, 2013 at 3:59 PM, Franc Carter wrote:

> On Sat, Jun 15, 2013 at 11:49 AM, Franc Carter 
> wrote:
>
>> On Sat, Jun 15, 2013 at 8:48 AM, Robert Coli wrote:
>>
>>> On Wed, Jun 12, 2013 at 3:26 PM, Franc Carter 
>>> wrote:
>>> > We are running a test system with Leveled compaction on
>>> Cassandra-1.2.4.
>>> > While doing an initial load of the data one of the nodes ran out of
>>> file
>>> > descriptors and since then it hasn't been automatically compacting.
>>>
>>> You have (at least) two options :
>>>
>>> 1) increase file descriptors available to Cassandra with ulimit, if
>>> possible
>>> 2) increase the size of your sstables with levelled compaction, such
>>> that you have fewer of them
>>>
>>
>> Oops, I wasn't clear enough.
>>
>> I have increased the number of file descriptors and no longer have a file
>> descriptor issue. However the node still doesn't compact automatically. If
>> I run a 'nodetool compact' it will do a small amount of compaction and then
>> stop. The Column Family is using LCS
>>
>
> Any ideas on this - compaction is still not automatically running for one
> of my nodes
>
> thanks
>
>
>>
>> cheers
>>
>>
>>>
>>> =Rob
>>>
>>
>>
>


Re: Compaction not running

2013-06-18 Thread Franc Carter
On Wed, Jun 19, 2013 at 9:34 AM, Bryan Talbot wrote:

> Manual compaction for LCS doesn't really do much.  It certainly doesn't
> compact all those little files into bigger files.  What makes you think
> that compactions are not occurring?
>
>
Yeah, that's what I thought, however:

nodetool compactionstats gives:

pending tasks: 13120
   Active compaction remaining time : n/a

when I run nodetool compact in a loop, the pending task count goes down
gradually.
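
The loop is roughly the following sketch (the parsing assumes compactionstats
prints "pending tasks: N" on its first line):

    import subprocess

    # Keep kicking off manual compactions while the backlog is large.
    while True:
        subprocess.check_call(['nodetool', '-h', 'localhost', 'compact'])
        out = subprocess.check_output(
            ['nodetool', '-h', 'localhost', 'compactionstats']).decode()
        pending = int(out.splitlines()[0].split(':')[1])
        if pending < 10:
            break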

This node also has vastly higher latencies (x10) than the other nodes. I
saw this with a previous CF that I 'manually compacted'; when the pending
tasks reached a low number (stuck on 9), latencies went back down to low
milliseconds.

cheers


> -Bryan
>
>
>
> On Tue, Jun 18, 2013 at 3:59 PM, Franc Carter 
> wrote:
>
>> On Sat, Jun 15, 2013 at 11:49 AM, Franc Carter wrote:
>>
>>> On Sat, Jun 15, 2013 at 8:48 AM, Robert Coli wrote:
>>>
>>>> On Wed, Jun 12, 2013 at 3:26 PM, Franc Carter <
>>>> franc.car...@sirca.org.au> wrote:
>>>> > We are running a test system with Leveled compaction on
>>>> Cassandra-1.2.4.
>>>> > While doing an initial load of the data one of the nodes ran out of
>>>> file
>>>> > descriptors and since then it hasn't been automatically compacting.
>>>>
>>>> You have (at least) two options :
>>>>
>>>> 1) increase file descriptors available to Cassandra with ulimit, if
>>>> possible
>>>> 2) increase the size of your sstables with levelled compaction, such
>>>> that you have fewer of them

>>>
>>> Oops, I wasn't clear enough.
>>>
>>> I have increased the number of file descriptors and no longer have a
>>> file descriptor issue. However the node still doesn't compact
>>> automatically. If I run a 'nodetool compact' it will do a small amount of
>>> compaction and then stop. The Column Family is using LCS
>>>
>>
>> Any ideas on this - compaction is still not automatically running for one
>> of my nodes
>>
>> thanks
>>
>>
>>>
>>> cheers
>>>
>>>

>>>> =Rob

>>>
>>>
>>
>


-- 

*Franc Carter* | Systems architect | Sirca Ltd
 

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215


Unit Testing Cassandra

2013-06-18 Thread Shahab Yunus
Hello,

Can anyone suggest good/popular unit-test tools/frameworks/utilities out
there for unit testing Cassandra stores? I am looking at testing from a
performance/load and monitoring perspective. I am using 1.2.

Thanks a lot.

Regards,
Shahab


Re: Dropped mutation messages

2013-06-18 Thread Arthur Zubarev
Cem hi,

as per http://wiki.apache.org/cassandra/FAQ#dropped_messages

Internode messages which are received by a node, but do not get to be 
processed within rpc_timeout, are dropped rather than processed, as the 
coordinator node will no longer be waiting for a response. If the coordinator 
node does not receive Consistency Level responses before the rpc_timeout it 
will return a TimedOutException to the client. If the coordinator receives 
Consistency Level responses it will return success to the client.

For MUTATION messages this means that the mutation was not applied to all 
replicas it was sent to. The inconsistency will be repaired by Read Repair or 
Anti Entropy Repair.

For READ messages this means a read request may not have completed.

Load shedding is part of the Cassandra architecture; if this is a persistent 
issue it is generally a sign of an overloaded node or cluster.
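
A quick way to see whether a node is shedding load is the dropped-message
table at the end of nodetool tpstats. A small sketch (placeholder host, and
the parsing assumes 1.2's "Message type / Dropped" layout) that flags
non-zero counters:

    import subprocess

    out = subprocess.check_output(
        ['nodetool', '-h', '10.0.0.1', 'tpstats']).decode()

    # The dropped-message table follows the "Message type  Dropped" header.
    in_dropped = False
    for line in out.splitlines():
        if line.startswith('Message type'):
            in_dropped = True
            continue
        if in_dropped and line.strip():
            msg_type, dropped = line.split()
            if int(dropped) > 0:
                print('%s: %s dropped' % (msg_type, dropped))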

By the way, I am on C* 1.2.4 too, in dev mode. After my node filled with 
400 GB I started getting RPC timeouts on large data retrievals, so in short, 
you may need to revise how you query. The queries need to be lightened. 

The queries need to be lightened 

/Arthur


From: cem 
Sent: Tuesday, June 18, 2013 1:12 PM
To: user@cassandra.apache.org 
Subject: Dropped mutation messages

Hi All, 

I have a cluster of 5 nodes with C* 1.2.4.

Each node has 4 disks 1 TB each.

I see  a lot of dropped messages after it stores 400 GB  per disk. (1.6 TB per 
node).

The recommendation was 500 GB max per node before 1.2.  Datastax says that we 
can store terabytes of data per node with 1.2.
http://www.datastax.com/docs/1.2/cluster_architecture/cluster_planning

Do I need to enable anything to leverage from 1.2? Do you have any other advice?


What should be the path to investigate this?

Thanks in advance! 

Best Regards,
Cem.