Re: Restart cassandra every X days?

2012-01-31 Thread R. Verlangen
After running 3 days on Cassandra 1.0.7 it seems the problem has been
solved. One weird thing remains, on our 2 nodes (both 50% of the ring), the
first's usage is just over 25% of the second.

Anyone got an explanation for that?

2012/1/29 aaron morton 

> Yes but…
>
> For every upgrade, read the NEWS.txt; it will go through the upgrade
> procedure in detail. If you want to feel extra smart, scan through the
> CHANGES.txt to get an idea of what's going on.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 29/01/2012, at 4:14 AM, Maxim Potekhin wrote:
>
>  Sorry if this has been covered, I was concentrating solely on 0.8x --
> can I just d/l 1.0.x and continue using same data on same cluster?
>
> Maxim
>
>
> On 1/28/2012 7:53 AM, R. Verlangen wrote:
>
> Ok, seems that it's clear what I should do next ;-)
>
> 2012/1/28 aaron morton 
>
>> There are no blockers to upgrading to 1.0.X.
>>
>>  A
>>  -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>>   On 28/01/2012, at 7:48 AM, R. Verlangen wrote:
>>
>> Ok. Seems that an upgrade might fix these problems. Is Cassandra 1.x.x
>> stable enough to upgrade to, or should we wait for a couple of weeks?
>>
>> 2012/1/27 Edward Capriolo 
>>
>>> I would not say that issuing a restart every X days is a good idea. You
>>> are mostly developing a superstition. You should find the source of the
>>> problem. It could be JMX or Thrift clients not closing connections. We
>>> don't restart nodes on a regimen; they work fine.
>>>
>>>
>>> On Thursday, January 26, 2012, Mike Panchenko  wrote:
>>> > There are two relevant bugs (that I know of), both resolved in
>>> somewhat recent versions, which make somewhat regular restarts beneficial
>>> > https://issues.apache.org/jira/browse/CASSANDRA-2868 (memory leak in
>>> GCInspector, fixed in 0.7.9/0.8.5)
>>> > https://issues.apache.org/jira/browse/CASSANDRA-2252 (heap
>>> fragmentation due to the way memtables used to be allocated, refactored in
>>> 1.0.0)
>>> > Restarting daily is probably too frequent for either one of those
>>> problems. We usually notice degraded performance in our ancient cluster
>>> after ~2 weeks w/o a restart.
>>> > As Aaron mentioned, if you have plenty of disk space, there's no
>>> reason to worry about "cruft" sstables. The size of your active set is what
>>> matters, and you can determine if that's getting too big by watching for
>>> iowait (due to reads from the data partition) and/or paging activity of the
>>> java process. When you hit that problem, the solution is to 1. try to tune
>>> your caches and 2. add more nodes to spread the load. I'll reiterate -
>>> looking at raw disk space usage should not be your guide for that.
>>> > "Forcing" a gc generally works, but should not be relied upon (note
>>> "suggest" in
>>> http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#gc()).
>>> It's great news that 1.0 uses a better mechanism for releasing unused
>>> sstables.
>>> > nodetool compact triggers a "major" compaction and is no longer
>>> recommended by DataStax (details here:
>>> http://www.datastax.com/docs/1.0/operations/tuning#tuning-compaction,
>>> bottom of the page).
>>> > Hope this helps.
>>> > Mike.
>>> > On Wed, Jan 25, 2012 at 5:14 PM, aaron morton 
>>> wrote:
>>> >
>>> > That disk usage pattern is to be expected in pre-1.0 versions. Disk
>>> usage is far less interesting than disk free space: if it's using 60 GB and
>>> there is 200 GB free, that's OK. If it's using 60 GB and there is 6 MB free,
>>> that's a problem.
>>> > In pre-1.0, compacted files are deleted on disk by waiting for the
>>> JVM to decide to GC all remaining references. If there is not enough space
>>> on disk (to store the total size of the files it is about to write or
>>> compact), GC is forced and the files are deleted. Otherwise they will get
>>> deleted at some point in the future.
>>> > In 1.0 files are reference counted and space is freed much sooner.
>>> > With regard to regular maintenance, nodetool cleanup removes data from
>>> a node that it is no longer a replica for. This is only of use when you
>>> have done a token move.
>>> > I would not recommend a daily restart of the Cassandra process. You
>>> will lose all the runtime optimizations the JVM has made (I think the
>>> mapped file pages will stay resident), as well as adding additional
>>> entropy to the system, which must be repaired via HH, RR or nodetool repair.
>>> > If you want to see compacted files purged faster the best approach
>>> would be to upgrade to 1.0.
>>> > Hope that helps.
>>> > -
>>> > Aaron Morton
>>> > Freelance Developer
>>> > @aaronmorton
>>> > http://www.thelastpickle.com
>>> > On 26/01/2012, at 9:51 AM, R. Verlangen wrote:
>>> >
>>> > In his message he explains that it's for "Forcing a GC". GC stands
>>> for garbage collection. For some more background see:
>>> http://en.wikipedia.org/wiki/Gar

Re: SSTable compaction issue in our system

2012-01-31 Thread Micah Hausler
A related question: is there any way to reverse a major compaction without 
losing performance? Do I just have to wait it out?

Micah Hausler

On Jan 30, 2012, at 7:50 PM, Roshan Pradeep wrote:

> Thanks Aaron for the perfect explanation. Decided to go with automatic 
> compaction. Thanks again.
> 
> On Wed, Jan 25, 2012 at 11:19 AM, aaron morton  
> wrote:
> The issue with major / manual compaction is that it creates one file. One 
> big old file.  
> 
> That one file will not be compacted unless there are 
> (min_compaction_threshold - 1) other files of a similar size. So tombstones 
> and overwrites in that file may not be purged for a long time. 
> 
> If you go down the manual compaction path you need to keep doing it.
> 
> If you feel you need to do it, do it; otherwise let automatic compaction do 
> its thing. 
> Cheers
>   
>   
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 25/01/2012, at 12:47 PM, Roshan wrote:
> 
>> Thanks for the reply. Is the major compaction not recommended for Cassandra
>> 1.0.6?
>> 
>> --
>> View this message in context: 
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
>> Nabble.com.
> 
> 
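The compaction behaviour Aaron describes can be sketched as a toy model. The default threshold of 4 and the factor-of-2 "similar size" window below are illustrative assumptions, not Cassandra's actual bucketing code:

```java
import java.util.List;

public class SizeTieredSketch {
    // Toy model: a file is eligible for size-tiered compaction once at
    // least (minThreshold - 1) other files fall within a factor of 2 of
    // its size. The factor of 2 is an assumption for illustration.
    static boolean willCompact(long fileSize, List<Long> otherSizes, int minThreshold) {
        int similar = 0;
        for (long s : otherSizes) {
            if (s >= fileSize / 2 && s <= fileSize * 2) similar++;
        }
        return similar >= minThreshold - 1;
    }

    public static void main(String[] args) {
        List<Long> peers = List.of(9L, 11L, 10L); // sizes in MB
        // Four similar ~10 MB files: compaction will merge them.
        System.out.println(willCompact(10, peers, 4));  // true
        // The one big file left by a major compaction has no peers.
        System.out.println(willCompact(500, peers, 4)); // false
    }
}
```

In the toy model, four ~10 MB sstables compact together, but the single large file a major compaction produces never finds (min_compaction_threshold - 1) similar-sized peers, which is why tombstones in it can linger until the next manual compaction.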



RE: WARN [Memtable] live ratio

2012-01-31 Thread Dan Hendry
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ - Gives some
background information (specific to 0.8 but still valid for 1.0 I believe).
Not quite sure why a warning message is logged, but a ratio of < 1 may occur
for column families with a very high update-to-insert ratio.

Dan

-Original Message-
From: Roshan [mailto:codeva...@gmail.com] 
Sent: January-30-12 20:04
To: cassandra-u...@incubator.apache.org
Subject: Re: WARN [Memtable] live ratio

Exactly, I am also getting this when the server moves from idle to high load.
Maybe the Cassandra experts can help us.

--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/WARN-Memtab
le-live-ratio-tp7238582p7238603.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.



Re: Cannot start cassandra node anymore

2012-01-31 Thread huyle
Created https://issues.apache.org/jira/browse/CASSANDRA-3819.  Thanks!

Huy


> The schema change was that we created a new key space with composite type
> CFs, but later we had to change some definition/CF names, so we dropped
> the
> key space and recreated with new definition.  

Sounds like a bug. As Sylvain suggested, can you report it here
https://issues.apache.org/jira/browse/CASSANDRA and include the schema
changes you made?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cannot-start-cassandra-node-anymore-tp7150978p7240588.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: WARN [Memtable] live ratio

2012-01-31 Thread Radim Kolar



> but a ratio of < 1 may occur
> for column families with a very high update-to-insert ratio.

Better to ask why the minimum ratio is 1.0. What harm can be done with a
ratio < 1.0?


Encrypting traffic between Hector client and Cassandra server

2012-01-31 Thread Xaero S
I have been trying to figure out how to secure/encrypt the traffic between
the client (Hector) and the Cassandra server. I looked at this link:
https://issues.apache.org/jira/browse/THRIFT-106. But since Thrift sits on a
layer beneath Hector, I am wondering how I can get Hector to use the right
Thrift calls to make the encryption happen. Also, where can I get the
instructions for any required setup for encrypting the traffic between
the Hector client and the Cassandra server?

Would appreciate any help in this regard. Below are the setup versions

Cassandra Version - 0.8.7
Hector - 0.8.0-2
libthrift jar - 0.6.1


On a side note, we have setup internode encryption on the Cassandra server
side and found the documentation for that easily.


Future dated column

2012-01-31 Thread Hiren Shah
Hi,

I was mystified when I was not able to update a column ccc for a few keys in my 
test_cf, but was able to update it for all other keys. Then I noticed that 
the column for that key is set to be deleted in the future (year 6172!) -

DEBUG [ReadStage:1191] 2012-01-31 22:21:48,374 SliceQueryFilter.java (line 123) 
collecting 0 of 100: ccc:true:4@1326231110794000

The timestamp must be from buggy code putting extra zeros, or from some issue 
in upgrade from 0.8.7 to 1.0.6 to 1.0.7. It almost seems like Cassandra 'sees' 
a few extra zeroes appended to the timestamp. The value without the three 
zeroes falls at the right time for the row inserted above and the time of 
delete below.

Recent records fetch ok -
DEBUG [ReadStage:1278] 2012-01-31 23:00:35,145 SliceQueryFilter.java (line 123) 
collecting 0 of 100: ccc:false:303@1328024823949!31536000

More important at this point  is to clean this up.

I tried to overwrite a value for that through CLI and it worked, sort of -

DEBUG [ReadStage:1099] 2012-01-31 21:15:34,385 SliceQueryFilter.java (line 123) 
collecting 0 of 100: ccc:false:1@132804322269

But the timestamp still keeps the extra zeroes. I cannot set it 
programmatically because my code uses current timestamp.

I tried to delete the whole record from CLI. The record cannot be queried 
anymore, but I still see the column (and others for the record) in the log. I 
did repair, cleanup and compact, but still no luck.

How can I delete a future dated column? Or overwrite it (without using a 
timestamp in year 7000!) ?

I am using cassandra 1.0.7.

Hiren Shah | R&D Team
168 North Clinton Street, Fourth Floor
Chicago, Illinois 60661
o: 312.253.3523 | c: 312.622.4970

 www.dotomi.com
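One plausible reading of the "extra zeroes" described above is a unit mix-up: Cassandra's conventional column timestamp is microseconds since the epoch, while System.currentTimeMillis() returns milliseconds. The sketch below checks the timestamp from the log line; the unit mix-up itself is a hypothesis, not something confirmed in this thread:

```java
import java.time.Instant;

public class TimestampUnits {
    public static void main(String[] args) {
        // Timestamp from the log line: ccc:true:4@1326231110794000
        long ts = 1326231110794000L;

        // Read as microseconds since the epoch (Cassandra's convention),
        // it lands in January 2012 - right when the row was written.
        System.out.println("as micros: " + Instant.ofEpochMilli(ts / 1000));

        // Read as milliseconds (i.e. a client appended three extra zeroes
        // by using the wrong unit), the same digits land tens of thousands
        // of years in the future, which would make the column look
        // undeletable with current-time timestamps.
        System.out.println("as millis: " + Instant.ofEpochMilli(ts));
    }
}
```

Interpreted as microseconds the value matches the insert time; with the unit confused it lands far in the future, consistent with a delete that no current-time write can override.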


Re: Encrypting traffic between Hector client and Cassandra server

2012-01-31 Thread Maxim Potekhin

Hello,

do you see any value in having a web service over cassandra, with actual 
client-clients talking to it via https/ssl?
This way the cluster can be firewalled and therefore protected, plus you 
get decent auth/auth right there.


Maxim


On 1/31/2012 5:21 PM, Xaero S wrote:


I have been trying to figure out how to secure/encrypt the traffic 
between the client (Hector) and the Cassandra Server. I looked at this 
link https://issues.apache.org/jira/browse/THRIFT-106 But since thrift 
sits on a layer after Hector, i am wondering how i can get Hector to 
use the right Thrift calls to have the encryption happen? Also where 
can i get the instructions for the any required setup for encrypting 
the traffic between the Hector client and the Cassandra Server?


Would appreciate any help in this regard. Below are the setup versions

Cassandra Version - 0.8.7
Hector - 0.8.0-2
libthrift jar - 0.6.1


On a side note, we have setup internode encryption on the Cassandra 
server side and found the documentation for that easily.








Re: Restart cassandra every X days?

2012-01-31 Thread aaron morton
Do you mean the load in nodetool ring is not even, despite the tokens being 
evenly distributed? 

I would assume this is not the case given the difference, but it may be hints, 
given you have just done an upgrade. Check the system keyspace using nodetool 
cfstats to see. They will eventually be delivered and deleted. 

More likely you will want to:
1) nodetool repair, to make sure all data is distributed, then
2) nodetool cleanup, if you have changed the tokens at any point.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/01/2012, at 11:56 PM, R. Verlangen wrote:

> After running 3 days on Cassandra 1.0.7 it seems the problem has been solved. 
> One weird thing remains, on our 2 nodes (both 50% of the ring), the first's 
> usage is just over 25% of the second. 
> 
> Anyone got an explanation for that?
> 
> 2012/1/29 aaron morton 
> Yes but…
> 
> For every upgrade, read the NEWS.txt; it will go through the upgrade procedure 
> in detail. If you want to feel extra smart, scan through the CHANGES.txt to 
> get an idea of what's going on. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 29/01/2012, at 4:14 AM, Maxim Potekhin wrote:
> 
>> Sorry if this has been covered, I was concentrating solely on 0.8x --
>> can I just d/l 1.0.x and continue using same data on same cluster?
>> 
>> Maxim
>> 
>> 
>> On 1/28/2012 7:53 AM, R. Verlangen wrote:
>>> 
>>> Ok, seems that it's clear what I should do next ;-)
>>> 
>>> 2012/1/28 aaron morton 
>>> There are no blockers to upgrading to 1.0.X.
>>> 
>>> A 
>>> -
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 28/01/2012, at 7:48 AM, R. Verlangen wrote:
>>> 
 Ok. Seems that an upgrade might fix these problems. Is Cassandra 1.x.x 
 stable enough to upgrade for, or should we wait for a couple of weeks?
 
 2012/1/27 Edward Capriolo 
 I would not say that issuing a restart every X days is a good idea. You are 
 mostly developing a superstition. You should find the source of the 
 problem. It could be JMX or Thrift clients not closing connections. We 
 don't restart nodes on a regimen; they work fine.
 
 
 On Thursday, January 26, 2012, Mike Panchenko  wrote:
 > There are two relevant bugs (that I know of), both resolved in somewhat 
 > recent versions, which make somewhat   
 > regular restarts beneficial
 > https://issues.apache.org/jira/browse/CASSANDRA-2868 (memory leak in 
 > GCInspector, fixed in 0.7.9/0.8.5)
 > https://issues.apache.org/jira/browse/CASSANDRA-2252 (heap fragmentation 
 > due to the way memtables used to be allocated, refactored in 1.0.0)
 > Restarting daily is probably too frequent for either one of those 
 > problems. We usually notice degraded performance in our ancient cluster 
 > after ~2 weeks w/o a restart.
 > As Aaron mentioned, if you have plenty of disk space, there's no reason 
 > to worry about "cruft" sstables. The size of your active set is what 
 > matters, and you can determine if that's getting too big by watching for 
 > iowait (due to reads from the data partition) and/or paging activity of 
 > the java process. When you hit that problem, the solution is to 1. try 
 > to tune your caches and 2. add more nodes to spread the load. I'll 
 > reiterate - looking at raw disk space usage should not be your guide for 
 > that.
 > "Forcing" a gc generally works, but should not be relied upon (note 
 > "suggest" in 
 > http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#gc()). 
 > It's great news that 1.0 uses a better mechanism for releasing unused 
 > sstables.
 > nodetool compact triggers a "major" compaction and is no longer 
 > recommended by DataStax (details here: 
 > http://www.datastax.com/docs/1.0/operations/tuning#tuning-compaction, 
 > bottom of the page).
 > Hope this helps.
 > Mike.
 > On Wed, Jan 25, 2012 at 5:14 PM, aaron morton  
 > wrote:
 >
 > That disk usage pattern is to be expected in pre-1.0 versions. Disk 
 > usage is far less interesting than disk free space: if it's using 60 GB 
 > and there is 200 GB free, that's OK. If it's using 60 GB and there is 6 MB 
 > free, that's a problem.
 > In pre-1.0, compacted files are deleted on disk by waiting for the 
 > JVM to decide to GC all remaining references. If there is not enough 
 > space (to store the total size of the files it is about to write or 
 > compact) on disk, GC is forced and the files are deleted. Otherwise they 
 > will get deleted at some point in the future. 
 > In 1.0 files are reference counted and space is freed much sooner. 
 > With regard to regular maintenance, nodetool cleanup removes data from a 
 > node that it 

Re: SSTable compaction issue in our system

2012-01-31 Thread aaron morton
There is no way to reverse a compaction. 

You can initiate a user compaction on a single file though, see nodetool (i 
think) or the JMX interface. 
Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/02/2012, at 4:10 AM, Micah Hausler wrote:

> A related question: is there any way to reverse a major compaction without 
> losing performance? Do I just have to wait it out?
> 
> Micah Hausler
> 
> On Jan 30, 2012, at 7:50 PM, Roshan Pradeep wrote:
> 
>> Thanks Aaron for the perfect explanation. Decided to go with automatic 
>> compaction. Thanks again.
>> 
>> On Wed, Jan 25, 2012 at 11:19 AM, aaron morton  
>> wrote:
>> The issue with major / manual compaction is that it creates one file. One 
>> big old file.  
>> 
>> That one file will not be compacted unless there are 
>> (min_compaction_threshold - 1) other files of a similar size. So tombstones 
>> and overwrites in that file may not be purged for a long time. 
>> 
>> If you go down the manual compaction path you need to keep doing it.
>> 
>> If you feel you need to do it, do it; otherwise let automatic compaction do 
>> its thing. 
>> Cheers
>>   
>>   
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 25/01/2012, at 12:47 PM, Roshan wrote:
>> 
>>> Thanks for the reply. Is the major compaction not recommended for Cassandra
>>> 1.0.6?
>>> 
>>> --
>>> View this message in context: 
>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
>>> Nabble.com.
>> 
>> 
> 



Re: WARN [Memtable] live ratio

2012-01-31 Thread aaron morton
The ratio is the ratio of serialised bytes for a memtable to the actual 
JVM-allocated memory. A ratio below 1 would imply the JVM is using fewer 
bytes to store the memtable in memory than it takes to store it on disk 
(without compression). 

The ceiling for the ratio is 64. 

The ratio is calculated periodically so if the workload changes, such as system 
start up, the number will lag behind. I would guess numbers less than 1 mean 
the memtable does not have any data. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/02/2012, at 8:27 AM, Radim Kolar wrote:

> 
>> but a ratio of < 1 may occur
>> for column families with a very high update-to-insert ratio.
> Better to ask why the minimum ratio is 1.0. What harm can be done with a 
> ratio < 1.0?
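A minimal sketch of the definition above, with hypothetical byte counts; the floor of 1.0 and ceiling of 64 are the values described in this thread:

```java
public class LiveRatio {
    // liveRatio = JVM-allocated bytes / serialised bytes, clamped to the
    // [1.0, 64.0] range discussed in this thread.
    static double liveRatio(long heapBytes, long serialisedBytes) {
        return Math.min(64.0, Math.max(1.0, (double) heapBytes / serialisedBytes));
    }

    public static void main(String[] args) {
        // Hypothetical memtable: 10 MB serialised, occupying 25 MB of heap.
        System.out.println(liveRatio(25L << 20, 10L << 20)); // 2.5
        // A raw ratio below 1 is clamped up to the floor of 1.0 ...
        System.out.println(liveRatio(5L << 20, 10L << 20));  // 1.0
        // ... and a huge ratio is clamped down to the ceiling of 64.
        System.out.println(liveRatio(1L << 40, 1L << 20));   // 64.0
    }
}
```

The clamp is why a measured ratio below 1 never survives into the multiplier Cassandra applies, which is consistent with the question above about why the minimum is 1.0.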



Re: Future dated column

2012-01-31 Thread aaron morton
Send a delete with a higher timestamp, reduce the gc_grace_seconds on the CF, 
get the CF to compact (manually or automatically), and then restore 
gc_grace_seconds. 

See the steps I took here to resolve a similar problem 
http://thelastpickle.com/2011/12/15/Anatomy-of-a-Cassandra-Partition/

Hope that helps. 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/02/2012, at 12:07 PM, Hiren Shah wrote:

> Hi,
>  
> I was mystified when I was not able to update a column ccc for a few keys in 
> my test_cf, but was able to update that for all other keys. Then, I noticed 
> that the column for that key is set to be deleted in future (year 6172 !) –
>  
> DEBUG [ReadStage:1191] 2012-01-31 22:21:48,374 SliceQueryFilter.java (line 
> 123) collecting 0 of 100: ccc:true:4@1326231110794000
>  
> The timestamp must be from buggy code putting extra zeros, or from some issue 
> in upgrade from 0.8.7 to 1.0.6 to 1.0.7. It almost seems like Cassandra 
> ‘sees’ a few extra zeroes appended to the timestamp. The value without the 
> three zeroes falls at the right time for the row inserted above and the time 
> of delete below.
>  
> Recent records fetch ok –
> DEBUG [ReadStage:1278] 2012-01-31 23:00:35,145 SliceQueryFilter.java (line 
> 123) collecting 0 of 100: ccc:false:303@1328024823949!31536000
>  
> More important at this point  is to clean this up.
>  
> I tried to overwrite a value for that through CLI and it worked, sort of –
>  
> DEBUG [ReadStage:1099] 2012-01-31 21:15:34,385 SliceQueryFilter.java (line 
> 123) collecting 0 of 100: ccc:false:1@132804322269
>  
> But the timestamp still keeps the extra zeroes. I cannot set it 
> programmatically because my code uses current timestamp.
>  
> I tried to delete the whole record from CLI. The record cannot be queried 
> anymore, but I still see the column (and others for the record) in the log. I 
> did repair, cleanup and compact, but still no luck.
>  
> How can I delete a future dated column? Or overwrite it (without using a 
> timestamp in year 7000!) ?
>  
> I am using cassandra 1.0.7.
>  
> Hiren Shah | R&D Team
> 
>  
> 168 North Clinton Street, Fourth Floor 
> Chicago, Illinois 60661
> o: 312.253.3523 | c: 312.622.4970
>  
>  www.dotomi.com
>  



Re: Encrypting traffic between Hector client and Cassandra server

2012-01-31 Thread aaron morton
There was a recent post about performance that also talked about using Open VPN 
to encrypt traffic from clients to server

http://www.mail-archive.com/user@cassandra.apache.org/msg20058.html


I've not looked at thrift encryption. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/02/2012, at 12:33 PM, Maxim Potekhin wrote:

> Hello,
> 
> do you see any value in having a web service over cassandra, with actual 
> client-clients talking to it via https/ssl?
> This way the cluster can be firewalled and therefore protected, plus you get 
> decent auth/auth right there.
> 
> Maxim
> 
> 
> On 1/31/2012 5:21 PM, Xaero S wrote:
>> 
>> I have been trying to figure out how to secure/encrypt the traffic between 
>> the client (Hector) and the Cassandra Server. I looked at this link 
>> https://issues.apache.org/jira/browse/THRIFT-106 But since thrift sits on a 
>> layer after Hector, i am wondering how i can get Hector to use the right 
>> Thrift calls to have the encryption happen? Also where can i get the 
>> instructions for the any required setup for encrypting the traffic between 
>> the Hector client and the Cassandra Server?
>> 
>> Would appreciate any help in this regard. Below are the setup versions
>> 
>> Cassandra Version - 0.8.7
>> Hector - 0.8.0-2
>> libthrift jar - 0.6.1
>> 
>> 
>> On a side note, we have setup internode encryption on the Cassandra server 
>> side and found the documentation for that easily.
>> 
>> 
>> 
> 



Delete doesn't remove row key?

2012-01-31 Thread Todd Fast

I added a row with a single column to my 1.0.8 single-node cluster:

RowKey: ----
=> (column=test, value=hi, timestamp=...)

I immediately deleted the row using both the CLI and CQL:

del Foo[lexicaluuid('----')];
delete from Foo using consistency all where 
KEY=----


In either case, the column "test" is gone but the empty row key still 
remains, and the row count reflects the presence of this phantom row.


I've tried nodetool compact/repair/flush/cleanup/scrub/etc. and nothing 
removes the row key.


How do I get rid of it?

BTW, I saw this little tidbit in the describe output:

Row cache size / save period in seconds / keys to save : 0.0/0/all

Does "all" here mean to keep the keys for empty rows? If so, how do I 
change that behavior?


ColumnFamily: "Foo"
"..."
  Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
  Default column value validator: 
org.apache.cassandra.db.marshal.UTF8Type

  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  Row cache size / save period in seconds / keys to save : 0.0/0/all
  Row Cache Provider: 
org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider

  Key cache size / save period in seconds: 20.0/14400
  GC grace seconds: 86400
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  Replicate on write: true
  Bloom Filter FP chance: default
  Built indexes: []
  Compaction Strategy: 
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy


Todd


Re: Delete doesn't remove row key?

2012-01-31 Thread Benjamin Hawkes-Lewis
On Wed, Feb 1, 2012 at 12:58 AM, Todd Fast  wrote:
> I added a row with a single column to my 1.0.8 single-node cluster:
>
>    RowKey: ----
>    => (column=test, value=hi, timestamp=...)
>
> I immediately deleted the row using both the CLI and CQL:
>
>    del Foo[lexicaluuid('----')];
>    delete from Foo using consistency all where
> KEY=----
>
> In either case, the column "test" is gone but the empty row key still
> remains, and the row count reflects the presence of this phantom row.
>
> I've tried nodetool compact/repair/flush/cleanup/scrub/etc. and nothing
> removes the row key.

http://wiki.apache.org/cassandra/FAQ#range_ghosts

--
Benjamin Hawkes-Lewis
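Until the tombstones are GCed, the practical workaround implied by the FAQ is to ignore zero-column rows client-side when range scanning or counting. A plain-Java sketch, with a map standing in for whatever row type your client library returns (an assumption for illustration):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeGhostFilter {
    // Keep only keys that still have at least one live column; a deleted
    // row ("range ghost") comes back from a range scan with zero columns
    // until its tombstone is compacted away.
    static List<String> liveKeys(Map<String, Map<String, byte[]>> rows) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Map<String, byte[]>> e : rows.entrySet()) {
            if (!e.getValue().isEmpty()) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Map<String, byte[]>> rows = new LinkedHashMap<>();
        rows.put("live-row", Map.of("test", "hi".getBytes()));
        rows.put("ghost-row", Map.of()); // deleted: key remains, no columns
        System.out.println(liveKeys(rows)); // [live-row]
    }
}
```

The same filter explains why row counts over a range include phantom rows: the server returns the key either way, and the column check has to happen on the client.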


Re: Delete doesn't remove row key?

2012-01-31 Thread Todd Fast
First, thanks! I'd read that before, but didn't associate doing a range 
scan with using the CLI, much less doing "select count(*)" in CQL. Now I 
know what to call the phenomenon.


Second, a followup question: So the row keys will be deleted after 1) 
the GC grace period expires, and 2) I do a compaction?


Third: Assuming the answer is yes, is there any way to manually force GC 
of the deleted keys without doing the full "GC shuffle" (setting the GC 
grace period artificially low, restarting, compacting, setting grace 
period back to normal, restarting)?


Todd

On 1/31/2012 5:03 PM, Benjamin Hawkes-Lewis wrote:

On Wed, Feb 1, 2012 at 12:58 AM, Todd Fast  wrote:

I added a row with a single column to my 1.0.8 single-node cluster:

RowKey: ----
=>  (column=test, value=hi, timestamp=...)

I immediately deleted the row using both the CLI and CQL:

del Foo[lexicaluuid('----')];
delete from Foo using consistency all where
KEY=----

In either case, the column "test" is gone but the empty row key still
remains, and the row count reflects the presence of this phantom row.

I've tried nodetool compact/repair/flush/cleanup/scrub/etc. and nothing
removes the row key.

http://wiki.apache.org/cassandra/FAQ#range_ghosts

--
Benjamin Hawkes-Lewis


Astyanax: A New Cassandra Client.

2012-01-31 Thread Vijay
http://techblog.netflix.com/2012/01/announcing-astyanax.html
*What is Astyanax?*
Astyanax is a Java Cassandra client. It borrows many concepts from Hector
but diverges in the connection pool implementation as well as the client
API. One of the main design considerations was to provide a clean
abstraction between the connection pool and Cassandra API so that each may
be customized and improved separately. Astyanax provides a fluent style API
which guides the caller to narrow the query from key to column as well as
providing queries for more complex use cases that we have encountered. The
operational benefits of Astyanax over Hector include lower latency, reduced
latency variance, and better error handling.

PS: Author CC'ed

Regards,



Re: Astyanax: A New Cassandra Client.

2012-01-31 Thread Vijay
Fixing the CC list

Regards,




On Tue, Jan 31, 2012 at 5:40 PM, Vijay  wrote:

> http://techblog.netflix.com/2012/01/announcing-astyanax.html
> *What is Astyanax?*
> Astyanax is a Java Cassandra client. It borrows many concepts from Hector
> but diverges in the connection pool implementation as well as the client
> API. One of the main design considerations was to provide a clean
> abstraction between the connection pool and Cassandra API so that each may
> be customized and improved separately. Astyanax provides a fluent style API
> which guides the caller to narrow the query from key to column as well as
> providing queries for more complex use cases that we have encountered. The
> operational benefits of Astyanax over Hector include lower latency, reduced
> latency variance, and better error handling.
>
> PS: Author CC'ed
>
> Regards,
> 
>
>


Re: WARN [Memtable] live ratio

2012-01-31 Thread Roshan
Thanks for the explanation.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/WARN-Memtable-live-ratio-tp7238582p7242021.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: WARN [Memtable] live ratio

2012-01-31 Thread Mohit Anchlia
I guess this is not really a WARN in that case.

On Tue, Jan 31, 2012 at 4:29 PM, aaron morton  wrote:
> The ratio is the ratio of serialised bytes for a memtable to actual JVM
> allocated memory. Using a ratio below 1 would imply the JVM is using less
> bytes to store the memtable in memory than it takes to store it on disk
> (without compression).
>
> The ceiling for the ratio is 64.
>
> The ratio is calculated periodically so if the workload changes, such as
> system start up, the number will lag behind. I would guess numbers less than
> 1 mean the memtable does not have any data.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 1/02/2012, at 8:27 AM, Radim Kolar wrote:
>
>
> but a ratio of < 1 may occur
>
> for column families with a very high update-to-insert ratio.
>
> Better to ask why the minimum ratio is 1.0. What harm can be done with a
> ratio < 1.0?
>
>