Re: constant CMS GC using CPU time

2012-10-22 Thread aaron morton
If you are using the default settings I would try to correlate the GC activity 
with some application activity before tweaking.

If this is happening on one machine out of 4 ensure that client load is 
distributed evenly. 

See if the rise in GC activity is related to compaction, repair or an increase 
in throughput. OpsCenter or some other monitoring can help with the last one. 
Your mention of TTL makes me think compaction may be doing a bit of work 
churning through rows. 
  
Some things I've done in the past before looking at heap settings:
* reduce compaction_throughput to reduce the memory churn
* reduce in_memory_compaction_limit 
* if needed reduce concurrent_compactors
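
For reference, those map to cassandra.yaml keys along these lines (1.1 names; the
values below are purely illustrative, not recommendations):

    # cassandra.yaml excerpt
    compaction_throughput_mb_per_sec: 8    # default 16; lower it to reduce memory churn from compaction
    in_memory_compaction_limit_in_mb: 32   # default 64; rows above this size use the slower two-pass compaction
    concurrent_compactors: 2               # lower it if compaction is overwhelming the node
    memtable_total_space_in_mb: 1024       # defaults to 1/3 of the heap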

> Currently it seems like the memory used scales with the amount of bytes 
> stored and not with how busy the server actually is.  That's not such a good 
> thing.
The memtable_total_space_in_mb setting in the yaml tells C* how much memory to devote 
to the memtables. Together with the global row cache setting, that determines how much 
memory will be used for "storing" data, and it will not increase in line with the 
static data load.

Nowadays GC issues are typically due to more dynamic forces, like compaction, 
repair and throughput. 
 
Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/10/2012, at 6:59 AM, Bryan Talbot  wrote:

> ok, let me try asking the question a different way ...
> 
> How does cassandra use memory and how can I plan how much is needed?  I have 
> a 1 GB memtable and 5 GB total heap and that's still not enough even though 
> the number of concurrent connections and garbage generation rate is fairly 
> low.
> 
> If I were using mysql or oracle, I could compute how much memory could be 
> used by N concurrent connections, how much is allocated for caching, temp 
> spaces, etc.  How can I do this for cassandra?  Currently it seems like the 
> memory used scales with the amount of bytes stored and not with how busy the 
> server actually is.  That's not such a good thing.
> 
> -Bryan
> 
> 
> 
> On Thu, Oct 18, 2012 at 11:06 AM, Bryan Talbot  wrote:
> In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11 
> (64-bit), the nodes are often getting "stuck" in state where CMS collections 
> of the old space are constantly running.  
> 
> The JVM configuration is using the standard settings in cassandra-env -- 
> relevant settings are included below.  The max heap is currently set to 5 GB 
> with 800MB for new size.  I don't believe that the cluster is overly busy and 
> seems to be performing well enough other than this issue.  When nodes get 
> into this state they never seem to leave it (by freeing up old space memory) 
> without restarting cassandra.  They typically enter this state while running 
> "nodetool repair -pr" but once they start doing this, restarting them only 
> "fixes" it for a couple of hours.
> 
> Compactions are completing and are generally not queued up.  All CF are using 
> STCS.  The busiest CF consumes about 100GB of space on disk, is write heavy, 
> and all columns have a TTL of 3 days.  Overall, there are 41 CF including 
> those used for system keyspace and secondary indexes.  The number of SSTables 
> per node currently varies from 185-212.
> 
> Other than frequent log warnings about "GCInspector  - Heap is 0.xxx full..." 
> and "StorageService  - Flushing CFS(...) to relieve memory pressure" there 
> are no other log entries to indicate there is a problem.
> 
> Does the memory needed vary depending on the amount of data stored?  If so, 
> how can I predict how much jvm space is needed?  I don't want to make the 
> heap too large as that's bad too.  Maybe there's a memory leak related to 
> compaction that doesn't allow meta-data to be purged?
> 
> 
> -Bryan
> 
> 
> 12 GB of RAM in host with ~6 GB used by java and ~6 GB for OS and buffer 
> cache.
> $> free -m
>  total   used   free sharedbuffers cached
> Mem: 12001  11870131  0  4   5778
> -/+ buffers/cache:   6087   5914
> Swap:0  0  0
> 
> 
> jvm settings in cassandra-env
> MAX_HEAP_SIZE="5G"
> HEAP_NEWSIZE="800M"
> 
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" 
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" 
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled" 
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8" 
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
> 
> 
> jstat shows about 12 full collections per minute with old heap usage 
> constantly over 75% so CMS is always over the CMSInitiatingOccupancyFraction 
> threshold.
> 
> $> jstat -gcutil -t 22917 5000 4
> Timestamp         S0      S1      E      O      P     YGC     YGCT    FGC    FGCT     GCT
> 132063.0  34.70   0.00  2

Re: find smallest counter

2012-10-22 Thread aaron morton
> So the groups are a super column with CategoryId as key, GroupId as 
> superColumnName and then columns for the group members.
If this is a new project, please consider not using Super Columns. They have 
some limitations (see http://wiki.apache.org/cassandra/CassandraLimitations) and are 
often slower than a model that uses only standard CFs. 

> But I can't figure out how to grab just the smallest group.
You cannot do that in a single statement / request. Cassandra does not support 
ordering by column value (outside of secondary indexes) nor does it support 
aggregate operations. 

You will need to iterate the columns and find the smallest value. 
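
Something along these lines with the Thrift API would do it (an untested sketch; the
"GroupSizes" counter CF name and the single unpaged slice are assumptions for
illustration -- adjust to your schema and page through large rows):

    import java.nio.ByteBuffer;
    import java.util.List;
    import org.apache.cassandra.thrift.*;

    // Read every counter column in the category's row and keep the smallest one.
    // Assumes an open, keyspace-bound Cassandra.Client.
    public static ByteBuffer findSmallestGroup(Cassandra.Client client, ByteBuffer categoryId) throws Exception
    {
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(ByteBuffer.allocate(0), ByteBuffer.allocate(0),
                                                false, Integer.MAX_VALUE));

        List<ColumnOrSuperColumn> columns = client.get_slice(categoryId,
                new ColumnParent("GroupSizes"), predicate, ConsistencyLevel.ONE);

        ByteBuffer smallestGroup = null;
        long smallestSize = Long.MAX_VALUE;
        for (ColumnOrSuperColumn cosc : columns)
        {
            CounterColumn counter = cosc.counter_column;  // counter CFs return counter_column
            if (counter != null && counter.value < smallestSize)
            {
                smallestSize = counter.value;
                smallestGroup = counter.name;
            }
        }
        return smallestGroup; // null if the row has no counter columns yet
    }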

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/10/2012, at 2:28 PM, Paul Loy  wrote:

> I have a set of categories. In these categories I want to add groups of 
> users. If a user does not specify the group they want to join in a category, 
> I want to add them to the least subscribed group.
> 
> So the groups are a super column with CategoryId as key, GroupId as 
> superColumnName and then columns for the group members.
> 
> Then I was planning on having some counters so I could keep track of the 
> group sizes. I figured I'd have a counter column for the groups. The key 
> being the CategoryId, then counter columns named by the GroupId. But I can't 
> figure out how to grab just the smallest group.
> 
> Many thanks in advance,
> 
> Paul.
> 
> -- 
> -
> Paul Loy
> p...@keteracel.com
> http://uk.linkedin.com/in/paulloy



Re: Compound primary key: Insert after delete

2012-10-22 Thread aaron morton
How is it not working ?

Can you replicate the problem with the CLI?
Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/10/2012, at 7:17 PM, Vivek Mishra  wrote:

> code attached. Somehow it is not working with 1.1.5.
> 
> -Vivek
> 
> On Mon, Oct 22, 2012 at 5:20 AM, aaron morton  wrote:
> Yes AFAIK. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 20/10/2012, at 12:15 AM, Vivek Mishra  wrote:
> 
>> Hi,
>> Is it possible to reuse same compound primary key after delete? I guess it 
>> works fine for non composite keys.
>> 
>> -Vivek
> 
> 
> 



Re: Compound primary key: Insert after delete

2012-10-22 Thread Vivek Mishra
Well, the last 2 lines of code are deleting 1 record and inserting 2 records:
the first one is "the deleted one" and the second is a new record. Output from the command line:

[default@unknown] use bigdata;
Authenticated to keyspace: bigdata
[default@bigdata] list test1;
Using default limit of 100
Using default column limit of 100
---
RowKey: 2
=> (column=3:address, value=4, timestamp=1350884575938)
---
RowKey: 1

2 Rows Returned.


-Vivek

On Mon, Oct 22, 2012 at 1:01 PM, aaron morton wrote:

> How is it not working ?
>
> Can you replicate the problem withe the CLI ?
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/10/2012, at 7:17 PM, Vivek Mishra  wrote:
>
> code attached. Somehow it is not working with 1.1.5.
>
> -Vivek
>
> On Mon, Oct 22, 2012 at 5:20 AM, aaron morton wrote:
>
>> Yes AFAIK.
>>
>> Cheers
>>
>>   -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 20/10/2012, at 12:15 AM, Vivek Mishra  wrote:
>>
>> Hi,
>> Is it possible to reuse same compound primary key after delete? I guess
>> it works fine for non composite keys.
>>
>> -Vivek
>>
>>
>>
> 
>
>
>


Re: Missing non composite column

2012-10-22 Thread Vivek Mishra
Has anybody in the group run into such issues?

-Vivek

On Fri, Oct 19, 2012 at 3:28 PM, Vivek Mishra  wrote:

> OK, I did assume the same; here is what I have tried to fetch composite
> columns via Thrift and a CQL query as well.
>
> Not sure why the Thrift API is returning the column name as empty!  (Tried with
> Cassandra 1.1.5)
>
> Here is the program:
>
>
> /***
>  * * Copyright 2012 Impetus Infotech.
>  *  *
>  *  * Licensed under the Apache License, Version 2.0 (the "License");
>  *  * you may not use this file except in compliance with the License.
>  *  * You may obtain a copy of the License at
>  *  *
>  *  *  http://www.apache.org/licenses/LICENSE-2.0
>  *  *
>  *  * Unless required by applicable law or agreed to in writing, software
>  *  * distributed under the License is distributed on an "AS IS" BASIS,
>  *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> implied.
>  *  * See the License for the specific language governing permissions and
>  *  * limitations under the License.
>
>  
> **/
> package com.impetus.client.cassandra.thrift;
>
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.Iterator;
> import java.util.LinkedHashMap;
> import java.util.List;
>
> import org.apache.cassandra.db.marshal.AbstractType;
> import org.apache.cassandra.db.marshal.CompositeType;
> import org.apache.cassandra.db.marshal.CompositeType.Builder;
> import org.apache.cassandra.db.marshal.UTF8Type;
> import org.apache.cassandra.locator.SimpleStrategy;
> import org.apache.cassandra.thrift.Cassandra;
> import org.apache.cassandra.thrift.CfDef;
> import org.apache.cassandra.thrift.Column;
> import org.apache.cassandra.thrift.ColumnOrSuperColumn;
> import org.apache.cassandra.thrift.ColumnParent;
> import org.apache.cassandra.thrift.Compression;
> import org.apache.cassandra.thrift.ConsistencyLevel;
> import org.apache.cassandra.thrift.CqlResult;
> import org.apache.cassandra.thrift.CqlRow;
> import org.apache.cassandra.thrift.KsDef;
> import org.apache.cassandra.thrift.SlicePredicate;
> import org.apache.cassandra.thrift.SliceRange;
> import org.apache.cassandra.thrift.TBinaryProtocol;
> import org.apache.thrift.transport.TFramedTransport;
> import org.apache.thrift.transport.TSocket;
> import org.scale7.cassandra.pelops.Bytes;
>
> /**
>  * @author vivek.mishra
>  *
>  */
> public class CompositeTypeRunner
> {
> public static void main(String[] args) throws Exception {
>
> TSocket socket = new TSocket("localhost", 9160);
> TFramedTransport transport = new TFramedTransport(socket);
>
> Cassandra.Client client = new Cassandra.Client(new
> TBinaryProtocol(transport));
> transport.open();
> client.set_cql_version("3.0.0");
> List<CfDef> cfDefs = new ArrayList<CfDef>();
>
> /*  CfDef cfDef = new CfDef();
> cfDef.setName("test");
> cfDef.keyspace = "bigdata";
> cfDef.setComparator_type("UTF8Type");
> cfDef.setDefault_validation_class("UTF8Type");
> //cfDef.setKey_validation_class("UTF8Type");
>
> cfDefs.add(cfDef);*/
>
>
>KsDef ksDef = new KsDef("bigdata", SimpleStrategy.class.getName(),
> cfDefs);
>
> if (ksDef.strategy_options == null)
> {
> ksDef.strategy_options = new LinkedHashMap<String, String>();
> }
>
> ksDef.strategy_options.put("replication_factor", "1");
> client.system_add_keyspace(ksDef);
> client.set_keyspace("bigdata");
>
> String cql_Query = "create columnfamily test1 (name text, age
> text, address text, PRIMARY KEY(name,age))";
>
> client.execute_cql_query(ByteBuffer.wrap(("USE
> bigdata").getBytes("UTF-8")) , Compression.NONE);
>
>
> client.execute_cql_query(ByteBuffer.wrap((cql_Query).getBytes("UTF-8")) ,
> Compression.NONE);
>
> /*ColumnParent parent = new ColumnParent("test1");
>
> List<AbstractType<?>> keyTypes = new ArrayList<AbstractType<?>>();
> keyTypes.add(UTF8Type.instance);
> keyTypes.add(UTF8Type.instance);
> CompositeType compositeKey = CompositeType.getInstance(keyTypes);
>
> Builder builder = new Builder(compositeKey);
> builder.add(ByteBuffer.wrap("1".getBytes()));
> builder.add(ByteBuffer.wrap("2".getBytes()));
> ByteBuffer rowid = builder.build();
>
> Column column = new Column();
> column.setName("value".getBytes());
> column.setValue("aaa".getBytes());
> column.setTimestamp(System.currentTimeMillis());
>
> client.insert(rowid, parent, column, ConsistencyLevel.ONE);*/
>
> ColumnParent parent = new ColumnParent("test1");
>
> List<AbstractType<?>> keyTypes = new ArrayList<AbstractType<?>>();
> keyTypes.add(UTF8Type.instance);
> keyTypes.add(UTF8Type.instance);
> CompositeType compositeKey = CompositeType.getInstance(keyTypes);
>
> Builder builder

Re: Fw: Fwd: Compound primary key: Insert after delete

2012-10-22 Thread Jonathan Ellis
Mixing the two isn't really recommended because of just this kind of
difficulty, but if you must, I would develop against 1.2 since it will
actually validate that the CT encoding you've done manually is valid.
1.1 will just fail silently.

On Mon, Oct 22, 2012 at 6:57 AM, Vivek Mishra  wrote:
> Hi,
>
> I am building support for Composite/Compound keys in Kundera and am currently
> running into a number of problems in my POC to access it via Thrift.
>
> I am planning to use thrift API for insert/update/delete and for query i
> will go by CQL way.
>
>
> Issues:
> CompositeTypeRunner.java (see attached): Simple program to perform CRUD, it
> is not inserting against the deleted row key and also thrift API is
> returning column name as "Empty" string.
>
> OtherCompositeTypeRunner.java (see attached): Program to demonstrate issue
> with compound primary key as boolean. Column family creation via CQL is
> working fine, But insert via thrift is giving issue with "Unconfigured
> column family" though it is there!
>
> This is what i have tried with cassandra 1.1.6 as well.
>
> Please have a look and share if I am doing anything wrong. I did ask the same
> on the user group but no luck.
>
>
> -Vivek
>
>
>
>
> - Forwarded Message -
> From: Vivek Mishra 
> To: vivek.mis...@yahoo.com
> Sent: Monday, October 22, 2012 5:17 PM
> Subject: Fwd: Compound primary key: Insert after delete
>
>
>
> -- Forwarded message --
> From: Vivek Mishra 
> Date: Mon, Oct 22, 2012 at 1:08 PM
> Subject: Re: Compound primary key: Insert after delete
> To: user@cassandra.apache.org
>
>
> Well. Last 2 lines of code are deleting 1 record and inserting 2 records,
> first one is "the deleted one" and  a new record. Output from command line:
>
> [default@unknown] use bigdata;
> Authenticated to keyspace: bigdata
> [default@bigdata] list test1;
> Using default limit of 100
> Using default column limit of 100
> ---
> RowKey: 2
> => (column=3:address, value=4, timestamp=1350884575938)
> ---
> RowKey: 1
>
> 2 Rows Returned.
>
>
> -Vivek
>
> On Mon, Oct 22, 2012 at 1:01 PM, aaron morton 
> wrote:
>
> How is it not working ?
>
> Can you replicate the problem withe the CLI ?
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/10/2012, at 7:17 PM, Vivek Mishra  wrote:
>
> code attached. Somehow it is not working with 1.1.5.
>
> -Vivek
>
> On Mon, Oct 22, 2012 at 5:20 AM, aaron morton 
> wrote:
>
> Yes AFAIK.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/10/2012, at 12:15 AM, Vivek Mishra  wrote:
>
> Hi,
> Is it possible to reuse same compound primary key after delete? I guess it
> works fine for non composite keys.
>
> -Vivek
>
>
>
> 
>
>
>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Hinted Handoff runs every ten minutes

2012-10-22 Thread Tamar Fraenkel
Hi!
I am having the same issue on 1.0.8.
I checked the number of SSTables: two nodes have 1 each and one node has
none.
Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Oct 22, 2012 at 1:44 AM, aaron morton wrote:

> I *think* this may be ghost rows which have not been compacted.
>
> How many SSTables are on disk for the HintedHandoff CF ?
>
> Cheers
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/10/2012, at 7:16 AM, David Daeschler 
> wrote:
>
> Hi Steve,
>
> Also confirming this. After having a node go down on Cassandra 1.0.8
> there seems to be hinted handoff between two of our 4 nodes every 10
> minutes. Our setup also shows 0 rows. It does not appear to have any
> effect on the operation of the ring, just fills up the log files.
>
> - David
>
>
>
> On Thu, Oct 18, 2012 at 2:10 PM, Stephen Pierce 
> wrote:
>
> I installed Cassandra on three nodes. I then ran a test suite against them
> to generate load. The test suite is designed to generate the same type of
> load that we plan to have in production. As one of many tests, I reset one
> of the nodes to check the failure/recovery modes.  Cassandra worked just
> fine.
>
>
>
> I stopped the load generation, and got distracted with some other
> project/problem. A few days later, I noticed something strange on one of
> the
> nodes. On this node hinted handoff starts every ten minutes, and while it
> seems to finish without any errors, it will be started again in ten
> minutes.
> None of the nodes has any traffic, and hasn’t for several days. I checked
> the logs, and this goes back to the initial failure/recovery testing:
>
>
>
> INFO [HintedHandoff:1] 2012-10-18 10:19:26,618 HintedHandOffManager.java
> (line 294) Started hinted handoff for token:
> 113427455640312821154458202477256070484 with IP: /192.168.128.136
>
> INFO [HintedHandoff:1] 2012-10-18 10:19:26,779 HintedHandOffManager.java
> (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
>
> INFO [HintedHandoff:1] 2012-10-18 10:29:26,622 HintedHandOffManager.java
> (line 294) Started hinted handoff for token:
> 113427455640312821154458202477256070484 with IP: /192.168.128.136
>
> INFO [HintedHandoff:1] 2012-10-18 10:29:26,735 HintedHandOffManager.java
> (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
>
> INFO [HintedHandoff:1] 2012-10-18 10:39:26,624 HintedHandOffManager.java
> (line 294) Started hinted handoff for token:
> 113427455640312821154458202477256070484 with IP: /192.168.128.136
>
> INFO [HintedHandoff:1] 2012-10-18 10:39:26,751 HintedHandOffManager.java
> (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
>
>
>
> The other nodes are happy and don’t show this behavior. All the test data
> is
> readable, and everything is fine, but I’m curious why hinted handoff is
> running on one node all the time.
>
>
>
> I searched the bug database, and I found a bug that seems to have the same
> symptoms:
>
> https://issues.apache.org/jira/browse/CASSANDRA-3733
>
> Although it’s been marked fixed in 0.6, this describes my problem exactly.
>
>
>
> I’m running Cassandra 1.1.5 from Datastax on Centos 6.0:
>
>
> http://rpm.datastax.com/community/noarch/apache-cassandra11-1.1.5-1.noarch.rpm
>
>
>
> Is anyone else seeing this behavior? What can I do to provide more
> information?
>
>
>
> Steve
>
>
>
>
>
> --
> David Daeschler
>
>
>

RE: DELETE query failing in CQL 3.0

2012-10-22 Thread Ryabin, Thomas
I figured out the problem. The DELETE query only works if the column used in 
the WHERE clause is also the first column used to define the PRIMARY KEY.
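
For example, with a (made-up) definition like this, the delete is accepted because
title is the leading PRIMARY KEY column, i.e. the partition key:

    CREATE TABLE books (
        title text,
        author text,
        price int,
        PRIMARY KEY (title, author)
    );

    -- deletes the whole row/partition for that title
    DELETE FROM books WHERE title = 'hatchet';

With title as a non-leading part of the key, the same DELETE is rejected, which
matches the "PRIMARY KEY part title found in SET part" error below.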

-Thomas

From: wang liang [mailto:wla...@gmail.com]
Sent: Monday, October 22, 2012 1:31 AM
To: user@cassandra.apache.org
Subject: Re: DELETE query failing in CQL 3.0

It is better to provide the table definition. I guess the reason is the statement below.
" a table must define at least one column that is not part of the PRIMARY KEY 
as a row exists in Cassandra only if it contains at least one value for one 
such column "
Please check this document 
here.

On Mon, Oct 22, 2012 at 7:53 AM, aaron morton 
mailto:aa...@thelastpickle.com>> wrote:
Can you paste the table definition ?

Thanks

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/10/2012, at 5:53 AM, "Ryabin, Thomas" 
mailto:thomas.rya...@mckesson.com>> wrote:


I have a column family called "books", and am trying to delete all rows where 
the "title" column is equal to "hatchet". This is the query I am using:
DELETE FROM books WHERE title = 'hatchet';

This query is failing with this error:
Bad Request: PRIMARY KEY part title found in SET part

I am using Cassandra 1.1 and CQL 3.0. What could be the problem?

-Thomas




--
Best wishes,
Helping others is to help myself.


Row caching memory usage in Cassandra 1.0.x

2012-10-22 Thread Josh
Hi, I'm hoping to get some help on how to tune our 1.0.x cluster w.r.t. row 
caching.

We're using the netflix priam client, so unfortunately upgrading to 1.1.x is 
out 
of the question for now.. but until we find a way around that, is there any way 
to help determine where the 'sweet spot' is between heap size, row cache size, 
and leaving the rest of the ram available to the OS?

We're using the oracle jvm with jna so we can do the off-heap row caching, but 
I'm not sure how to tell how much ram it's using, thus I'm not comfortable 
increasing it further. (currently we have it set to 100,000 rows and we're 
already seeing ~85% hit rates, so we've stopped upping it further for now).

Thanks for any advice,

-Josh





Re: Fw: Fwd: Compound primary key: Insert after delete

2012-10-22 Thread Vivek Mishra
Thanks. But it means I may have to re-write all the stuff the CQL way.
Considering CQL as the future interface for Cassandra, for now I will
implement it without mixing them.

-Vivek

On Mon, Oct 22, 2012 at 6:32 PM, Jonathan Ellis  wrote:

> Mixing the two isn't really recommended because of just this kind of
> difficulty, but if you must, I would develop against 1.2 since it will
> actually validate that the CT encoding you've done manually is valid.
> 1.1 will just fail silently.
>
> On Mon, Oct 22, 2012 at 6:57 AM, Vivek Mishra 
> wrote:
> > Hi,
> >
> > I am building support for Composite/Compund keys in Kundera and currently
> > getting into number of problems for my POC to access it via Thrift.
> >
> > I am planning to use thrift API for insert/update/delete and for query i
> > will go by CQL way.
> >
> >
> > Issues:
> > CompositeTypeRunner.java (see attached): Simple program to perform CRUD,
> it
> > is not inserting against the deleted row key and also thrift API is
> > returning column name as "Empty" string.
> >
> > OtherCompositeTypeRunner.java (see attached): Program to demonstrate
> issue
> > with compound primary key as boolean. Column family creation via CQL is
> > working fine, But insert via thrift is giving issue with "Unconfigured
> > column family" though it is there!
> >
> > This is what i have tried with cassandra 1.1.6 as well.
> >
> > Please have a look and share, if i am doing anything wrong?   i did ask
> same
> > on user group but no luck.
> >
> >
> > -Vivek
> >
> >
> >
> >
> > - Forwarded Message -
> > From: Vivek Mishra 
> > To: vivek.mis...@yahoo.com
> > Sent: Monday, October 22, 2012 5:17 PM
> > Subject: Fwd: Compound primary key: Insert after delete
> >
> >
> >
> > -- Forwarded message --
> > From: Vivek Mishra 
> > Date: Mon, Oct 22, 2012 at 1:08 PM
> > Subject: Re: Compound primary key: Insert after delete
> > To: user@cassandra.apache.org
> >
> >
> > Well. Last 2 lines of code are deleting 1 record and inserting 2 records,
> > first one is "the deleted one" and  a new record. Output from command
> line:
> >
> > [default@unknown] use bigdata;
> > Authenticated to keyspace: bigdata
> > [default@bigdata] list test1;
> > Using default limit of 100
> > Using default column limit of 100
> > ---
> > RowKey: 2
> > => (column=3:address, value=4, timestamp=1350884575938)
> > ---
> > RowKey: 1
> >
> > 2 Rows Returned.
> >
> >
> > -Vivek
> >
> > On Mon, Oct 22, 2012 at 1:01 PM, aaron morton 
> > wrote:
> >
> > How is it not working ?
> >
> > Can you replicate the problem withe the CLI ?
> > Cheers
> >
> > -
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 22/10/2012, at 7:17 PM, Vivek Mishra  wrote:
> >
> > code attached. Somehow it is not working with 1.1.5.
> >
> > -Vivek
> >
> > On Mon, Oct 22, 2012 at 5:20 AM, aaron morton 
> > wrote:
> >
> > Yes AFAIK.
> >
> > Cheers
> >
> > -
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 20/10/2012, at 12:15 AM, Vivek Mishra  wrote:
> >
> > Hi,
> > Is it possible to reuse same compound primary key after delete? I guess
> it
> > works fine for non composite keys.
> >
> > -Vivek
> >
> >
> >
> > 
> >
> >
> >
> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: constant CMS GC using CPU time

2012-10-22 Thread Bryan Talbot
The memory usage was correlated with the size of the data set.  The nodes
were a bit unbalanced which is normal due to variations in compactions.
 The nodes with the most data used the most memory.  All nodes are affected
eventually, not just one.  The GC was on-going even when the nodes were not
compacting or running a heavy application load -- even when the main app
was paused, the constant GC continued.

As a test we dropped the largest CF and the memory
usage immediately dropped to acceptable levels and the constant GC stopped.
 So it's definitely related to data load.  memtable size is 1 GB, row cache
is disabled and key cache is small (default).

I believe one culprit turned out to be the bloom filters.  They were 2+ GB
(as reported by nodetool cfstats anyway).  It looks like the default
bloom_filter_fp_chance defaults to 0.0 even though guides recommend 0.10 as
the minimum value.  Raising that to 0.20 for some write-mostly CF reduced
memory used by 1GB or so.
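
For anyone following along, a change like that can be made per CF in cassandra-cli,
e.g. (placeholder CF name; existing SSTables keep their old filters until they are
rewritten by compaction, scrub or upgradesstables):

    update column family events with bloom_filter_fp_chance = 0.2;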

Is there any way to predict how much memory the bloom filters will consume
if the size of the row keys, number or rows is known, and fp chance is
known?

-Bryan



On Mon, Oct 22, 2012 at 12:25 AM, aaron morton wrote:

> If you are using the default settings I would try to correlate the GC
> activity with some application activity before tweaking.
>
> If this is happening on one machine out of 4 ensure that client load is
> distributed evenly.
>
> See if the raise in GC activity us related to Compaction, repair or an
> increase in throughput. OpsCentre or some other monitoring can help with
> the last one. Your mention of TTL makes me think compaction may be doing a
> bit of work churning through rows.
>
> Some things I've done in the past before looking at heap settings:
> * reduce compaction_throughput to reduce the memory churn
> * reduce in_memory_compaction_limit
> * if needed reduce concurrent_compactors
>
> Currently it seems like the memory used scales with the amount of bytes
> stored and not with how busy the server actually is.  That's not such a
> good thing.
>
> The memtable_total_space_in_mb in yaml tells C* how much memory to devote
> to the memtables. That with the global row cache setting says how much
> memory will be used with regard to "storing" data and it will not increase
> inline with the static data load.
>
> Now days GC issues are typically due to more dynamic forces, like
> compaction, repair and throughput.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/10/2012, at 6:59 AM, Bryan Talbot  wrote:
>
> ok, let me try asking the question a different way ...
>
> How does cassandra use memory and how can I plan how much is needed?  I
> have a 1 GB memtable and 5 GB total heap and that's still not enough even
> though the number of concurrent connections and garbage generation rate is
> fairly low.
>
> If I were using mysql or oracle, I could compute how much memory could be
> used by N concurrent connections, how much is allocated for caching, temp
> spaces, etc.  How can I do this for cassandra?  Currently it seems like the
> memory used scales with the amount of bytes stored and not with how busy
> the server actually is.  That's not such a good thing.
>
> -Bryan
>
>
>
> On Thu, Oct 18, 2012 at 11:06 AM, Bryan Talbot wrote:
>
>> In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11
>> (64-bit), the nodes are often getting "stuck" in state where CMS
>> collections of the old space are constantly running.
>>
>> The JVM configuration is using the standard settings in cassandra-env --
>> relevant settings are included below.  The max heap is currently set to 5
>> GB with 800MB for new size.  I don't believe that the cluster is overly
>> busy and seems to be performing well enough other than this issue.  When
>> nodes get into this state they never seem to leave it (by freeing up old
>> space memory) without restarting cassandra.  They typically enter this
>> state while running "nodetool repair -pr" but once they start doing this,
>> restarting them only "fixes" it for a couple of hours.
>>
>> Compactions are completing and are generally not queued up.  All CF are
>> using STCS.  The busiest CF consumes about 100GB of space on disk, is write
>> heavy, and all columns have a TTL of 3 days.  Overall, there are 41 CF
>> including those used for system keyspace and secondary indexes.  The number
>> of SSTables per node currently varies from 185-212.
>>
>> Other than frequent log warnings about "GCInspector  - Heap is 0.xxx
>> full..." and "StorageService  - Flushing CFS(...) to relieve memory
>> pressure" there are no other log entries to indicate there is a problem.
>>
>> Does the memory needed vary depending on the amount of data stored?  If
>> so, how can I predict how much jvm space is needed?  I don't want to make
>> the heap too large as that's bad too.  Maybe there's a memory leak related
>> to compaction that doesn't al

Re: What does ReadRepair exactly do?

2012-10-22 Thread Manu Zhang
Is it through filter.collateColumns(resolved, iters, Integer.MIN_VALUE) and
then MergeIterator.get(toCollate, fcomp, reducer)? I don't know what
happens thereafter. How exactly does reconcile() get called?

On Mon, Oct 22, 2012 at 6:49 AM, aaron morton wrote:

> There are two processes in cassandra that trigger Read Repair like
> behaviour.
>
> During a read, a DigestMismatchException is raised if the responses from the
> replicas do not match. In this case another read is run that involves
> reading all the data. This is the CL-level agreement kicking in.
>
> The other "Read Repair" is the one controlled by "read_repair_chance".
> When RR is active on a request, ALL up replicas are involved in the read.
> When RR is not active, only CL replicas are involved. The test for CL
> agreement occurs synchronously with the request; the RR check
> waits, asynchronously to the request, for all nodes in the request to return.
> It then checks for consistency and repairs differences.
>
> From looking at the source code, I do not understand how this set is built
> and I do not understand how the reconciliation is executed.
>
> When a DigestMismatch is detected a read is run using RepairCallback. The
> callback will call the RowRepairResolver.resolve() when enough responses
> have been collected.
>
> resolveSuperset() picks one response to the baseline, and then calls
> delete() to apply row level deletes from the other responses
> (ColumnFamily's). It collects the other CF's into an iterator with a filter
> that returns all columns. The columns are then applied to the baseline CF
> which may result in reconcile() being called.
>
> reconcile() is used when a AbstractColumnContainer has two versions of a
> column and it wants to only have one.
>
> RowRepairResolve.scheduleRepairs() works out the delta for each node by
> calling ColumnFamily.diff(). The delta is then sent to the appropriate node.
>
>
> Hope that helps.
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/10/2012, at 6:33 AM, Markus Klems  wrote:
>
> Hi guys,
>
> I am looking through the Cassandra source code in the github trunk to
> better understand how Cassandra's fault-tolerance mechanisms work. Most
> things make sense. I am also aware of the wiki and DataStax documentation.
> However, I do not understand what read repair does in detail. The method
> RowRepairResolver.resolveSuperset(Iterable versions) seems to
> do the trick of merging conflicting versions of column family replicas and
> builds the set of columns that need to be "repaired". From looking at the
> source code, I do not understand how this set is built and I do not
> understand how the reconciliation is executed. ReadRepair does not seem to
> trigger a Column.reconcile() to reconcile conflicting column versions on
> different servers. Does it?
>
> If this is not what read repair does, then: What kind of inconsistencies
> are resolved by read repair? And: How are the inconsistencies resolved?
>
> Could someone give me a hint?
>
> Thanks so much,
>
> -Markus
>
>
>


tombstones and their data

2012-10-22 Thread B. Todd Burruss
if a node, X, has a tombstone marking deleted data, when can node X
remove the data - not the tombstone, but the data?  i understand the
tombstone cannot be removed until GCGraceSeconds has passed, but it
seems the data could be compacted away at any time.


Re: tombstones and their data

2012-10-22 Thread Hiller, Dean
My understanding is any time, for that node.  Another node may have a
different existing value and tombstone vs. that existing data (most recent
timestamp wins).  I.e. the data is not needed on that node, so compaction
should be getting rid of it, but I never confirmed this... I hope you get
confirmation.

Dean

On 10/22/12 10:43 AM, "B. Todd Burruss"  wrote:

>if a node, X, has a tombstone marking deleted data, when can node X
>remove the data - not the tombstone, but the data?  i understand the
>tombstone cannot be removed until GCGraceSeconds has passed, but it
>seems the data could be compacted away at any time.



Re: tombstones and their data

2012-10-22 Thread Sylvain Lebresne
The data does get removed as soon as possible (as soon as it is
compacted with the tombstone that is).

--
Sylvain

On Mon, Oct 22, 2012 at 7:03 PM, Hiller, Dean  wrote:
> My understanding is any time from that node.  Another node may have a
> different existing value and tombstone vs. that existing data(most recent
> timestamp wins).  Ie. The data is not needed on that node so compaction
> should be getting rid of it, but I never confirmed thisŠ.I hope you get
> confirmation.
>
> Dean
>
> On 10/22/12 10:43 AM, "B. Todd Burruss"  wrote:
>
>>if a node, X, has a tombstone marking deleted data, when can node X
>>remove the data - not the tombstone, but the data?  i understand the
>>tombstone cannot be removed until GCGraceSeconds has passed, but it
>>seems the data could be compacted away at any time.
>


Re: tombstones and their data

2012-10-22 Thread B. Todd Burruss
excellent, thx

On Mon, Oct 22, 2012 at 10:13 AM, Sylvain Lebresne  wrote:
> The data does get removed as soon as possible (as soon as it is
> compacted with the tombstone that is).
>
> --
> Sylvain
>
> On Mon, Oct 22, 2012 at 7:03 PM, Hiller, Dean  wrote:
>> My understanding is any time from that node.  Another node may have a
>> different existing value and tombstone vs. that existing data(most recent
>> timestamp wins).  Ie. The data is not needed on that node so compaction
>> should be getting rid of it, but I never confirmed thisŠ.I hope you get
>> confirmation.
>>
>> Dean
>>
>> On 10/22/12 10:43 AM, "B. Todd Burruss"  wrote:
>>
>>>if a node, X, has a tombstone marking deleted data, when can node X
>>>remove the data - not the tombstone, but the data?  i understand the
>>>tombstone cannot be removed until GCGraceSeconds has passed, but it
>>>seems the data could be compacted away at any time.
>>


tuning for read performance

2012-10-22 Thread feedly team
Hi,
I have a small 2 node cassandra cluster that seems to be constrained by
read throughput. There are about 100 writes/s and 60 reads/s mostly against
a skinny column family. Here's the cfstats for that family:

 SSTable count: 13
 Space used (live): 231920026568
 Space used (total): 231920026568
 Number of Keys (estimate): 356899200
 Memtable Columns Count: 1385568
 Memtable Data Size: 359155691
 Memtable Switch Count: 26
 Read Count: 40705879
 Read Latency: 25.010 ms.
 Write Count: 9680958
 Write Latency: 0.036 ms.
 Pending Tasks: 0
 Bloom Filter False Postives: 28380
 Bloom Filter False Ratio: 0.00360
 Bloom Filter Space Used: 874173664
 Compacted row minimum size: 61
 Compacted row maximum size: 152321
 Compacted row mean size: 1445

iostat shows almost no write activity, here's a typical line:

Device:  rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz   await  svctm  %util
sdb        0.00    0.00  312.87   0.00    6.61    0.00     43.27     23.35  105.06   2.28  71.19

and nodetool tpstats always shows pending tasks in the ReadStage. The data
set has grown beyond physical memory (250GB/node w/64GB of RAM) so I know
disk access is required, but are there particular settings I should
experiment with that could help relieve some read i/o pressure? I already
put memcached in front of cassandra so the row cache probably won't help
much.

Also this column family stores smallish documents (usually 1-100K) along
with metadata. The document is only occasionally accessed, usually only the
metadata is read/written. Would splitting out the document into a separate
column family help?

Thanks
Kireet


nodetool cleanup

2012-10-22 Thread B. Todd Burruss
does "nodetool cleanup" perform a major compaction in the process of
removing unwanted data?

i seem to remember this to be the case, but can't find anything definitive


Re: tuning for read performance

2012-10-22 Thread Aaron Turner
On Mon, Oct 22, 2012 at 11:05 AM, feedly team  wrote:
> Hi,
> I have a small 2 node cassandra cluster that seems to be constrained by
> read throughput. There are about 100 writes/s and 60 reads/s mostly against
> a skinny column family. Here's the cfstats for that family:
>
>  SSTable count: 13
>  Space used (live): 231920026568
>  Space used (total): 231920026568
>  Number of Keys (estimate): 356899200
>  Memtable Columns Count: 1385568
>  Memtable Data Size: 359155691
>  Memtable Switch Count: 26
>  Read Count: 40705879
>  Read Latency: 25.010 ms.
>  Write Count: 9680958
>  Write Latency: 0.036 ms.
>  Pending Tasks: 0
>  Bloom Filter False Postives: 28380
>  Bloom Filter False Ratio: 0.00360
>  Bloom Filter Space Used: 874173664
>  Compacted row minimum size: 61
>  Compacted row maximum size: 152321
>  Compacted row mean size: 1445
>
> iostat shows almost no write activity, here's a typical line:
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await  svctm  %util
> sdb   0.00 0.00  312.870.00 6.61 0.0043.27
> 23.35  105.06   2.28  71.19
>
> and nodetool tpstats always shows pending tasks in the ReadStage. The data
> set has grown beyond physical memory (250GB/node w/64GB of RAM) so I know
> disk access is required, but are there particular settings I should
> experiment with that could help relieve some read i/o pressure? I already
> put memcached in front of cassandra so the row cache probably won't help
> much.
>
> Also this column family stores smallish documents (usually 1-100K) along
> with metadata. The document is only occasionally accessed, usually only the
> metadata is read/written. Would splitting out the document into a separate
> column family help?
>

Some un-expert advice:

1. Consider Leveled compaction instead of Size Tiered.  LCS improves
read performance at the cost of more writes.

2. You said "skinny column family" which I took to mean not a lot of
columns/row.  See if you can organize your data into wider rows which
allow reading fewer rows and thus fewer queries/disk seeks.

3. Enable compression if you haven't already.

4. Splitting your data from your MetaData could definitely help.  I
like separating my read heavy from write heavy CF's because generally
speaking they benefit from different compaction methods.  But don't go
crazy creating 1000's of CF's either.
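
For points 1 and 3, the per-CF changes look something like this in cassandra-cli
(the CF name and values are placeholders; check "help update column family;" on
your version):

    update column family documents
      with compaction_strategy = 'LeveledCompactionStrategy'
      and compaction_strategy_options = {sstable_size_in_mb: 10}
      and compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};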

Hope that gives you some ideas to investigate further!


-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: DELETE query failing in CQL 3.0

2012-10-22 Thread Tyler Hobbs
For what it's worth, Cassandra 1.2 will support deleting a slice of
columns, allowing you to specify the first N components of the primary key
in a WHERE clause for a DELETE statement:
https://issues.apache.org/jira/browse/CASSANDRA-3708

On Mon, Oct 22, 2012 at 8:45 AM, Ryabin, Thomas
wrote:

> I figured out the problem. The DELETE query only works if the column
> used in the WHERE clause is also the first column used to define the
> PRIMARY KEY.
>
> -Thomas
>
> *From:* wang liang [mailto:wla...@gmail.com]
> *Sent:* Monday, October 22, 2012 1:31 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: DELETE query failing in CQL 3.0
>
> It is better to provide the table definition. I guess the reason is the
> statement below:
>
> " a table must define at least one column that is not part of the PRIMARY
> KEY as a row exists in Cassandra only if it contains at least one value for
> one such column "
>
> Please check this document here.
>
> On Mon, Oct 22, 2012 at 7:53 AM, aaron morton wrote:
>
> Can you paste the table definition ?
>
> Thanks
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/10/2012, at 5:53 AM, "Ryabin, Thomas" wrote:
>
> I have a column family called "books", and am trying to delete all rows
> where the "title" column is equal to "hatchet". This is the query I am
> using:
>
> DELETE FROM books WHERE title = 'hatchet';
>
> This query is failing with this error:
>
> Bad Request: PRIMARY KEY part title found in SET part
>
> I am using Cassandra 1.1 and CQL 3.0. What could be the problem?
>
> -Thomas
>
> --
> Best wishes,
> Helping others is to help myself.
>



-- 
Tyler Hobbs
DataStax 


Re: Log client

2012-10-22 Thread aaron morton
AFAIK the client IP is not logged.

If you want to check current connections, try lsof.
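
For example (assuming the default Thrift rpc_port of 9160):

    lsof -nP -iTCP:9160 -sTCP:ESTABLISHED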

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/10/2012, at 9:34 PM, Jean Paul Adant  wrote:

> Hi all,
> 
> How can I log on the server the IP of every connection to the server? I had an 
> old application making requests to the server and I spent a lot of time finding 
> it. Is there any easy way to log the IP of clients sending requests?
> 
> Thanks
> 
> Jean Paul
> 
> -- 
> -
> Jean Paul Adant - Créative-Ingénierie
> jean.paul.ad...@gmail.com
> 
> 
> 



Re: Row caching memory usage in Cassandra 1.0.x

2012-10-22 Thread aaron morton
I'm not aware of how to track the memory usage for the off-heap row cache in 
1.0. The memory may show up in something like JConsole. What about seeing how 
much OS memory is allocated to buffers and working backwards from there?

Anyone else ? 

(One thing to be aware of is that each CF has its own row cache, so tuning must be 
done per CF.)

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/10/2012, at 3:35 AM, Josh  wrote:

> Hi, I'm hoping to get some help on how to help tune our 1.0.x cluster w.r.t. 
> row 
> caching.
> 
> We're using the netflix priam client, so unfortunately upgrading to 1.1.x is 
> out 
> of the question for now.. but until we find a way around that, is there any 
> way 
> to help determine where the 'sweet spot' is between heap size, row cache 
> size, 
> and leaving the rest of the ram available to the OS?
> 
> We're using the oracle jvm with jna so we can do the off-heap row caching, 
> but 
> I'm not sure how to tell how much ram it's using, thus I'm not comfortable 
> increasing it further. (currently we have it set to 100,000 rows and we're 
> already seeing ~85% hit rates, so we've stopped upping it further for now).
> 
> Thanks for any advice,
> 
> -Josh
> 
> 
> 



Re: constant CMS GC using CPU time

2012-10-22 Thread aaron morton
> The GC was on-going even when the nodes were not compacting or running a 
> heavy application load -- even when the main app was paused constant the GC 
> continued.
If you restart a node, is the onset of GC activity correlated with some event?
 
> As a test we dropped the largest CF and the memory usage immediately dropped 
> to acceptable levels and the constant GC stopped.  So it's definitely related 
> to data load.  memtable size is 1 GB, row cache is disabled and key cache is 
> small (default).
How many keys did the CF have per node? 
I dismissed the memory used to  hold bloom filters and index sampling. That 
memory is not considered part of the memtable size, and will end up in the 
tenured heap. It is generally only a problem with very large key counts per 
node. 

>  They were 2+ GB (as reported by nodetool cfstats anyway).  It looks like the 
> default bloom_filter_fp_chance defaults to 0.0 
The default should be 0.000744.

If the chance is zero or null this code should run when a new SSTable is written:

  // paranoia -- we've had bugs in the thrift <-> avro <-> CfDef dance before, let's not let that break things
  logger.error("Bloom filter FP chance of zero isn't supposed to happen");

Were the CFs migrated from an old version?

> Is there any way to predict how much memory the bloom filters will consume if 
> the size of the row keys, number or rows is known, and fp chance is known?

See o.a.c.utils.BloomFilter.getFilter() in the code.
This calculator, http://hur.st/bloomfilter, appears to give similar results. 
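
As a rough back-of-the-envelope (the standard Bloom filter formula, not the exact
Cassandra implementation, so expect the real numbers to differ somewhat):

    // bits per key ~= -ln(fp_chance) / (ln 2)^2; one filter per SSTable, held on heap pre-1.2
    public static long approxBloomFilterBytes(long numKeys, double fpChance)
    {
        double bitsPerKey = -Math.log(fpChance) / (Math.log(2) * Math.log(2));
        return (long) Math.ceil(numKeys * bitsPerKey / 8.0);
    }

The default fp chance of 0.000744 works out to roughly 15 bits per key, so e.g.
100 million keys is on the order of 190 MB of filter per node before any
per-SSTable overhead.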

Cheers
 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/10/2012, at 4:38 AM, Bryan Talbot  wrote:

> The memory usage was correlated with the size of the data set.  The nodes 
> were a bit unbalanced which is normal due to variations in compactions.  The 
> nodes with the most data used the most memory.  All nodes are affected 
> eventually not just one.  The GC was on-going even when the nodes were not 
> compacting or running a heavy application load -- even when the main app was 
> paused constant the GC continued.
> 
> As a test we dropped the largest CF and the memory usage immediately dropped 
> to acceptable levels and the constant GC stopped.  So it's definitely related 
> to data load.  memtable size is 1 GB, row cache is disabled and key cache is 
> small (default).
> 
> I believe one culprit turned out to be the bloom filters.  They were 2+ GB 
> (as reported by nodetool cfstats anyway).  It looks like the default 
> bloom_filter_fp_chance defaults to 0.0 even though guides recommend 0.10 as 
> the minimum value.  Raising that to 0.20 for some write-mostly CF reduced 
> memory used by 1GB or so.
> 
> Is there any way to predict how much memory the bloom filters will consume if 
> the size of the row keys, number or rows is known, and fp chance is known?
> 
> -Bryan
> 
> 
> 
> On Mon, Oct 22, 2012 at 12:25 AM, aaron morton  
> wrote:
> If you are using the default settings I would try to correlate the GC 
> activity with some application activity before tweaking.
> 
> If this is happening on one machine out of 4 ensure that client load is 
> distributed evenly. 
> 
> See if the raise in GC activity us related to Compaction, repair or an 
> increase in throughput. OpsCentre or some other monitoring can help with the 
> last one. Your mention of TTL makes me think compaction may be doing a bit of 
> work churning through rows. 
>   
> Some things I've done in the past before looking at heap settings:
> * reduce compaction_throughput to reduce the memory churn
> * reduce in_memory_compaction_limit 
> * if needed reduce concurrent_compactors
> 
>> Currently it seems like the memory used scales with the amount of bytes 
>> stored and not with how busy the server actually is.  That's not such a good 
>> thing.
> The memtable_total_space_in_mb in yaml tells C* how much memory to devote to 
> the memtables. That with the global row cache setting says how much memory 
> will be used with regard to "storing" data and it will not increase inline 
> with the static data load.
> 
> Now days GC issues are typically due to more dynamic forces, like compaction, 
> repair and throughput. 
>  
> Hope that helps. 
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 20/10/2012, at 6:59 AM, Bryan Talbot  wrote:
> 
>> ok, let me try asking the question a different way ...
>> 
>> How does cassandra use memory and how can I plan how much is needed?  I have 
>> a 1 GB memtable and 5 GB total heap and that's still not enough even though 
>> the number of concurrent connections and garbage generation rate is fairly 
>> low.
>> 
>> If I were using mysql or oracle, I could compute how much memory could be 
>> used by N concurrent connections, how much is allocated for caching, temp 
>> spaces, etc.  How can I do this for cassandra?  Currently it seems like the 
>> memory used scales

Re: Row caching memory usage in Cassandra 1.0.x

2012-10-22 Thread Will @ SOHO

On 10/22/2012 08:24 PM, aaron morton wrote:
I'm not aware of how to track the memory usage for the off heap row 
cache in 1.0. The memory may show up in something like JConsole. What 
about seeing how much os memory is allocated to buffers and working 
backwards from there?


Anyone else ?

(One thing to be away of is each CF has it's own row cache, so tuning 
must be done per CF. )


Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/10/2012, at 3:35 AM, Josh > wrote:


Hi, I'm hoping to get some help on how to tune our 1.0.x cluster w.r.t. row
caching.

We're using the netflix priam client, so unfortunately upgrading to 1.1.x is
out of the question for now.. but until we find a way around that, is there
any way to help determine where the 'sweet spot' is between heap size, row
cache size, and leaving the rest of the ram available to the OS?

We're using the oracle jvm with jna so we can do the off-heap row caching,
but I'm not sure how to tell how much ram it's using, thus I'm not
comfortable increasing it further. (currently we have it set to 100,000 rows
and we're already seeing ~85% hit rates, so we've stopped upping it further
for now).


Thanks for any advice,

-Josh





ByteBuffer::allocateDirect uses memlocking, I think. mlock takes a mask 
of "all current" and "all future". Most developers outside the C/C++ world 
won't use "all future", so buffer counts * sizes should be reliable. 
Even with JNA, you'd be using the same system call.





Re: constant CMS GC using CPU time

2012-10-22 Thread Will @ SOHO

On 10/22/2012 09:05 PM, aaron morton wrote:

# GC tuning options
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"

You are too far behind the reference JVMs. Parallel GC is the preferred 
and highest-performing collector in the current security-baseline version of 
the JVM.


Strange row expiration behavior

2012-10-22 Thread Stephen Mullins
Hello, I'm seeing Cassandra behavior that I can't explain, on v1.0.12. I'm
trying to test removing rows after all columns have expired. I've read the
following:
http://wiki.apache.org/cassandra/DistributedDeletes
http://wiki.apache.org/cassandra/MemtableSSTable
https://issues.apache.org/jira/browse/CASSANDRA-2795

And came up with a test to demonstrate the empty row removal that does the
following:

   1. create a keyspace
   2. create a column family with gc_grace_seconds=10 (an arbitrarily small number)
   3. insert a couple rows with ttl=5 (again, just a small number)
   4. use nodetool to flush the column family
   5. sleep >10 seconds
   6. ensure the columns are removed with *cassandra-cli list *
   7. use nodetool to compact the keyspace

Performing these steps results in the rows still being present using
*cassandra-cli
list*. What gets really odd is if I add these steps it works:

   1. sleep 5 seconds
   2. use cassandra-cli to *del mycf[arow]*
   3. use nodetool to flush the column family
   4. use nodetool to compact the keyspace

I don't understand why the first set of steps (1-7) don't work to remove
the empty row, nor do I understand why the explicit row delete somehow
makes this work. I have all this in a script that I could attach if that's
appropriate. Is there something wrong with the steps that I have?
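
For reference, a condensed sketch of what the script does (assuming the keyspace
'ks' already exists; names, ports and exact cli syntax may need adjusting):

   # in cassandra-cli:
   use ks;
   create column family mycf with comparator = UTF8Type and gc_grace = 10;
   set mycf[utf8('arow')][utf8('c1')] = utf8('v1') with ttl = 5;

   # then from the shell:
   nodetool -h localhost flush ks mycf
   sleep 15
   # "list mycf;" in cassandra-cli at this point shows the columns gone
   nodetool -h localhost compact ks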

Thanks,
Stephen


Node Dead/Up

2012-10-22 Thread Jason Hill
Hello,

I'm on version 1.0.11.

I'm seeing this in my system log with occasional frequency:

INFO [GossipTasks:1] 2012-10-23 02:26:34,449 Gossiper.java (line 818)
InetAddress /10.50.10.21 is now dead.
INFO [GossipStage:1] 2012-10-23 02:26:34,620 Gossiper.java (line 804)
InetAddress /10.50.10.21 is now UP


INFO [StreamStage:1] 2012-10-23 02:24:38,763 StreamOutSession.java
(line 228) Streaming to /10.50.10.25 <--this line included for context
INFO [GossipTasks:1] 2012-10-23 02:26:30,603 Gossiper.java (line 818)
InetAddress /10.50.10.25 is now dead.
INFO [GossipStage:1] 2012-10-23 02:26:40,763 Gossiper.java (line 804)
InetAddress /10.50.10.25 is now UP
INFO [AntiEntropyStage:1] 2012-10-23 02:27:30,249
AntiEntropyService.java (line 233) [repair
#5a3383c0-1cb5-11e2--56b66459adef] Sending completed merkle tree
to /10.50.10.25 for (Innovari,TICCompressedLoad) <--this line included
for context

What is this telling me? Is my network dropping for less than a
second? Are my nodes really dead and then up? Can someone shed some
light on this for me?

cheers,
Jason


Re: nodetool cleanup

2012-10-22 Thread Peter Schuller
On Oct 22, 2012 11:54 AM, "B. Todd Burruss"  wrote:
>
> does "nodetool cleanup" perform a major compaction in the process of
> removing unwanted data?

No.


Re: nodetool cleanup

2012-10-22 Thread Will @ SOHO

On 10/23/2012 01:25 AM, Peter Schuller wrote:



On Oct 22, 2012 11:54 AM, "B. Todd Burruss" wrote:

>
> does "nodetool cleanup" perform a major compaction in the process of
> removing unwanted data?

No.

What is the internal memory model used? It sounds like it doesn't have a 
page manager.


How to change the seed node Cassandra 1.0.11

2012-10-22 Thread Roshan
Hi

In our production, we have 3 Cassandra 1.0.11 nodes.

For a particular reason, I want to move the seed role to another node, and
once the seed has changed, I want to remove the previous node from the cluster.

How can I do that?

Thanks. 




--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-change-the-seed-node-Cassandra-1-0-11-tp7583338.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.