Re: slow read

2012-03-05 Thread ruslan usifov
2012/3/5 Jeesoo Shin 

> Hi all.
>
> I have very SLOW READ here. :-(
> I made a cluster with three node (aws xlarge, replication = 3)
> Cassandra version is 1.0.6
> I have inserted 1,000,000 rows. (standard column)
> Each row has 200 columns.
> Each column has 16 byte key,  512 byte value.
>
> I used Hector createSliceQuery to get one column in a row.
> This basic query(random row, fixed column) is created with multiple
> thread and hit cassandra.
>
> I only get up to 140 request per second. Is this all I can get for read?
> Or am I doing something wrong?
> Interestingly, when I request rows which doesn't exist, it goes up to
> 1600 per second.
>
>
You must test read performance with a parallel test (i.e. multiple threads). Reads
of non-existent rows are faster because the bloom filter rejects them without
touching disk.



>
> ANY insight, share will be extremely helpful.
> Thank you.
>
> Regards,
> Jeesoo.
>
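The speed difference ruslan describes can be illustrated with a toy bloom filter. This is a simplified Python sketch, not Cassandra's actual implementation: a negative answer is definitive and comes straight from memory, so a read of a missing row can skip the SSTables entirely, while a read of an existing row must go to disk.

```python
import hashlib

class BloomFilter:
    """Toy bloom filter for illustration; not Cassandra's implementation."""

    def __init__(self, size_bits=8192, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # one big int used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # A False answer is definitive: the row is absent, so no disk read.
        return all(self.bits >> pos & 1 for pos in self._positions(key))

bf = BloomFilter()
for i in range(1000):
    bf.add(f"row-{i}")

present = bf.might_contain("row-42")  # stored keys always pass
# most absent keys are rejected in memory, which is why misses are fast
rejected = sum(not bf.might_contain(f"absent-{i}") for i in range(100))
```

With these (illustrative) sizes, the vast majority of absent keys are answered without I/O, which is consistent with the ~10x throughput gap reported above.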


Re: Secondary indexes don't go away after metadata change

2012-03-05 Thread aaron morton
The secondary index CFs are marked as no longer required / marked as 
compacted. Under 1.x they would then be deleted reasonably quickly, and 
definitely deleted after a restart. 

Is there a zero-length .Compacted file there? 

> Also, when adding a new node to the ring the new node will build indexes for 
> the ones that supposedly don’t exist any longer.  Is this supposed to happen? 
>  Would this have happened if I had deleted the old SSTables from the 
> previously existing nodes?
Check that you have a consistent schema using describe cluster in the CLI, and 
check that the schema is what you think it is using show schema. 

Another trick is to take a snapshot. Only the files in use are included in the 
snapshot. 

Hope that helps. 
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:

> I have a few column families that I decided to get rid of the secondary 
> indexes on.  I see that there aren’t any new index SSTables being created, 
> but all of the old ones remain (some from as far back as September).  Is it 
> safe to just delete them when the node is offline?  Should I run clean-up or 
> scrub?
>  
> Also, when adding a new node to the ring the new node will build indexes for 
> the ones that supposedly don’t exist any longer.  Is this supposed to happen? 
>  Would this have happened if I had deleted the old SSTables from the 
> previously existing nodes?
>  
> The nodes in question have either been upgraded from v0.8.1 => v1.0.2 
> (scrubbed at this time) => v1.0.6 or from v1.0.2 => v1.0.6.  The secondary 
> index was dropped when the nodes were version 1.0.6.  The new node added was 
> also 1.0.6.
>  
> - Mike



Re: slow read

2012-03-05 Thread Jeesoo Shin
Thank you for the reply. :)
Yes, I did use multiple threads.
160 and 320 threads gave me the same result.

On 3/5/12, ruslan usifov  wrote:
> 2012/3/5 Jeesoo Shin 
>
>> Hi all.
>>
>> I have very SLOW READ here. :-(
>> I made a cluster with three node (aws xlarge, replication = 3)
>> Cassandra version is 1.0.6
>> I have inserted 1,000,000 rows. (standard column)
>> Each row has 200 columns.
>> Each column has 16 byte key,  512 byte value.
>>
>> I used Hector createSliceQuery to get one column in a row.
>> This basic query(random row, fixed column) is created with multiple
>> thread and hit cassandra.
>>
>> I only get up to 140 request per second. Is this all I can get for read?
>> Or am I doing something wrong?
>> Interestingly, when I request rows which doesn't exist, it goes up to
>> 1600 per second.
>>
>>
> You must test read performance by paralel test (ie multiple threads). The
> result when not existent rows are more faster is result of bloom filter
>
>
>
>>
>> ANY insight, share will be extremely helpful.
>> Thank you.
>>
>> Regards,
>> Jeesoo.
>>
>


Re: can't find rows

2012-03-05 Thread aaron morton
I am guessing a lot here, but I would check whether auto_bootstrap is enabled. It 
is by default. 

When a new node joins reads are not directed to it until it is marked as "UP" 
(writes are sent to it as it is joining). So reads should continue to go to the 
original UP node.

Sounds like it's all running now. If you see it again, can you provide some 
detailed steps, such as the type of query and the CL level. 

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 7:35 AM, Casey Deccio wrote:

> On Thu, Mar 1, 2012 at 9:33 AM, aaron morton  wrote:
> What RF were you using and had you been running repair regularly ? 
> 
> 
> RF 1 *sigh*.  Waiting until I have more/better resources to use RF > 1.  
> Hopefully soon.
> 
> In the mean time... Oddly (to me), when I removed the most recently added 
> node, all my rows re-appeared, but were only up-to-date as of a 10 days ago 
> (a few days before I added the node).  None of the supercolumns since then 
> show up.  But when I look at the sstable files on the different nodes, I see 
> large files with timestamps in between that date and today's, which makes me 
> think the data is still there.  Also, if I re-add the troublesome new node 
> (not having run cleanup), all rows are again inaccessible until I again 
> decommission it.
> 
> Casey



Re: Schema change causes exception when adding data

2012-03-05 Thread aaron morton
I don't have a lot of Hector experience but it sounds like the way to go. 

The CLI and cqlsh will take care of this. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 10:12 AM, Tharindu Mathew wrote:

> There are 2. I'd like to wait till there is one before I insert the value.
> 
> Going through the code, calling client.describe_schema_versions() seems to 
> give a good answer to this. And I discovered that if I wait till there is 
> only 1 version, I will not get this error.
> 
> Is this the best practice if I want to check this programmatically?
> 
> On Thu, Mar 1, 2012 at 11:15 PM, aaron morton  wrote:
> use describe cluster in the CLI to see how many schema versions there are. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 2/03/2012, at 12:25 AM, Tharindu Mathew wrote:
> 
>> 
>> 
>> On Thu, Mar 1, 2012 at 11:47 AM, Tharindu Mathew  wrote:
>> Jeremiah,
>> 
>> Thanks for the reply.
>> 
>> This is what we have been doing, but it's not reliable as we don't know a 
>> definite time that the schema would get replicated. Is there any way I can 
>> know for sure that changes have propagated?
>> 
>> Then I can block the insertion of data until then.
>> 
>> 
>> On Thu, Mar 1, 2012 at 4:33 AM, Jeremiah Jordan 
>>  wrote:
>> The error is that the specified colum family doesn’t exist.  If you connect 
>> with the CLI and describe the keyspace does it show up?  Also, after adding 
>> a new column family programmatically you can’t use it immediately, you have 
>> to wait for it to propagate.  You can use calls to describe schema to do so, 
>> keep calling it until every node is on the same schema.
>> 
>>  
>> 
>> -Jeremiah
>> 
>>  
>> 
>> From: Tharindu Mathew [mailto:mcclou...@gmail.com] 
>> Sent: Wednesday, February 29, 2012 8:27 AM
>> To: user
>> Subject: Schema change causes exception when adding data
>> 
>>  
>> 
>> Hi,
>> 
>> I have a 3 node cluster and I'm dynamically updating a keyspace with a new 
>> column family. Then, when I try to write records to it I get the following 
>> exception shown at [1].
>> 
>> How do I avoid this. I'm using Hector and the default consistency level of 
>> QUORUM is used. Cassandra version 0.7.8. Replication Factor is 1.
>> 
>> How can I solve my problem?
>> 
>> [1] -
>> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
>> InvalidRequestException(why:unconfigured columnfamily proxySummary)
>> 
>> at 
>> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
>> 
>> at 
>> me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:397)
>> 
>> at 
>> me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:383)
>> 
>> at 
>> me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
>> 
>> at 
>> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:156)
>> 
>> at 
>> me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
>> 
>> at 
>> me.prettyprint.cassandra.service.KeyspaceServiceImpl.multigetSlice(KeyspaceServiceImpl.java:401)
>> 
>> at 
>> me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery$1.doInKeyspace(ThriftMultigetSliceQuery.java:67)
>> 
>> at 
>> me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery$1.doInKeyspace(ThriftMultigetSliceQuery.java:59)
>> 
>> at 
>> me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
>> 
>> at 
>> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:72)
>> 
>> at 
>> me.prettyprint.cassandra.model.thrift.ThriftMultigetSliceQuery.execute(ThriftMultigetSliceQuery.java:58)
>> 
>> 
>> 
>> -- 
>> Regards,
>> 
>> Tharindu
>> 
>>  
>> 
>> blog: http://mackiemathew.com/
>> 
>>  
>> 
>> 
>> 
>> 
>> -- 
>> Regards,
>> 
>> Tharindu
>> 
>> blog: http://mackiemathew.com/
>> 
>> 
>> 
>> 
>> -- 
>> Regards,
>> 
>> Tharindu
>> 
>> blog: http://mackiemathew.com/
>> 
> 
> 
> 
> 
> -- 
> Regards,
> 
> Tharindu
> 
> blog: http://mackiemathew.com/
> 
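The wait-until-one-schema-version approach discussed in this thread can be sketched as follows. The sketch assumes a Thrift-style client whose describe_schema_versions() returns a map of schema version to node addresses (with the pseudo-version "UNREACHABLE" for nodes that could not be contacted); FakeClient is a hypothetical stand-in used only to demonstrate the polling loop.

```python
import time

def wait_for_schema_agreement(client, timeout=30.0, poll_interval=0.5):
    """Poll describe_schema_versions() until all live nodes agree."""
    deadline = time.monotonic() + timeout
    while True:
        # Ignore the pseudo-version reported for unreachable nodes.
        versions = [v for v in client.describe_schema_versions()
                    if v != "UNREACHABLE"]
        if len(versions) == 1:
            return versions[0]   # safe to use the new column family now
        if time.monotonic() >= deadline:
            raise TimeoutError(f"schema still disagrees: {versions}")
        time.sleep(poll_interval)

class FakeClient:
    """Stand-in for a real connection: converges on the third call."""
    def __init__(self):
        self.calls = 0
    def describe_schema_versions(self):
        self.calls += 1
        if self.calls < 3:
            return {"uuid-a": ["10.0.0.1"], "uuid-b": ["10.0.0.2"]}
        return {"uuid-c": ["10.0.0.1", "10.0.0.2"]}

agreed = wait_for_schema_agreement(FakeClient(), timeout=5, poll_interval=0.01)
```

Blocking inserts behind this loop avoids the "unconfigured columnfamily" error after a programmatic schema change.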



Re: slow read

2012-03-05 Thread ruslan usifov
And the sum of requests per second across all threads is 160?

2012/3/5 Jeesoo Shin 

> Thank you for reply. :)
> Yes I did multiple thread.
> 160, 320 gave me same result.
>
> On 3/5/12, ruslan usifov  wrote:
> > 2012/3/5 Jeesoo Shin 
> >
> >> Hi all.
> >>
> >> I have very SLOW READ here. :-(
> >> I made a cluster with three node (aws xlarge, replication = 3)
> >> Cassandra version is 1.0.6
> >> I have inserted 1,000,000 rows. (standard column)
> >> Each row has 200 columns.
> >> Each column has 16 byte key,  512 byte value.
> >>
> >> I used Hector createSliceQuery to get one column in a row.
> >> This basic query(random row, fixed column) is created with multiple
> >> thread and hit cassandra.
> >>
> >> I only get up to 140 request per second. Is this all I can get for read?
> >> Or am I doing something wrong?
> >> Interestingly, when I request rows which doesn't exist, it goes up to
> >> 1600 per second.
> >>
> >>
> > You must test read performance by paralel test (ie multiple threads). The
> > result when not existent rows are more faster is result of bloom filter
> >
> >
> >
> >>
> >> ANY insight, share will be extremely helpful.
> >> Thank you.
> >>
> >> Regards,
> >> Jeesoo.
> >>
> >
>


Re: composite types in CQL

2012-03-05 Thread aaron morton
It's not currently supported in CQL: 
https://issues.apache.org/jira/browse/CASSANDRA-3761

You can do it using the CLI, see the online help. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 10:39 AM, Bayle Shanks wrote:

> hi, i'm wondering how to do composite data
> storage types in CQL. I am trying to mimic the Composite Types
> functionality of the Pycassa client:
> 
> http://pycassa.github.com/pycassa/assorted/composite_types.html
> 
> 
> In short, in Pycassa you can do something like:
> 
> ---
> 
> itemTimeCompositeType = CompositeType(UTF8Type(), LongType())
> 
> pycassa.system_manager.SystemManager().create_column_family(
>  keyspaceName,
>  columnFamilyName, 
>  key_validation_class=itemTimeCompositeType
> )
> 
> ...
> 
> columnFamily.insert(self._makeKey(item, time_as_integer), {field : value})
> 
> ---
> 
> 
> and then your primary key for this column family is a pair of a string
> and an integer. This is important because i am using
> ByteOrderedPartitioner and doing range scans among keys which share
> the same string 'item' but have different values for the integer.
> 
> My motivation is that i am trying to port
> https://github.com/bshanks/cassandra-timeseries-py to Ruby and i
> thought i might try CQL.
> 
> thanks,
>  bayle
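For reference, a composite key such as the (UTF8Type, LongType) pair above can be packed by hand. This sketch follows the CompositeType byte layout as I understand it (per component: a 2-byte big-endian length, the raw bytes, then an end-of-component byte) — verify it against your Cassandra version before depending on it for ByteOrderedPartitioner range scans.

```python
import struct

def pack_composite(*components):
    """Pack components in the CompositeType layout: 2-byte big-endian
    length, raw bytes, then one end-of-component byte (0x00) per part."""
    out = b""
    for comp in components:
        if isinstance(comp, str):
            raw = comp.encode("utf-8")      # UTF8Type component
        elif isinstance(comp, int):
            raw = struct.pack(">q", comp)   # LongType: signed 8-byte big-endian
        else:
            raw = bytes(comp)
        out += struct.pack(">H", len(raw)) + raw + b"\x00"
    return out

# hypothetical key: item name plus a millisecond timestamp
key = pack_composite("item-1", 1330861061454)
```

Because the string component is length-prefixed and comes first, keys sharing the same item string sort adjacently, which is what the range scans above rely on.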



Re: Test Data creation in Cassandra

2012-03-05 Thread aaron morton
try tools/stress in the source distribution. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/03/2012, at 6:01 AM, A J wrote:

> What is the best way to create millions of test data in Cassandra ?
> 
> I would like to have some script where I first insert say 100 rows in
> a CF. Then reinsert the same data on 'server side' with new unique
> key. That will make it 200 rows. Then continue the exercise a few
> times till I get lot of records.
> I don't care if the column names and values are identical between the
> different rows. Just a lot of records generated for a few seed
> records.
> 
> The rows are very fat. So I don't want to use any client side
> scripting that would push individual or batched rows to cassandra.
> 
> Thanks for any tips.
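The doubling scheme described above can be sketched like this. The dict is only a stand-in for the column family; on a real cluster each copy would be written back through the client in batches (or use tools/stress as suggested).

```python
import uuid

def multiply_rows(rows, target_count):
    """Copy existing rows under fresh unique keys until target_count is
    reached. Each full pass roughly doubles the row count."""
    while len(rows) < target_count:
        for _key, columns in list(rows.items()):
            if len(rows) >= target_count:
                break
            rows[uuid.uuid4().hex] = dict(columns)  # same data, new key
    return rows

# 100 seed rows grown to 10,000; column contents are irrelevant here
seed = {f"seed-{i}": {"name": f"value-{i}"} for i in range(100)}
rows = multiply_rows(seed, 10_000)
```
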



RE: cli question

2012-03-05 Thread Rishabh Agrawal
I faced the same issue some time back. The solution that fit my bill is as follows:

CREATE COLUMN FAMILY aaa
with comparator = 'CompositeType(UTF8Type,UTF8Type)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'CompositeType(UTF8Type,UTF8Type,UTF8Type)';

Notice I have specified three datatypes (validators) in key_validation_class 
under CompositeType.

Now if I insert with key aaa:bbb:ccc it will work smoothly, and even if 
I wish to insert with just aaa:bbb it will work just fine.

Do let me know if it solves your problem.

Regards
RIshabh Agrawal


From: Tamar Fraenkel [mailto:ta...@tok-media.com]
Sent: Monday, March 05, 2012 1:19 PM
To: cassandra-u...@incubator.apache.org
Subject: cli question

Hi!
I have a CF with the following definition:

CREATE COLUMN FAMILY a_b_indx
with comparator = 'CompositeType(LongType,UUIDType)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'CompositeType(UTF8Type,UTF8Type)';

Where the key may be a composite of the following two strings: 'AAA' and 
'BBB:CCC'
Notice, that the second string has ':' in it.
I try to query for rows I know exist in the CF but can't.
I tried those and many more :)

  * get a_b_indx ['AAA:BBB:CCC'];
  * get a_b_indx ['AAA:BBB\:CCC'];
  * get a_b_indx [utf8('AAA'):utf8('BBB:CCC')];

Is it possible? Does anyone know how?

Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956






Impetus' Head of Innovation labs, Vineet Tyagi will be presenting on 'Big Data 
Big Costs?' at the Strata Conference, CA (Feb 28 - Mar 1) http://bit.ly/bSMWd7.

Listen to our webcast 'Hybrid Approach to Extend Web Apps to Tablets & 
Smartphones' available at http://bit.ly/yQC1oD.


NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.

running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Hi!
I have a Cassandra  cluster with two nodes

nodetool ring -h localhost
Address         DC          Rack   Status  State   Load       Owns    Token
                                                                      85070591730234615865843651857942052864
10.0.0.19       datacenter1 rack1  Up      Normal  488.74 KB  50.00%  0
10.0.0.28       datacenter1 rack1  Up      Normal  504.63 KB  50.00%  85070591730234615865843651857942052864

I want to create a second ring with the same name but two different nodes.
Using tokengentool I get the same tokens, since they depend only on the
number of nodes in a ring.

My question is this:
Let's say I create two new VMs, with IPs 10.0.0.31 and 10.0.0.11.
*In 10.0.0.31 cassandra.yaml I will set*
initial_token: 0
seeds: "10.0.0.31"
listen_address: 10.0.0.31
rpc_address: 0.0.0.0

*In 10.0.0.11 cassandra.yaml I will set*
initial_token: 85070591730234615865843651857942052864
seeds: "10.0.0.31"
listen_address: 10.0.0.11
rpc_address: 0.0.0.0

*Would the rings be separate?*

Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

[image: Inline image 1]

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956
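The identical tokens Tamar observes fall directly out of the token arithmetic, which depends only on the node count. A sketch of the tokengentool calculation for RandomPartitioner (tokens are evenly spaced over the 2**127 ring):

```python
def initial_tokens(node_count, ring_size=2**127):
    """Evenly spaced RandomPartitioner tokens: token_i = i * 2**127 / N."""
    return [i * ring_size // node_count for i in range(node_count)]

print(initial_tokens(2))
# [0, 85070591730234615865843651857942052864]
```

So any two-node ring gets tokens 0 and 85070591730234615865843651857942052864; tokens only need to be unique within a single cluster, so two separate rings reusing them is fine.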

Re: cli question

2012-03-05 Thread Tamar Fraenkel
Thanks!
I decided to just replace all ":" with "^" and I can simply run:
get  a_b_indx ['AAA:BBB^CCC'];


*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956
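The workaround can be captured in a small helper. Note the assumption: '^' must never occur in the component data itself, otherwise the mapping is ambiguous.

```python
SEP = ":"   # separator the CLI interprets between composite components
ESC = "^"   # replacement character, assumed absent from the data

def encode_part(part):
    """Replace the CLI-ambiguous ':' inside one component, as described above."""
    return part.replace(SEP, ESC)

def make_key(*parts):
    """Join already-escaped components into a CLI-friendly key string."""
    return SEP.join(encode_part(p) for p in parts)

key = make_key("AAA", "BBB:CCC")
# key == "AAA:BBB^CCC", matching the get shown above
```
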





On Mon, Mar 5, 2012 at 11:58 AM, Rishabh Agrawal <
rishabh.agra...@impetus.co.in> wrote:

>  I faced the same issue some time back. Solution which fit my bill is as
> follows:
>
>
>
> CREATE COLUMN FAMILY aaa
>
> with comparator = 'CompositeType(UTF8Type,UTF8Type)'
>
> and default_validation_class = 'UTF8Type'
>
> and key_validation_class =
> 'CompositeType(UTF8Type,UTF8Type,UTF8Type,)';
>
>
>
> notice I have mentioned three datatypes or validators in
> key_validation_class under CompositeType.
>
>
>
> Now if I have to insert with key aaa:bbb:ccc it will work smoothly and
> even if I wish to insert with just aaa:bbb it will work just fine.
>
>
>
> Do let me know if it solves your problem.
>
>
>
> Regards
>
> RIshabh Agrawal
>
>
>
>
>
> *From:* Tamar Fraenkel [mailto:ta...@tok-media.com]
> *Sent:* Monday, March 05, 2012 1:19 PM
> *To:* cassandra-u...@incubator.apache.org
> *Subject:* cli question
>
>
>
> Hi!
> I have CF with the following deffinition:
>
>
>
> CREATE COLUMN FAMILY a_b_indx
>
> with comparator = 'CompositeType(LongType,UUIDType)'
>
> and default_validation_class = 'UTF8Type'
>
> and key_validation_class = 'CompositeType(UTF8Type,UTF8Type)';
>
>
>
> Where the key may be a composite of the following two strings: 'AAA' and
> 'BBB:CCC'
>
> Notice, that the second string has ':' in it.
>
> I try to query for rows I know exist in the CF but can't.
>
> I tried those and many more :)
>
>- get  a_b_indx ['AAA:BBB:CCC'];
>- get  a_b_indx ['AAA:BBB\:CCC'];
>- get  a_b_indx [utf8('AAA'):utf8('BBB:CCC')];
>
>
>
> Is it possible? Does anyone know how?
>
>
>
> Thanks,
>
>
>   *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> [image: Inline image 1]
>
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956
>
>
>
>
>
>
>

Re: Maximum Row Size in Cassandra : Potential Bottleneck

2012-03-05 Thread aaron morton
> Is there any way in which the writes can be made pretty slow on different 
> nodes? Ideally I would like data to be written on one node and eventually 
> replicated across the other nodes. I don't really need a real-time update, so I 
> can pretty much live with slow writes.
Replicating "inside" the mutation request is a core feature of Cassandra. 

You can hack something by disabling gossip on a node and doing the inserts on it 
(at CL ONE). Re-enable gossip and let hinted handoff (HH) send the data to the 
other nodes, or disable HH and use repair to distribute the changes. HH will be 
less resource intensive. 

> 1250.188: [Full GC [PSYoungGen: 76825K->0K(571648K)] [PSOldGen: 
> 7356362K->2356764K(7569408K)] 7433188K->2356764K(8141056K) [PSPermGen: 
> 32579K->32579K(61056K)], 8.1019330 secs] [Times: user=8.10 sys=0.00, 
> real=8.10 secs]
What JVM are you using and what are the JVM options? 
(The info I can find about PSYoungGen suggests it's pretty old.)

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 1:13 AM, Shubham Srivastava wrote:

> Tried all the possible options and nothing actually seems to work.
> 
> Was trying to get insights of where exactly problem is arising when writes on 
> done on one node and read on another . I found that GC gets triggered when 
> writes are done on the other node through RowMutationVerbHandler.
> 
> Settings
> 
> 1.I tested this on two node setup with RF:2 and Read CL:1. 
> 2.Also heap is of 8G and Xmn:800M on both the nodes with 4Cores.
> 3.I am using concurrent_write:32 as default (8 * core). 
> 4. -Dcassandra.compaction.priority=1
> 5.No explicit GC settings commented the one in solandra-env.sh
> 6.in_memory_compaction_limit_in_mb: 1
> 7.read repair:0.1
> 8.concurrent_compactors: 1
> 
> 
> Is there any way in which the writes can be made pretty slow on different 
> nodes? Ideally I would like data to be written on one node and eventually 
> replicated across the other nodes. I don't really need a real-time update, so I 
> can pretty much live with slow writes.
> 
> Sharing the cassandra and GC logs as below
> 
> 
> 
> 
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
> (line 56) RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf7365617263686669656c64efbfbf64657320', 
> modifications=[ColumnFamily(TI [357025:false:4@1330861061454,])]) applied.  
> Sending response to 14541197@/10.86.29.21
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
> (line 44) Applying RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365',
>  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])])
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 Table.java (line 387) 
> applying mutation of row 
> 686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
> (line 56) RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf686f74656c666163696c69747964657461696cefbfbf49726f6e696e672053657276696365',
>  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])]) applied.  
> Sending response to 14541198@/10.86.29.21
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
> (line 44) Applying RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573',
>  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])])
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 Table.java (line 387) 
> applying mutation of row 
> 686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573
> DEBUG [MutationStage:5] 2012-03-04 17:07:52,014 RowMutationVerbHandler.java 
> (line 56) RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf686f74656c666163696c697479efbfbf4c61756e647279205365727669636573',
>  modifications=[ColumnFamily(TI [357025:false:2@1330861061454,])]) applied.  
> Sending response to 14541199@/10.86.29.21
> DEBUG [248896865@qtp-1257398760-2] 2012-03-04 17:07:51,572 
> SolrIndexReader.java (line 928) getCoreCacheKey() - start
> DEBUG [MutationStage:32] 2012-03-04 17:07:51,426 RowMutationVerbHandler.java 
> (line 56) RowMutation(keyspace='L', 
> key='686f74656c737e30efbfbf636f756e74727953636f7265', 
> modifications=[ColumnFamily(FC [356996:false:4@1330861059172,])]) applied.  
> Sending response to 14532390@/10.86.29.21
> DEBUG [1568261127@qtp-1257398760-14] 2012-03-04 17:07:51,425 
> SolrIndexSearcher.java (line 557) doc(int, Set) - start
> DEBUG [MutationStage:2] 2012-03-04 17:07:51,425 Table.java (line 387) 
> applying mutation of row 686f74656c737e30efbfbf6e616d65efbfbf6c61
> DEBUG [1568261127@qtp-1257398760-14] 2012-03-04 17:07:52,015 
> SolrIndexSearcher.java (line 557) doc(int, Set) - start
> DEBUG [1861954021@qtp-1257398760-11] 2012-

Re: Writing Data To A Super Column That Is In A Column Family With A Type Of Standard

2012-03-05 Thread aaron morton
> Is it possible to mix both Standard and Super columns in the same
> Column Family? 
No. 

> create column family users
>with comparator = UTF8T
>and key_validation_class=UTF8TYpe
>and compression_options = { sstable_compression:SnappyCompressor,
> chunk_length_kb:64}
>and column_metadata = [
>{ column_name: FirstName, validation_class : UTF8Type},
>{ column_name: LastName, validation_class : UTF8Type},
>{ column_name: FavStore, validation_class : IntegerType},
>{ column_type: super, column_name: HomeAddress…

column_type is not a supported attribute for column_metadata.

That is not a valid create column family statement; it fails to execute on a 
clean 1.0.7 install. If you are able to get it working, can you show the output 
from the CLI? It may be a trick performed by the client. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 4:24 PM, Christopher Bowland wrote:

> Is it possible to mix both Standard and Super columns in the same
> Column Family? One of our Perl developers seems to have this working,
> but I have been using the Java Pelops client and have been unable to
> make this happen.
> 
> I'm not asking how to do with Pelops as I'll bug those guys if it is
> possible, but can standard and super columns be mixed in the same
> column family?
> 
> Here's the script he was using (which looks like he is able to define
> a column_type of super for some the columns but not others):
> 
> create column family users
>with comparator = UTF8T
>and key_validation_class=UTF8TYpe
>and compression_options = { sstable_compression:SnappyCompressor,
> chunk_length_kb:64}
>and column_metadata = [
>{ column_name: FirstName, validation_class : UTF8Type},
>{ column_name: LastName, validation_class : UTF8Type},
>{ column_name: FavStore, validation_class : IntegerType},
>{ column_type: super, column_name: HomeAddress
>, column_metadata = [
>  { column_name: Street, validation_class : UTF8Type},
>  { column_name: State, validation_class : UTF8Type},
>  { column_name: Zip, validation_class : LongType} ] },
>{ column_type: super, column_name: WorkAddress
>, column_metadata = [
>{ column_name: Street, validation_class : UTF8Type},
>{ column_name: State, validation_class : UTF8Type},
>{ column_name: Zip, validation_class : LongType} ] },
>{ column_type: super, column_name: Favorites }
>];
> 
> And here's the CLI output, which for the column_types of super
> indicates a hash value instead of a scalar.
> 
> list users;
> Using default limit of 100
> ---
> RowKey: bobjones
> => (column=FavStore, value=59580595188280, timestamp=1330696438)
> => (column=Favorites, value=HASH(0x3618bc0), timestamp=1330696438)
> => (column=FirstName, value=Bob, timestamp=1330696438)
> => (column=HomeAddress, value=HASH(0x3618b30), timestamp=1330696438)
> => (column=LastName, value=Jones, timestamp=1330696438)
> => (column=WorkAddress, value=HASH(0x3619688), timestamp=1330696438)
> 
> 
> Thanks in advance.
> 
> cb
> 
> -- 
> Christopher Bowland
> cbowl...@gmail.com



Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hi All,

While benchmarking Cassandra I found "Mutation Dropped" messages in the logs. 
Now I know this is a good old question, but it would be really great if someone 
could provide a checklist for recovering when such a thing happens. I am looking 
for answers to the following questions:


1. Which parameters should be tuned in the config files? (Especially for heavy 
writes.)

2. What is the difference between a TimedOutException and silently dropped 
mutation messages when operating at a CL of QUORUM?


Regards,
Dushyant

--
NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers. If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.


Re: slow read

2012-03-05 Thread aaron morton
Where is the client running from? 

To see if a node is keeping up with requests, look at nodetool tpstats and check 
if the read stage is backing up. 

To see how long a read takes, use nodetool cfstats and look at the read 
latency. (This is the latency of a read on that node, not cluster-wide.)

To see how long a read takes cluster wide, use the StorageProxyMBean via 
JConsole. 

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 10:46 PM, ruslan usifov wrote:

> And sum of all rq/s threads is 160?? 
> 
> 2012/3/5 Jeesoo Shin 
> Thank you for reply. :)
> Yes I did multiple thread.
> 160, 320 gave me same result.
> 
> On 3/5/12, ruslan usifov  wrote:
> > 2012/3/5 Jeesoo Shin 
> >
> >> Hi all.
> >>
> >> I have very SLOW READ here. :-(
> >> I made a cluster with three node (aws xlarge, replication = 3)
> >> Cassandra version is 1.0.6
> >> I have inserted 1,000,000 rows. (standard column)
> >> Each row has 200 columns.
> >> Each column has 16 byte key,  512 byte value.
> >>
> >> I used Hector createSliceQuery to get one column in a row.
> >> This basic query(random row, fixed column) is created with multiple
> >> thread and hit cassandra.
> >>
> >> I only get up to 140 request per second. Is this all I can get for read?
> >> Or am I doing something wrong?
> >> Interestingly, when I request rows which doesn't exist, it goes up to
> >> 1600 per second.
> >>
> >>
> > You must test read performance by paralel test (ie multiple threads). The
> > result when not existent rows are more faster is result of bloom filter
> >
> >
> >
> >>
> >> ANY insight, share will be extremely helpful.
> >> Thank you.
> >>
> >> Regards,
> >> Jeesoo.
> >>
> >
> 



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
> Would the rings be separate?
Yes. 
But I would recommend you give them different cluster names. It's a good 
protection against nodes accidentally joining the wrong cluster. 

cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:06 PM, Tamar Fraenkel wrote:

> Hi!
> I have a Cassandra  cluster with two nodes
> 
> nodetool ring -h localhost
> Address         DC          Rack   Status  State   Load       Owns    Token
>                                                                       85070591730234615865843651857942052864
> 10.0.0.19       datacenter1 rack1  Up      Normal  488.74 KB  50.00%  0
> 10.0.0.28       datacenter1 rack1  Up      Normal  504.63 KB  50.00%  85070591730234615865843651857942052864
> 
> I want to create a second ring with the same name but two different nodes.
> using tokengentool I get the same tokens as they are affected from the number 
> of nodes in a ring.
> 
> My question is like this:
> Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
> In 10.0.0.31 cassandra.yaml I will set
> initial_token: 0
> seeds: "10.0.0.31"
> listen_address: 10.0.0.31
> rpc_address: 0.0.0.0
> 
> In 10.0.0.11 cassandra.yaml I will set
> initial_token: 85070591730234615865843651857942052864
> seeds: "10.0.0.31"
> listen_address: 10.0.0.11
> rpc_address: 0.0.0.0 
> 
> Would the rings be separate?
> 
> Thanks,
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> 
> ta...@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 



-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: running two rings on the same subnet

2012-03-05 Thread Hontvári József Levente

  
  
You have to use PropertyFileSnitch and NetworkTopologyStrategy to
create a multi-datacenter setup with two circles. You can start
reading from this page:
http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

Moreover all tokens must be unique (even across datacenters),
although - from pure curiosity - I wonder what is the rationale
behind this.

By the way, can someone enlighten me about the first line in the
output of the nodetool. Obviously it contains a token, but nothing
else. It seems like a formatting glitch, but maybe it has a role. 

On 2012.03.05. 11:06, Tamar Fraenkel wrote:

> Hi!
> I have a Cassandra cluster with two nodes
>
> nodetool ring -h localhost
> Address         DC          Rack        Status State   Load            Owns    Token
>                                                                               85070591730234615865843651857942052864
> 10.0.0.19       datacenter1 rack1       Up     Normal  488.74 KB       50.00%  0
> 10.0.0.28       datacenter1 rack1       Up     Normal  504.63 KB       50.00%  85070591730234615865843651857942052864
>
> I want to create a second ring with the same name but two different nodes.
> using tokengentool I get the same tokens as they are affected from the number of nodes in a ring.
>
> My question is like this:
> Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
> In 10.0.0.31 cassandra.yaml I will set
> initial_token: 0
> seeds: "10.0.0.31"
> listen_address: 10.0.0.31
> rpc_address: 0.0.0.0
>
> In 10.0.0.11 cassandra.yaml I will set
> initial_token: 85070591730234615865843651857942052864
> seeds: "10.0.0.31"
> listen_address: 10.0.0.11
> rpc_address: 0.0.0.0
>
> Would the rings be separate?
>
> Thanks,
>
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956


Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
> 1.   Which parameters to tune in the config files? – Especially looking 
> for heavy writes
The node is overloaded. It may be because there are not enough nodes, or the 
node is under temporary stress such as GC or repair. 
If you have spare IO / CPU capacity you could increase concurrent_writes to 
increase throughput on the write stage. You then need to ensure the commit log 
and, to a lesser degree, the data volumes can keep up. 

> 2.   What is the difference between TimedOutException and silently 
> dropping mutation messages while operating on a CL of QUORUM.
A TimedOutException means fewer than CL nodes responded to the coordinator before 
rpc_timeout. Dropped messages happen when a message is removed from a thread pool 
queue after rpc_timeout has passed. It is a feature of the 
architecture, and correct behaviour under stress. 
Inconsistencies created by dropped messages are repaired via reads at a high CL, 
Hinted Handoff (in 1.x+), Read Repair or Anti-Entropy repair.
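
The queue-drop behaviour can be illustrated with a toy model (all names here are invented; Cassandra's real implementation differs): a worker skips any message that has already waited longer than rpc_timeout, since the coordinator has given up on it and doing the work would be wasted effort.

```java
import java.util.ArrayDeque;

public class DropQueue {
    static class Msg {
        final String payload;
        final long enqueuedAtMs;
        Msg(String payload, long enqueuedAtMs) {
            this.payload = payload;
            this.enqueuedAtMs = enqueuedAtMs;
        }
    }

    static final long RPC_TIMEOUT_MS = 10_000;

    private final ArrayDeque<Msg> queue = new ArrayDeque<>();
    long dropped = 0;

    void offer(Msg m) { queue.add(m); }

    // Return the next message still worth processing; anything that has sat
    // in the queue longer than rpc_timeout is silently dropped and counted,
    // mirroring the "dropped mutation" counter in tpstats.
    Msg poll(long nowMs) {
        Msg m;
        while ((m = queue.poll()) != null) {
            if (nowMs - m.enqueuedAtMs <= RPC_TIMEOUT_MS) return m;
            dropped++;
        }
        return null;
    }

    public static void main(String[] args) {
        DropQueue q = new DropQueue();
        q.offer(new Msg("stale mutation", 0));
        q.offer(new Msg("fresh mutation", 9_000));
        Msg next = q.poll(11_000); // 11s after the first enqueue
        System.out.println(next.payload + ", dropped=" + q.dropped); // fresh mutation, dropped=1
    }
}
```

Note that the drop happens on the replica without any reply to the coordinator, which is why the client may still see success at its CL while a different replica silently misses the write.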

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:

> Hi All,
>  
> While benchmarking Cassandra I found “Mutation Dropped” messages in the logs. 
>  Now I know this is a good old question. It will be really great if someone 
> can provide a check list to recover when such a thing happens. I am looking 
> for answers of the following questions  -
>  
> 1.   Which parameters to tune in the config files? – Especially looking 
> for heavy writes
> 2.   What is the difference between TimedOutException and silently 
> dropping mutation messages while operating on a CL of QUORUM.
>  
>  
> Regards,
> Dushyant
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions 
> or views contained herein are not intended to be, and do not constitute, 
> advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform 
> and Consumer Protection Act. If you have received this communication in 
> error, please destroy all electronic and paper copies and notify the sender 
> immediately. Mistransmission is not intended to waive confidentiality or 
> privilege. Morgan Stanley reserves the right, to the extent permitted under 
> applicable law, to monitor electronic communications. This message is subject 
> to terms available at the following link: 
> http://www.morganstanley.com/disclaimers. If you cannot access these links, 
> please notify us by reply message and we will send the contents to you. By 
> messaging with Morgan Stanley you consent to the foregoing.



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Do you want to create two separate clusters or a single cluster with two data 
centres ? 

If it's the later, token selection is discussed here 
http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
 
> Moreover all tokens must be unique (even across datacenters), although - from 
> pure curiosity - I wonder what is the rationale behind this.
Otherwise data is not evenly distributed.

> By the way, can someone enlighten me about the first line in the output of 
> the nodetool. Obviously it contains a token, but nothing else. It seems like 
> a formatting glitch, but maybe it has a role. 
It's the exclusive lower bound token for the first node in the ring. This also 
happens to be the token for the last node in the ring. 

In your setup 
10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864

(does not imply primary replica, just used to map keys to nodes.)
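
The ownership rule above can be sketched as a sorted-map lookup (a simplification that ignores replication strategy; class and method names are invented):

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.TreeMap;

public class Ring {
    // Token -> node address; RandomPartitioner-style tokens, as in the ring above.
    static final TreeMap<BigInteger, String> ring = new TreeMap<>();

    // A key whose token is t maps to the first node whose token is >= t,
    // wrapping to the lowest token when t is past the last node.
    // Each node therefore covers the range (previous token, own token].
    static String ownerOf(BigInteger t) {
        Map.Entry<BigInteger, String> e = ring.ceilingEntry(t);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ring.put(BigInteger.ZERO, "10.0.0.19");
        ring.put(new BigInteger("85070591730234615865843651857942052864"), "10.0.0.28");
        System.out.println(ownerOf(BigInteger.ONE));                     // 10.0.0.28
        System.out.println(ownerOf(ring.lastKey().add(BigInteger.ONE))); // wraps to 10.0.0.19
    }
}
```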
 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

> You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a 
> multi-datacenter setup with two circles. You can start reading from this page:
> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
> 
> Moreover all tokens must be unique (even across datacenters), although - from 
> pure curiosity - I wonder what is the rationale behind this.
> 
> By the way, can someone enlighten me about the first line in the output of 
> the nodetool. Obviously it contains a token, but nothing else. It seems like 
> a formatting glitch, but maybe it has a role. 
> 
> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>> 
>> Hi!
>> I have a Cassandra  cluster with two nodes
>> 
>> nodetool ring -h localhost
>> Address DC  RackStatus State   LoadOwns  
>>   Token
>>  
>>   85070591730234615865843651857942052864
>> 10.0.0.19   datacenter1 rack1   Up Normal  488.74 KB   
>> 50.00%  0
>> 10.0.0.28   datacenter1 rack1   Up Normal  504.63 KB   
>> 50.00%  85070591730234615865843651857942052864
>> 
>> I want to create a second ring with the same name but two different nodes.
>> using tokengentool I get the same tokens as they are affected from the 
>> number of nodes in a ring.
>> 
>> My question is like this:
>> Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
>> In 10.0.0.31 cassandra.yaml I will set
>> initial_token: 0
>> seeds: "10.0.0.31"
>> listen_address: 10.0.0.31
>> rpc_address: 0.0.0.0
>> 
>> In 10.0.0.11 cassandra.yaml I will set
>> initial_token: 85070591730234615865843651857942052864
>> seeds: "10.0.0.31"
>> listen_address: 10.0.0.11
>> rpc_address: 0.0.0.0 
>> 
>> Would the rings be separate?
>> 
>> Thanks,
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> 
>> 
>> ta...@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
> 



Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Version 0.8.9

We run a 2 node cluster with RF=2. We ran a scrub and after that ran the
clearsnapshot to remove the backup snapshot created by scrub. It seems that
instead of removing the snapshot, clearsnapshot moved the data files from
the snapshot directory to the parent directory and the size of the data for
that keyspace has doubled. Many of the files are looking like duplicates.

in Keyspace1 directory
156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
118211555728 Jan 31 12:50 Standard1-g-7968-Data.db
118211555728 Mar  3 22:58 Standard1-g-8840-Data.db
116902342895 Feb 25 02:04 Standard1-g-8832-Data.db
116902342895 Mar  3 22:10 Standard1-g-8836-Data.db
93788425710 Feb 21 04:20 Standard1-g-8791-Data.db
93788425710 Mar  4 00:29 Standard1-g-8845-Data.db
.

Even though the nodetool ring command shows the correct data size for the
node, the du -sh on the keyspace directory gives double the size.

Can you guide us to proceed from this situation ?

Thanks.


Re: how stable is 1.0 these days?

2012-03-05 Thread Viktor Jevdokimov
1.0.7 is very stable: weeks in a high-load production environment without any
exception. 1.0.8 should be even more stable; check CHANGES.txt for what was
fixed.


2012/3/2 Marcus Eriksson 

> beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if
> you have many keys per node
>
> other than that, yep, it seems solid
>
> /Marcus
>
>
> On Wed, Feb 29, 2012 at 6:20 PM, Thibaut Britz <
> thibaut.br...@trendiction.com> wrote:
>
>> Thanks!
>>
>> We will test it on our test cluster in the coming weeks and hopefully put
>> it into production on our 200 node main cluster. :)
>>
>> Thibaut
>>
>> On Wed, Feb 29, 2012 at 5:52 PM, Edward Capriolo 
>> wrote:
>>
>>> On Wed, Feb 29, 2012 at 10:35 AM, Thibaut Britz
>>>  wrote:
>>> > Any more feedback on larger deployments of 1.0.*?
>>> >
>>> > We are eager to try out the new features in production, but don't want
>>> to
>>> > run into bugs as on former 0.7 and 0.8 versions.
>>> >
>>> > Thanks,
>>> > Thibaut
>>> >
>>> >
>>> >
>>> > On Tue, Jan 31, 2012 at 6:59 AM, Ben Coverston <
>>> ben.covers...@datastax.com>
>>> > wrote:
>>> >>
>>> >> I'm not sure what Carlo is referring to, but generally if you have
>>> done,
>>> >> thousands of migrations you can end up in a situation where the
>>> migrations
>>> >> take a long time to replay, and there are some race conditions that
>>> can be
>>> >> problematic in the case where there are thousands of migrations that
>>> may
>>> >> need to be replayed while a node is bootstrapped. If you get into this
>>> >> situation it can be fixed by copying migrations from a known good
>>> schema to
>>> >> the node that you are trying to bootstrap.
>>> >>
>>> >> Generally I would advise against frequent schema updates. Unlike rows
>>> in
>>> >> column families the schema itself is designed to be relatively static.
>>> >>
>>> >> On Mon, Jan 30, 2012 at 2:14 PM, Jim Newsham >> >
>>> >> wrote:
>>> >>>
>>> >>>
>>> >>> Could you also elaborate for creating/dropping column families?
>>>  We're
>>> >>> currently working on moving to 1.0 and using dynamically created
>>> tables, so
>>> >>> I'm very interested in what issues we might encounter.
>>> >>>
>>> >>> So far the only thing I've encountered (with 1.0.7 + hector 1.0-2) is
>>> >>> that dropping a cf may sometimes fail with UnavailableException.  I
>>> think
>>> >>> this happens when the cf is busy being compacted.  When I
>>> sleep/retry within
>>> >>> a loop it eventually succeeds.
>>> >>>
>>> >>> Thanks,
>>> >>> Jim
>>> >>>
>>> >>>
>>> >>> On 1/26/2012 7:32 AM, Pierre-Yves Ritschard wrote:
>>> 
>>>  Can you elaborate on the composite types instabilities ? is this
>>>  specific to hector as the radim's posts suggests ?
>>>  These one liner answers are quite stressful :)
>>> 
>>>  On Thu, Jan 26, 2012 at 1:28 PM, Carlo Pires
>>>   wrote:
>>> >
>>> > If you need to use composite types and create/drop column families
>>> on
>>> > the
>>> > fly you must be prepared to instabilities.
>>> >
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Ben Coverston
>>> >> DataStax -- The Apache Cassandra Company
>>> >>
>>> >
>>>
>>> I would call 1.0.7 rock fricken solid. Incredibly stable. It has been
>>> that way since I updated to 0.8.8  really. TBs of data, billions of
>>> requests a day, and thanks to JAMM, memtable type auto-tuning, and
>>> other enhancements I rarely, if ever, find a node in a state where it
>>> requires a restart. My clusters are beast-ing.
>>>
>>> There are always bugs in software, but coming from a guy who ran
>>> Cassandra 0.6.1, administration on my Cassandra cluster is like a
>>> vacation now.
>>>
>>
>>
>


Re: Huge amount of empty files in data directory.

2012-03-05 Thread Viktor Jevdokimov
After running Cassandra for 2 years in production on Windows servers,
starting from 0.7 beta2 up to 1.0.7, we have moved to Linux and forgot all
the hell we had on Windows. With JNA, the off-heap row cache and normally
working mmap on Linux you get much better performance and stability
compared to Windows, and less maintenance.

2012/3/1 Henrik Schröder 

> Great, thanks!
>
>
> /Henrik
>
>
> On Thu, Mar 1, 2012 at 13:08, Sylvain Lebresne wrote:
>
>> It's a bug, namely: https://issues.apache.org/jira/browse/CASSANDRA-3616
>> You'd want to upgrade.
>>
>> --
>> Sylvain
>>
>> On Thu, Mar 1, 2012 at 1:01 PM, Henrik Schröder 
>> wrote:
>> > Hi,
>> >
>> > We're running Cassandra 1.0.6 on Windows, and noticed that the amount of
>> > files in the datadirectory just keeps growing. We have about 60GB of
>> data
>> > per node, we do a major compaction about once a week, but after
>> compaction
>> > there's a lot of 0-byte temp files and old files that are kept for some
>> > reason. After 50 days of uptime there was around 5 files in each
>> > datadirectory, but when we restarted a server it deleted all the
>> unnecessary
>> > files and it shrunk down to about 200 files.
>> >
>> > We're running without compression, and with the regular compaction
>> strategy,
>> > not leveldb. I don't remember seeing this behaviour in older versions of
>> > Cassandra, shouldn't it delete temp files while running? Is it possible
>> to
>> > force it to delete temp files while running? Is this fixed in a later
>> > version? Or do we have to periodically restart servers to clean up the
>> > datadirectories?
>> >
>> >
>> > /Henrik Schröder
>>
>
>


Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
I want two separate clusters.
*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Mar 5, 2012 at 12:48 PM, aaron morton wrote:

> Do you want to create two separate clusters or a single cluster with two
> data centres ?
>
> If it's the later, token selection is discussed here
> http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
>
>
> Moreover all tokens must be unique (even across datacenters), although -
> from pure curiosity - I wonder what is the rationale behind this.
>
> Otherwise data is not evenly distributed.
>
> By the way, can someone enlighten me about the first line in the output of
> the nodetool. Obviously it contains a token, but nothing else. It seems
> like a formatting glitch, but maybe it has a role.
>
> It's the exclusive lower bound token for the first node in the ring. This
> also happens to be the token for the last node in the ring.
>
> In your setup
> 10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
> 10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864
>
> (does not imply primary replica, just used to map keys to nodes.)
>
>
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
>
>  You have to use PropertyFileSnitch and NetworkTopologyStrategy to create
> a multi-datacenter setup with two circles. You can start reading from this
> page:
>
> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
>
> Moreover all tokens must be unique (even across datacenters), although -
> from pure curiosity - I wonder what is the rationale behind this.
>
> By the way, can someone enlighten me about the first line in the output of
> the nodetool. Obviously it contains a token, but nothing else. It seems
> like a formatting glitch, but maybe it has a role.
>
> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>
> Hi!
> I have a Cassandra  cluster with two nodes
>
>  nodetool ring -h localhost
> Address DC  RackStatus State   Load
>  OwnsToken
>
>  85070591730234615865843651857942052864
> 10.0.0.19   datacenter1 rack1   Up Normal  488.74 KB
> 50.00%  0
> 10.0.0.28   datacenter1 rack1   Up Normal  504.63 KB
> 50.00%  85070591730234615865843651857942052864
>
>  I want to create a second ring with the same name but two different
> nodes.
> using tokengentool I get the same tokens as they are affected from the
> number of nodes in a ring.
>
>  My question is like this:
> Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
> *In 10.0.0.31 cassandra.yaml I will set*
> initial_token: 0
> seeds: "10.0.0.31"
> listen_address: 10.0.0.31
> rpc_address: 0.0.0.0
>
>  *In 10.0.0.11 cassandra.yaml I will set*
> initial_token: 85070591730234615865843651857942052864
> seeds: "10.0.0.31"
> listen_address: 10.0.0.11
> rpc_address: 0.0.0.0
>
>  *Would the rings be separate?*
>
>  Thanks,
>
>  *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> 
>
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956
>
>
>
>
>
>

Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Hontvári József Levente

I am thinking about the frequent example:

dc1 - node1: 0
dc1 - node2: large...number

dc2 - node1: 1
dc2 - node2: large...number + 1

In theory, using the same tokens in dc2 as in dc1 would not significantly 
affect key distribution; only the keys exactly on the border would 
move to the next node, which is not much. However, there seems to be 
an unexplained requirement (at least I could not find an 
explanation) that all nodes must have a unique token, even if they are 
put into a different circle by NetworkTopologyStrategy.
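
The token layout in the example above can be computed like this. This is only a sketch of the common convention (evenly spaced RandomPartitioner tokens, offset by the datacenter index to keep them unique), not an official tool:

```java
import java.math.BigInteger;

public class TokenGen {
    // RandomPartitioner token space is [0, 2^127).
    static final BigInteger RING_SIZE = BigInteger.valueOf(2).pow(127);

    // Evenly spaced tokens for a DC of nodeCount nodes, shifted by
    // dcOffset so tokens stay unique across datacenters.
    static BigInteger token(int nodeIndex, int nodeCount, int dcOffset) {
        return RING_SIZE.multiply(BigInteger.valueOf(nodeIndex))
                        .divide(BigInteger.valueOf(nodeCount))
                        .add(BigInteger.valueOf(dcOffset));
    }

    public static void main(String[] args) {
        for (int dc = 0; dc < 2; dc++)
            for (int n = 0; n < 2; n++)
                System.out.printf("dc%d - node%d: %s%n", dc + 1, n + 1, token(n, 2, dc));
    }
}
```

For two nodes per DC this reproduces the layout above: dc1 gets 0 and 85070591730234615865843651857942052864, dc2 gets the same values plus one.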





On 2012.03.05. 11:48, aaron morton wrote:
Moreover all tokens must be unique (even across datacenters), although 
- from pure curiosity - I wonder what is the rationale behind this.

Otherwise data is not evenly distributed.


RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Thanks a lot for the concurrent_writes hint, that really improves the 
throughput. Do you mean that dropped messages without a TimedOutException imply 
the data was written somewhere in the cluster, and that the desired CL can be 
achieved by taking corrective measures?



From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, March 05, 2012 4:15 PM
To: user@cassandra.apache.org
Subject: Re: Mutation Dropped Messages

1.   Which parameters to tune in the config files? - Especially looking for 
heavy writes
The node is overloaded. It may be because there are not enough nodes, or the 
node is under temporary stress such as GC or repair.
If you have spare IO / CPU capacity you could increase concurrent_writes to 
increase throughput on the write stage. You then need to ensure the commit log 
and, to a lesser degree, the data volumes can keep up.

2.   What is the difference between TimedOutException and silently dropping 
mutation messages while operating on a CL of QUORUM.
A TimedOutException means fewer than CL nodes responded to the coordinator before 
rpc_timeout. Dropped messages happen when a message is removed from a thread pool 
queue after rpc_timeout has passed. It is a feature of the 
architecture, and correct behaviour under stress.
Inconsistencies created by dropped messages are repaired via reads at a high CL, 
Hinted Handoff (in 1.x+), Read Repair or Anti-Entropy repair.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:


Hi All,

While benchmarking Cassandra I found "Mutation Dropped" messages in the logs.  
Now I know this is a good old question. It will be really great if someone can 
provide a check list to recover when such a thing happens. I am looking for 
answers of the following questions  -

1.   Which parameters to tune in the config files? - Especially looking for 
heavy writes
2.   What is the difference between TimedOutException and silently dropping 
mutation messages while operating on a CL of QUORUM.


Regards,
Dushyant

NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers. If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.


--
NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers. If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.


RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hey Aaron,

I increased the size of the cluster and also the concurrent_writes parameter. 
Still there is one node which keeps dropping mutation messages while the other 
nodes do not. I am using the Hector API and have done nothing for load balancing 
so far, just provided the host:port of the nodes in the CassandraHostConfigurator. 
Is this due to some improper load balancing? Also, the physical host where this 
node runs is under heavier load than the other nodes' hosts. What can I do to 
improve?
PS: The node is a seed of the cluster.

Thanks,
Dushyant

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, March 05, 2012 4:15 PM
To: user@cassandra.apache.org
Subject: Re: Mutation Dropped Messages

1.   Which parameters to tune in the config files? - Especially looking for 
heavy writes
The node is overloaded. It may be because there are not enough nodes, or the 
node is under temporary stress such as GC or repair.
If you have spare IO / CPU capacity you could increase concurrent_writes to 
increase throughput on the write stage. You then need to ensure the commit log 
and, to a lesser degree, the data volumes can keep up.

2.   What is the difference between TimedOutException and silently dropping 
mutation messages while operating on a CL of QUORUM.
A TimedOutException means fewer than CL nodes responded to the coordinator before 
rpc_timeout. Dropped messages happen when a message is removed from a thread pool 
queue after rpc_timeout has passed. It is a feature of the 
architecture, and correct behaviour under stress.
Inconsistencies created by dropped messages are repaired via reads at a high CL, 
Hinted Handoff (in 1.x+), Read Repair or Anti-Entropy repair.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:


Hi All,

While benchmarking Cassandra I found "Mutation Dropped" messages in the logs.  
Now I know this is a good old question. It will be really great if someone can 
provide a check list to recover when such a thing happens. I am looking for 
answers of the following questions  -

1.   Which parameters to tune in the config files? - Especially looking for 
heavy writes
2.   What is the difference between TimedOutException and silently dropping 
mutation messages while operating on a CL of QUORUM.


Regards,
Dushyant

NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers. If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.


--
NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers. If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.


Adding a second datacenter

2012-03-05 Thread David Koblas
Everything that I've read about data centers focuses on setting things 
up at the beginning of time.


I have the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I initially start adding nodes, the first 
machine to join the network attempts to replicate all of the data from 
DC1 and fills up its disk drive.  I've played with setting the 
storage_options to have a replication factor of 0; then I can bring up 
all 20 machines in DC2, but then I start getting a huge number of read 
errors from reads in DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Re: Adding a second datacenter

2012-03-05 Thread Jeremiah Jordan
You need to make sure your clients are reading using LOCAL_* settings so 
that they don't try to get data from the other data center.  But you 
shouldn't get errors while replication_factor is 0.  Once you change the 
replication factor to 4, you should get missing data if you are using 
LOCAL_* for reading.


What version are you using?

See the IRC logs at the begining of this JIRA discussion thread for some 
info:


https://issues.apache.org/jira/browse/CASSANDRA-3483

But you should be able to:
1. Set dc2:0 in the replication_factor.
2. Set bootstrap to false on the new nodes.
3. Start all of the new nodes.
4. Change the replication_factor to dc2:4.
5. Run repair on the nodes in dc2.

Once the repairs finish you should be able to start using DC2.  You are 
still going to need a bunch of extra space because the repair is going 
to get you a couple copies of the data.


Once 1.1 comes out it will have new nodetool commands for making this a 
little nicer per CASSANDRA-3483


-Jeremiah


On 03/05/2012 09:42 AM, David Koblas wrote:
Everything that I've read about data centers focuses on setting things 
up at the beginning of time.


I've the the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I initially start adding things, the first 
machine to join the network attempts to replicate all of the data from 
DC1 and fills up it's disk drive.  I've played with setting the 
storage_options to have a replication factor of 0, then I can bring up 
all 20 machines in DC2 but then start getting a huge number of read 
errors from read on DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Division by zero

2012-03-05 Thread Vanger

After upgrading from version 1.0.1 to 1.0.8 we started to get this exception:

ERROR [http-8095-1 WideEntityServiceImpl.java:142] - get: key1 - 
{type=RANGE, start=0, end=9223372036854775807, orderDesc=false, limit=1}
me.prettyprint.hector.api.exceptions.HCassandraInternalException: 
Cassandra encountered an internal error processing this request: 
TApplicationError type: 6 message:Internal error processing get_slice
at 
me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:31)
at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
at 
me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
at 
me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at 
me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at 
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at 
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at 
me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at 
me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at 
me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)



I already (not too soon?) created an issue in jira with more detailed 
description:

https://issues.apache.org/jira/browse/CASSANDRA-4000

Any ideas?

Thanks.


Re: Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Jeremiah Jordan
There is a requirement that all nodes have a unique token.  There is 
still one global cluster/ring on which each node needs to be unique.  The 
logically separate rings that NetworkTopologyStrategy puts them into are 
hidden from the rest of the code.


-Jeremiah

On 03/05/2012 05:13 AM, Hontvári József Levente wrote:

I am thinking about the frequent example:

dc1 - node1: 0
dc1 - node2: large...number

dc2 - node1: 1
dc2 - node2: large...number + 1

In theory, using the same tokens in dc2 as in dc1 does not 
significantly affect key distribution: only the two keys exactly on 
the borders would move to the next node, which is not much. However, there 
seems to be an unexplained requirement (at least I could not 
find an explanation) that all nodes must have a unique token, even if 
they are put into different circles by NetworkTopologyStrategy.





On 2012.03.05. 11:48, aaron morton wrote:
Moreover all tokens must be unique (even across datacenters), 
although - from pure curiosity - I wonder what is the rationale 
behind this.

Otherwise data is not evenly distributed.
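For illustration, the evenly-spaced-plus-offset token scheme discussed in this thread can be computed directly. This is a sketch for RandomPartitioner's 2^127 token space; the node counts are hypothetical:

```python
# Evenly spaced RandomPartitioner tokens for one datacenter, then the
# same tokens offset by +1 for a second datacenter, so every token in
# the global ring stays unique.
RING_SIZE = 2 ** 127  # RandomPartitioner token space: 0 .. 2^127 - 1

def tokens_for_dc(node_count, offset=0):
    """Evenly spaced initial_token values, shifted by a per-DC offset."""
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]

dc1 = tokens_for_dc(2)            # [0, 85070591730234615865843651857942052864]
dc2 = tokens_for_dc(2, offset=1)  # same spacing, every token bumped by one

# All four tokens are distinct, satisfying the global-uniqueness rule:
assert len(set(dc1 + dc2)) == 4
```

With two nodes per DC this reproduces the token 85070591730234615865843651857942052864 seen in the ring output elsewhere in this digest, and the +1 offset keeps dc2's tokens unique without materially changing key distribution.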


Re: how stable is 1.0 these days?

2012-03-05 Thread Thibaut Britz
Thanks for the feedback. I will certainly execute scrub after the update.


On Mon, Mar 5, 2012 at 11:55 AM, Viktor Jevdokimov wrote:

> 1.0.7 is very stable - weeks in a high-load production environment without
> any exception. 1.0.8 should be even more stable; check CHANGES.txt for what
> was fixed.
>
>
> 2012/3/2 Marcus Eriksson 
>
>> beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if
>> you have many keys per node
>>
>> other than that, yep, it seems solid
>>
>> /Marcus
>>
>>
>> On Wed, Feb 29, 2012 at 6:20 PM, Thibaut Britz <
>> thibaut.br...@trendiction.com> wrote:
>>
>>> Thanks!
>>>
>>> We will test it on our test cluster in the coming weeks and hopefully
>>> put it into production on our 200 node main cluster. :)
>>>
>>> Thibaut
>>>
>>> On Wed, Feb 29, 2012 at 5:52 PM, Edward Capriolo 
>>> wrote:
>>>
 On Wed, Feb 29, 2012 at 10:35 AM, Thibaut Britz
  wrote:
 > Any more feedback on larger deployments of 1.0.*?
 >
 > We are eager to try out the new features in production, but don't
 want to
 > run into bugs as on former 0.7 and 0.8 versions.
 >
 > Thanks,
 > Thibaut
 >
 >
 >
 > On Tue, Jan 31, 2012 at 6:59 AM, Ben Coverston <
 ben.covers...@datastax.com>
 > wrote:
 >>
 >> I'm not sure what Carlo is referring to, but generally if you have
 done,
 >> thousands of migrations you can end up in a situation where the
 migrations
 >> take a long time to replay, and there are some race conditions that
 can be
 >> problematic in the case where there are thousands of migrations that
 may
 >> need to be replayed while a node is bootstrapped. If you get into
 this
 >> situation it can be fixed by copying migrations from a known good
 schema to
 >> the node that you are trying to bootstrap.
 >>
 >> Generally I would advise against frequent schema updates. Unlike
 rows in
 >> column families the schema itself is designed to be relatively
 static.
 >>
 >> On Mon, Jan 30, 2012 at 2:14 PM, Jim Newsham <
 jnews...@referentia.com>
 >> wrote:
 >>>
 >>>
 >>> Could you also elaborate for creating/dropping column families?
  We're
 >>> currently working on moving to 1.0 and using dynamically created
 tables, so
 >>> I'm very interested in what issues we might encounter.
 >>>
 >>> So far the only thing I've encountered (with 1.0.7 + hector 1.0-2)
 is
 >>> that dropping a cf may sometimes fail with UnavailableException.  I
 think
 >>> this happens when the cf is busy being compacted.  When I
 sleep/retry within
 >>> a loop it eventually succeeds.
 >>>
 >>> Thanks,
 >>> Jim
 >>>
 >>>
 >>> On 1/26/2012 7:32 AM, Pierre-Yves Ritschard wrote:
 
  Can you elaborate on the composite types instabilities ? is this
  specific to hector as the radim's posts suggests ?
  These one liner answers are quite stressful :)
 
  On Thu, Jan 26, 2012 at 1:28 PM, Carlo Pires
   wrote:
 >
 > If you need to use composite types and create/drop column
 families on
 > the
 > fly you must be prepared to instabilities.
 >
 >>>
 >>
 >>
 >>
 >> --
 >> Ben Coverston
 >> DataStax -- The Apache Cassandra Company
 >>
 >

 I would call 1.0.7 rock fricken solid. Incredibly stable. It has been
 that way since I updated to 0.8.8  really. TBs of data, billions of
 requests a day, and thanks to JAMM, memtable type auto-tuning, and
 other enhancements I rarely, if ever, find a node in a state where it
 requires a restart. My clusters are beast-ing.

 There are always bugs in software, but this is coming from a guy who ran
 Cassandra 0.6.1. Administration on my Cassandra cluster is like a
 vacation now.

>>>
>>>
>>
>


Re: Adding a second datacenter

2012-03-05 Thread David Koblas

Jeremiah,

Thanks!

I'm running 1.0.8, two interesting things to note:

- I don't have sufficient disk space to handle the straight bump to a 
replication factor of 4, so I think I'm going to have to do it one by 
one (1,2,3 and 4) with a bunch of cleanups in between.


- Also, using LOCAL_QUORUM doesn't work: my application has a hard 
response-time limit, and with LOCAL_QUORUM my read speed ends up being the 
speed of the slowest replica.  What I want is LOCAL_ONE, which doesn't exist 
in the API (unless I missed something).


Yes, CASSANDRA-3483 is really what I'm looking for.

--david

On 3/5/12 8:02 AM, Jeremiah Jordan wrote:
You need to make sure your clients are reading using LOCAL_* settings 
so that they don't try to get data from the other data center.  But 
you shouldn't get errors while replication_factor is 0.  Once you 
change the replication factor to 4, you may get missing data if you 
are using LOCAL_* for reading (until the repair completes).


What version are you using?

See the IRC logs at the begining of this JIRA discussion thread for 
some info:


https://issues.apache.org/jira/browse/CASSANDRA-3483

But you should be able to:
1. Set dc2:0 in the replication_factor.
2. Set bootstrap to false on the new nodes.
3. Start all of the new nodes.
4. Change replication_factor to dc2:4.
5. Run repair on the nodes in dc2.

Once the repairs finish you should be able to start using DC2.  You 
are still going to need a bunch of extra space because the repair is 
going to get you a couple copies of the data.


Once 1.1 comes out it will have new nodetool commands for making this 
a little nicer per CASSANDRA-3483


-Jeremiah


On 03/05/2012 09:42 AM, David Koblas wrote:
Everything that I've read about data centers focuses on setting 
things up at the beginning of time.


I have the following situation:

10 machines in a datacenter (DC1), with replication factor of 2.

I want to set up a second data center (DC2) with the following 
configuration:

  20 machines with a replication factor of 4

What I've found is that if I initially start adding things, the first 
machine to join the network attempts to replicate all of the data 
from DC1 and fills up its disk drive.  I've played with setting the 
storage options to a replication factor of 0; then I can bring 
up all 20 machines in DC2, but I start getting a huge number of 
read errors on DC1.


Is there a simple cookbook on how to add a second DC?  I'm currently 
trying to set the replication factor to 1 and do a repair, but that 
doesn't feel like the right approach.


Thanks,





Re: Issue with nodetool clearsnapshot

2012-03-05 Thread aaron morton
> It seems that instead of removing the snapshot, clearsnapshot moved the data 
> files from the snapshot directory to the parent directory and the size of the 
> data for that keyspace has doubled.
That is not possible; the only code there deletes files in the 
snapshot. 

Note that the snapshot contains hard links to the files in the data dir. Deleting 
/ clearing the snapshot will not delete the files from the data dir if they are 
still in use. 
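The hard-link behaviour Aaron describes can be reproduced outside Cassandra with a short sketch (pure Python on a POSIX filesystem; the sstable file names are made up for illustration):

```python
import os
import tempfile

# Snapshots are hard links: removing the snapshot entry does not delete
# the underlying data while the original directory entry still uses it.
d = tempfile.mkdtemp()
data = os.path.join(d, "Standard1-g-1-Data.db")           # "live" sstable
snap = os.path.join(d, "Standard1-g-1-Data.db.snapshot")  # snapshot entry

with open(data, "wb") as f:
    f.write(b"x" * 1024)

os.link(data, snap)                 # take the "snapshot"
assert os.stat(data).st_nlink == 2  # two names, one inode, one copy on disk

os.remove(snap)                     # "clearsnapshot"
assert os.path.exists(data)         # live file untouched
assert os.stat(data).st_nlink == 1  # only the data-dir entry remains
```

This is also why tools that sum per-path file sizes can report double the real disk usage while a snapshot exists: two directory entries point at the same bytes.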

>  Many of the files are looking like duplicates.
> 
> in Keyspace1 directory
> 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
> 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
Under 0.8.x files are not immediately deleted. Did the data directory contain 
zero size -Compacted files with the same number ?
  
Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/03/2012, at 11:50 PM, B R wrote:

> Version 0.8.9
> 
> We run a 2 node cluster with RF=2. We ran a scrub and after that ran the 
> clearsnapshot to remove the backup snapshot created by scrub. It seems that 
> instead of removing the snapshot, clearsnapshot moved the data files from the 
> snapshot directory to the parent directory and the size of the data for that 
> keyspace has doubled. Many of the files are looking like duplicates.
> 
> in Keyspace1 directory
> 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
> 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
> 118211555728 Jan 31 12:50 Standard1-g-7968-Data.db
> 118211555728 Mar  3 22:58 Standard1-g-8840-Data.db
> 116902342895 Feb 25 02:04 Standard1-g-8832-Data.db
> 116902342895 Mar  3 22:10 Standard1-g-8836-Data.db
> 93788425710 Feb 21 04:20 Standard1-g-8791-Data.db
> 93788425710 Mar  4 00:29 Standard1-g-8845-Data.db
> .
> 
> Even though the nodetool ring command shows the correct data size for the 
> node, the du -sh on the keyspace directory gives double the size.
> 
> Can you guide us to proceed from this situation ?
> 
> Thanks.



Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Create nodes that do not share seeds, and give the clusters different names as 
a safety measure. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:

> I want two separate clusters.
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> 
> ta...@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Mon, Mar 5, 2012 at 12:48 PM, aaron morton  wrote:
> Do you want to create two separate clusters or a single cluster with two data 
> centres ? 
> 
> If it's the later, token selection is discussed here 
> http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
>  
>> Moreover all tokens must be unique (even across datacenters), although - 
>> from pure curiosity - I wonder what is the rationale behind this.
> Otherwise data is not evenly distributed.
> 
>> By the way, can someone enlighten me about the first line in the output of 
>> the nodetool. Obviously it contains a token, but nothing else. It seems like 
>> a formatting glitch, but maybe it has a role. 
> It's the exclusive lower bound token for the first node in the ring. This 
> also happens to be the token for the last node in the ring. 
> 
> In your setup 
> 10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
> 10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864
> 
> (does not imply primary replica, just used to map keys to nodes.)
>  
> 
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
> 
>> You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a 
>> multi-datacenter setup with two circles. You can start reading from this 
>> page:
>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
>> 
>> Moreover all tokens must be unique (even across datacenters), although - 
>> from pure curiosity - I wonder what is the rationale behind this.
>> 
>> By the way, can someone enlighten me about the first line in the output of 
>> the nodetool. Obviously it contains a token, but nothing else. It seems like 
>> a formatting glitch, but maybe it has a role. 
>> 
>> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>>> Hi!
>>> I have a Cassandra  cluster with two nodes
>>> 
>>> nodetool ring -h localhost
>>> Address DC  RackStatus State   LoadOwns 
>>>Token
>>> 
>>>85070591730234615865843651857942052864
>>> 10.0.0.19   datacenter1 rack1   Up Normal  488.74 KB   
>>> 50.00%  0
>>> 10.0.0.28   datacenter1 rack1   Up Normal  504.63 KB   
>>> 50.00%  85070591730234615865843651857942052864
>>> 
>>> I want to create a second ring with the same name but two different nodes.
>>> using tokengentool I get the same tokens as they are affected from the 
>>> number of nodes in a ring.
>>> 
>>> My question is like this:
>>> Lets say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
>>> In 10.0.0.31 cassandra.yaml I will set
>>> initial_token: 0
>>> seeds: "10.0.0.31"
>>> listen_address: 10.0.0.31
>>> rpc_address: 0.0.0.0
>>> 
>>> In 10.0.0.11 cassandra.yaml I will set
>>> initial_token: 85070591730234615865843651857942052864
>>> seeds: "10.0.0.31"
>>> listen_address: 10.0.0.11
>>> rpc_address: 0.0.0.0 
>>> 
>>> Would the rings be separate?
>>> 
>>> Thanks,
>>> 
>>> Tamar Fraenkel 
>>> Senior Software Engineer, TOK Media 
>>> 
>>> 
>>> 
>>> 
>>> ta...@tok-media.com
>>> Tel:   +972 2 6409736 
>>> Mob:  +972 54 8356490 
>>> Fax:   +972 2 5612956 
>>> 
>>> 
>>> 
>> 
> 
> 



Re: Division by zero

2012-03-05 Thread aaron morton
(Commented in the ticket as well)
What is the error in the server log ? 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 5:04 AM, Vanger wrote:

> After upgrading from version 1.0.1 to 1.0.8 we started to get an exception:
> 
> ERROR [http-8095-1 WideEntityServiceImpl.java:142] - get: key1 - {type=RANGE, 
> start=0, end=9223372036854775807, orderDesc=false, limit=1}
> me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra 
> encountered an internal error processing this request: TApplicationError 
> type: 6 message:Internal error processing get_slice
> at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:31)
> at 
> me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
> at 
> me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
> at 
> me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
> at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
> at 
> me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
> at 
> me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
> at 
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
> at 
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
> at 
> me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
> at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
> at 
> me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
> 
> 
> I already (not too soon?) created an issue in jira with more detailed 
> description:
> https://issues.apache.org/jira/browse/CASSANDRA-4000
> 
> Any ideas?
> 
> Thanks.



Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
> I increased the size of the cluster and also the concurrent_writes parameter. 
> Still there is a node which keeps on dropping mutation messages.
Ensure all the nodes have the same spec, and the nodes have the same config. In 
a virtual environment consider moving the node.

> Is this due to some improper load balancing? 
What does nodetool ring say, and what sort of queries (and RF and CL) are you 
sending?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 3:58 AM, Tiwari, Dushyant wrote:

> Hey Aaron,
>  
> I increased the size of the cluster and also the concurrent_writes parameter. 
> Still there is a node which keeps on dropping mutation messages. The 
> other nodes are not dropping mutation messages. I am using the Hector API and 
> had done nothing for load balancing so far; I just provided the host:port of 
> the nodes in the CassandraHostConfigurator. Is this due to some improper load 
> balancing? Also, the physical host where the node runs carries a relatively 
> heavier load than the other nodes' hosts. What can I do to improve?
> PS: The node is seed of the cluster.
>  
> Thanks,
> Dushyant
>  
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Monday, March 05, 2012 4:15 PM
> To: user@cassandra.apache.org
> Subject: Re: Mutation Dropped Messages
>  
> 1.   Which parameters to tune in the config files? – Especially looking 
> for heavy writes
> The node is overloaded. It may be because there are no enough nodes, or the 
> node is under temporary stress such as GC or repair. 
> If you have spare IO / CPU capacity you could increase concurrent_writes to 
> increase throughput on the write stage. You then need to ensure the commit 
> log and, to a lesser degree, the data volumes can keep up. 
>  
> 2.   What is the difference between TimedOutException and silently 
> dropping mutation messages while operating on a CL of QUORUM.
> TimedOutException means CL nodes did not respond to the coordinator before 
> rpc_timeout. Dropping messages happens when a message is removed from the 
> queue in a thread pool after rpc_timeout has occurred. It is a feature of 
> the architecture, and correct behaviour under stress. 
> Inconsistencies created by dropped messages are repaired via reads at high 
> CL, HH (in 1.+), Read Repair or Anti Entropy.
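As an illustration only (not Cassandra's actual code), the drop-on-expiry behaviour of such a queue can be sketched as follows; the timeout value and message names are hypothetical:

```python
RPC_TIMEOUT_MS = 10_000  # hypothetical rpc_timeout

def drain(queue, now_ms, timeout_ms=RPC_TIMEOUT_MS):
    """Process queued (enqueued_at_ms, mutation) pairs; drop expired ones.

    Mirrors the behaviour described above: a mutation that has already
    waited longer than rpc_timeout is dropped rather than applied, since
    the coordinator has given up waiting for it anyway.
    """
    applied, dropped = [], 0
    for enqueued_at, mutation in queue:
        if now_ms - enqueued_at > timeout_ms:
            dropped += 1            # counted as a "Mutation Dropped" message
        else:
            applied.append(mutation)
    return applied, dropped

q = [(0, "m1"), (5_000, "m2"), (9_500, "m3")]
applied, dropped = drain(q, now_ms=12_000)
assert applied == ["m2", "m3"] and dropped == 1  # m1 waited 12s > 10s
```

The point of the sketch: dropping is bookkeeping on the replica, while TimedOutException is what the coordinator raises to the client when CL replicas fail to answer in time.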
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:
> 
> 
> Hi All,
>  
> While benchmarking Cassandra I found “Mutation Dropped” messages in the logs. 
>  Now I know this is a good old question. It will be really great if someone 
> can provide a check list to recover when such a thing happens. I am looking 
> for answers of the following questions  -
>  
> 1.   Which parameters to tune in the config files? – Especially looking 
> for heavy writes
> 2.   What is the difference between TimedOutException and silently 
> dropping mutation messages while operating on a CL of QUORUM.
>  
>  
> Regards,
> Dushyant
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions 
> or views contained herein are not intended to be, and do not constitute, 
> advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform 
> and Consumer Protection Act. If you have received this communication in 
> error, please destroy all electronic and paper copies and notify the sender 
> immediately. Mistransmission is not intended to waive confidentiality or 
> privilege. Morgan Stanley reserves the right, to the extent permitted under 
> applicable law, to monitor electronic communications. This message is subject 
> to terms available at the following link: 
> http://www.morganstanley.com/disclaimers. If you cannot access these links, 
> please notify us by reply message and we will send the contents to you. By 
> messaging with Morgan Stanley you consent to the foregoing.
>  



Re: Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Hi Aaron,

1) Since you mentioned hard links, I would like to add that our data
directory itself is a symlink. Could that be causing an issue?

2) Yes, there are 0-byte files with the same numbers
in the Keyspace1 directory:
0 Mar  4 01:33 Standard1-g-7317-Compacted
0 Mar  3 22:58 Standard1-g-7968-Compacted
0 Mar  3 23:10 Standard1-g-8778-Compacted
0 Mar  3 23:47 Standard1-g-8782-Compacted
...

I restarted the node, it went about deleting the files, and the disk
space has been released. Can this be done using nodetool, without
restarting?

Thanks.

On Mon, Mar 5, 2012 at 10:59 PM, aaron morton wrote:

>   It seems that instead of removing the snapshot, clearsnapshot moved the
> data files from the snapshot directory to the parent directory and the size
> of the data for that keyspace has doubled.
>
> That is not possible; the only code there deletes files in the
> snapshot.
>
> Note that in the snapshot are hard links to the files in the data dir.
> Deleting / clearing the snapshot will not delete the files from the data
> dir if they are still in use.
>
>  Many of the files are looking like duplicates.
>
> in Keyspace1 directory
> 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
> 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
>
> Under 0.8.x files are not immediately deleted. Did the data directory
> contain zero size -Compacted files with the same number ?
>
> Cheers
>
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/03/2012, at 11:50 PM, B R wrote:
>
> Version 0.8.9
>
> We run a 2 node cluster with RF=2. We ran a scrub and after that ran the
> clearsnapshot to remove the backup snapshot created by scrub. It seems that
> instead of removing the snapshot, clearsnapshot moved the data files from
> the snapshot directory to the parent directory and the size of the data for
> that keyspace has doubled. Many of the files are looking like duplicates.
>
> in Keyspace1 directory
> 156987786084 Jan 21 03:18 Standard1-g-7317-Data.db
> 156987786084 Mar  4 01:33 Standard1-g-8850-Data.db
> 118211555728 Jan 31 12:50 Standard1-g-7968-Data.db
> 118211555728 Mar  3 22:58 Standard1-g-8840-Data.db
> 116902342895 Feb 25 02:04 Standard1-g-8832-Data.db
> 116902342895 Mar  3 22:10 Standard1-g-8836-Data.db
> 93788425710 Feb 21 04:20 Standard1-g-8791-Data.db
> 93788425710 Mar  4 00:29 Standard1-g-8845-Data.db
> .
>
> Even though the nodetool ring command shows the correct data size for the
> node, the du -sh on the keyspace directory gives double the size.
>
> Can you guide us to proceed from this situation ?
>
> Thanks.
>
>
>


hector connection pool

2012-03-05 Thread Daning Wang
I just got this error ": All host pools marked down. Retry burden pushed
out to client." in a few clients recently. The clients could not recover; we
had to restart the client application. We are using Hector 0.8.0.3.

At that time we ran a compaction for a CF; it took several hours and the
server was busy. But I think the client should recover once the server load
goes down.

Any bug reported about this? I did search but could not find one.

Thanks,

Daning


RE: Secondary indexes don't go away after metadata change

2012-03-05 Thread Frisch, Michael
Thank you very much for your response.  It is true that the older, previously 
existing nodes are not snapshotting the indexes that I had removed.  I'll go 
ahead and just delete those SSTables from the data directory.  They may be 
around still because they were created back when we used 0.8.

The more troubling issue is with adding new nodes to the cluster though.  It 
built indexes for column families that have had all indexes dropped weeks or 
months in the past.  It also will snapshot the index SSTables that it created.  
The index files are non-empty as well, some are hundreds of megabytes.

All nodes have the same schema, none list themselves as having the rows 
indexed.  I cannot drop the indexes via the CLI either because it says that 
they don't exist.  It's quite perplexing.

- Mike


From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, March 05, 2012 3:58 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change

The secondary index CF's are marked as no longer required / marked as 
compacted. under 1.x they would then be deleted reasonably quickly, and 
definitely deleted after a restart.

Is there a zero length .Compacted file there ?

Also, when adding a new node to the ring the new node will build indexes for 
the ones that supposedly don't exist any longer.  Is this supposed to happen?  
Would this have happened if I had deleted the old SSTables from the previously 
existing nodes?
Check you have a consistent schema using describe cluster in the CLI. And check 
the schema is what you think it is using show schema.

Another trick is to do a snapshot. Only the files in use are included the 
snapshot.

Hope that helps.

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/03/2012, at 2:53 AM, Frisch, Michael wrote:


I have a few column families that I decided to get rid of the secondary indexes 
on.  I see that there aren't any new index SSTables being created, but all of 
the old ones remain (some from as far back as September).  Is it safe to just 
delete then when the node is offline?  Should I run clean-up or scrub?

Also, when adding a new node to the ring the new node will build indexes for 
the ones that supposedly don't exist any longer.  Is this supposed to happen?  
Would this have happened if I had deleted the old SSTables from the previously 
existing nodes?

The nodes in question have either been upgraded from v0.8.1 => v1.0.2 (scrubbed 
at this time) => v1.0.6 or from v1.0.2 => v1.0.6.  The secondary index was 
dropped when the nodes were version 1.0.6.  The new node added was also 1.0.6.

- Mike



Cassandra cache patterns with thin and wide rows

2012-03-05 Thread Maciej Miklas
I've asked this question already on stackoverflow but without an answer - I
will try again:


My use case expects heavy read load - there are two possible model design
strategies:

   1. Tiny rows with row cache: in this case a row is small enough to fit
   into RAM and all columns are cached. Read access should be fast.

   2. Wide rows with key cache: wide rows with a large number of columns
   are too big for the row cache. Access to a column subset requires an
   HDD seek.

As I understand it, using wide rows is a good design pattern. But we would
need to disable the row cache - so what is the benefit of such a wide row
(at least for read access)?

Which approach is better, 1 or 2?
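One rough way to weigh option 1 against option 2 is simple capacity arithmetic: estimate whether the hot set fits in the row cache at all. The numbers and the overhead factor below are hypothetical, not measured values:

```python
def fits_in_row_cache(rows, cols_per_row, bytes_per_col, cache_bytes,
                      overhead=2.0):
    """Back-of-envelope check: can the whole hot set live in the row cache?

    `overhead` is a fudge factor for JVM object overhead on cached rows;
    the real multiplier depends on the Cassandra version and settings.
    """
    raw = rows * cols_per_row * bytes_per_col
    return raw * overhead <= cache_bytes

# 1M tiny rows x 10 columns x 512 B ~= 5 GB raw -> ~10 GB cached:
# too big for an 8 GB cache, so "tiny rows + row cache" already fails here.
assert not fits_in_row_cache(1_000_000, 10, 512, cache_bytes=8 * 1024**3)

# A 100k-row hot set with the same shape fits comfortably.
assert fits_in_row_cache(100_000, 10, 512, cache_bytes=8 * 1024**3)
```

If the hot set does not fit, the wide-row/key-cache design avoids evicting and re-reading entire rows; if it does fit, the tiny-row/row-cache design can serve reads from RAM.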


Re: hector connection pool

2012-03-05 Thread Maciej Miklas
Have you tried to change:
me.prettyprint.cassandra.service.CassandraHostConfigurator#retryDownedHostsDelayInSeconds
?

Hector will ping down hosts every xx seconds and recover connection.

Regards,
Maciej
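The recovery mechanism Maciej describes amounts to a background timer that periodically re-probes hosts marked down. A minimal sketch of that pattern (this is an illustration, not Hector's actual implementation; the host addresses are made up):

```python
class HostPool:
    """Marks failed hosts down and re-probes them on a fixed interval."""

    def __init__(self, hosts, retry_delay_s=30):
        self.up = set(hosts)
        self.down = {}                  # host -> time it was marked down
        self.retry_delay_s = retry_delay_s

    def mark_down(self, host, now_s):
        self.up.discard(host)
        self.down[host] = now_s

    def retry_downed_hosts(self, now_s, ping):
        """Called by a background timer: ping hosts whose delay has elapsed."""
        for host, since in list(self.down.items()):
            if now_s - since >= self.retry_delay_s and ping(host):
                del self.down[host]
                self.up.add(host)       # host recovered, back in the pool

pool = HostPool(["10.0.0.1", "10.0.0.2"], retry_delay_s=30)
pool.mark_down("10.0.0.1", now_s=100)
pool.retry_downed_hosts(now_s=120, ping=lambda h: True)   # too early: stays down
assert "10.0.0.1" not in pool.up
pool.retry_downed_hosts(now_s=131, ping=lambda h: True)   # delay elapsed: back up
assert "10.0.0.1" in pool.up
```

The failure mode in the original report corresponds to every host landing in `down` and never passing the re-probe, at which point the client can only surface "All host pools marked down" to the caller.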

On Mon, Mar 5, 2012 at 8:13 PM, Daning Wang  wrote:

> I just got this error ": All host pools marked down. Retry burden pushed
> out to client." in a few clients recently, client could not  recover, we
> have to restart client application.  we are using 0.8.0.3 hector.
>
> At that time we did compaction  for a CF, it takes several hours, server
> was busy. But I think client should recover after server load was down.
>
> Any bug reported about this? I did search but could not find one.
>
> Thanks,
>
> Daning
>
>


Re: Cassandra cache patterns with thin and wide rows

2012-03-05 Thread Viktor Jevdokimov
It depends on how large the data set is (specifically the hot data) compared
to available RAM, how heavy the read load is, and what the latency
requirements are.


2012/3/6 Maciej Miklas 

> I've asked this question already on stackoverflow but without an answer - I
> will try again:
>
>
> My use case expects heavy read load - there are two possible model design
> strategies:
>
>    1. Tiny rows with row cache: in this case a row is small enough to fit
>    into RAM and all columns are cached. Read access should be fast.
>
>    2. Wide rows with key cache: wide rows with a large number of columns
>    are too big for the row cache. Access to a column subset requires an
>    HDD seek.
>
> As I understand it, using wide rows is a good design pattern. But we would
> need to disable the row cache - so what is the benefit of such a wide row
> (at least for read access)?
>
> Which approach is better, 1 or 2?
>


Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Works..

But during the night my setup encountered a problem.
I have two VMs on my cluster (running on VmWare ESXi).
Each VM has 1 GB memory and two virtual disks of 16 GB.
They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB memory
(together with two other VMs).
I put cassandra data on the second disk of each machine.
VMs are running Ubuntu 11.10 and cassandra 1.0.7.

I left them running overnight, and this morning when I came in:
In one node cassandra was down, and the last thing in the system.log is:

 INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java
(line 113) Compacting
[SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'),
SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
 INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java
(line 221) Compacted to
[/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,].
 32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at
8.144165MB/s.  Time: 3,097ms.


The other node was using all its CPU and I had to restart it.
After that, I can see that the last lines in its system.log say that the
other node is down...

 INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246)
Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556
serialized/live bytes, 21173 ops)
 INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283)
Completed flushing
/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930
bytes)
 INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818)
InetAddress /10.0.0.31 is now dead.

How can I trace why that happened?
Also, I brought Cassandra up on both nodes. They both spent a long time
reading commit logs, but now they seem to be running.
Any idea how to debug or improve my setup?
Thanks,
Tamar



*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Mar 5, 2012 at 7:30 PM, aaron morton wrote:

> Create nodes that do not share seeds, and give the clusters different
> names as a safety measure.
>
> Cheers
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:
>
> I want two separate clusters.
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> 
>
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956
>
>
>
>
>
> On Mon, Mar 5, 2012 at 12:48 PM, aaron morton wrote:
>
>> Do you want to create two separate clusters or a single cluster with two
>> data centres ?
>>
>> If it's the later, token selection is discussed here
>> http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
>>
>>
>> Moreover all tokens must be unique (even across datacenters), although -
>> from pure curiosity - I wonder what is the rationale behind this.
>>
>> Otherwise data is not evenly distributed.
>>
>> By the way, can someone enlighten me about the first line in the output
>> of the nodetool. Obviously it contains a token, but nothing else. It seems
>> like a formatting glitch, but maybe it has a role.
>>
>> It's the exclusive lower bound token for the first node in the ring. This
>> also happens to be the token for the last node in the ring.
>>
>> In your setup
>> 10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
>> 10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864
>>
>> (does not imply primary replica, just used to map keys to nodes.)
>>
>>
>>
>>   -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
>>
>>  You have to use PropertyFileSnitch and NetworkTopologyStrategy to create
>> a multi-datacenter setup with two circles. You can start reading from this
>> page:
>>
>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
>>
>> Moreover all tokens must be unique (even across datacenters), although -
>> from pure curiosity - I wonder what is the rationale behind this.
>>
>> By the way, can someone enlighten me about the first line in the output
>> of the nodetool. Obviously it contains a token, but nothing else. It seems
>> like a formatting glitch, but maybe it has a role.
>>
>> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>>
>> Hi!
>> I have a Cassandra  cluster with two nodes
>>
>>  nodetool ring -h localhost
>> Address DC  RackStatus State   Load
>>  OwnsToken
>>
>>  85070591730234615865843651857942052864
>> 10.0.0.19   datacenter1 rack1   Up Normal  488.7