Re: INSERT ... IF NOT EXISTS with some nodes unavailable

2014-06-03 Thread Frederick Haebin Na
Hello Mitchell,

I think it is due to your replication factor, which, I assume, is 2 since
you have only 2 nodes in the cluster.
With RF 2, QUORUM requires both replicas to respond, so Cassandra cannot
run queries that require QUORUM participants while one of your two nodes is
down.
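
As a quick illustration, here is a sketch of the standard quorum arithmetic
(plain Python, not tied to any driver):

# QUORUM needs floor(RF / 2) + 1 replicas to respond
def quorum(rf):
    return rf // 2 + 1

print(quorum(2))  # 2: with RF=2 both replicas are needed, so one node down fails QUORUM
print(quorum(3))  # 2: with RF=3 one replica may be down and QUORUM still succeeds

Conditional writes (IF NOT EXISTS) go through Paxos, which also needs a
quorum of replicas, so the same arithmetic applies at SERIAL consistency.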

So, I think you have to expand your cluster to 3 nodes and make the
replication factor 3.
HTH

Haebin


2014-06-03 12:09 GMT+09:00 Ackerman, Mitchell :

>  Hi, I’m trying to get a query using
>
>
>
>INSERT ... IF NOT EXISTS
>
>
>
> working when not all of the nodes are available.  As a test case I have 2
> nodes, one in AWS us-west-1, another in AWS eu-west-1.  The keyspace
> settings are described below.  When I only have one of the nodes available,
> the insert fails with an UnavailableException (via
> TokenRangeOfflineException, see below for details).
>
>
>
> From reading about the lightweight transactions (
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
> and the CQL expression, it looks like this scenario should be supported.
> Does anyone have any idea why it is not working?
>
>
>
> You will notice that I’m using Astyanax, could this be a source of
> problems for this use case?
>
>
>
> I presume that a Consistency Level of Serial is in use and that a quorum
> of 1 should suffice for a 2 node cache.
>
>
>
> Thanks, Mitchell
>
>
>
> conferenceassignmentcache
>
>
>
> replica_placement_strategy
> org.apache.cassandra.locator.NetworkTopologyStrategy
>
> Replication Strategy Options
>
> us-west   1
>
> eu-west   1
>
>
>
> CREATE TABLE conferenceassignmentcache.conferenceassignmentcache_cf (
>
>   id text PRIMARY KEY,
>
>   value text
>
> ) WITH
>
>   bloom_filter_fp_chance=0.01 AND
>
>   caching='KEYS_ONLY' AND
>
>   comment='' AND
>
>   dclocal_read_repair_chance=0.00 AND
>
>   gc_grace_seconds=864000 AND
>
>   read_repair_chance=0.10 AND
>
>   replicate_on_write='true' AND
>
>   populate_io_cache_on_flush='false' AND
>
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>
>   compression={'sstable_compression': 'LZ4Compressor'};
>
>
>
> When I try to INSERT a row with only one node available I get the
> following exception:
>
>
>
> 2014-06-02 20:57:59,947 [eventTaskExecutor-15] DEBUG
> [CassandraConferenceAssignmentManager.getInsertStatement() 53] - INSERT
> INTO conferenceassignmentcache_cf (id, value) VALUES (?, ?) IF NOT EXISTS
> USING TTL 60;
>
> 2014-06-02 20:58:00,084 [eventTaskExecutor-15] DEBUG
> [ThriftConverter.ToConnectionPoolException() 157] -
>
> 2014-06-02 20:58:00,086 [eventTaskExecutor-15] ERROR
> [CountingConnectionPoolMonitor.trackError() 94] -
> *com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException*:
> *TokenRangeOfflineException*: 
> [host=cache.alpha.us-west-1.bobdev.com(10.89.0.37):9160,
> latency=137(137), attempts=1]UnavailableException()
>
> *com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException*:
> *TokenRangeOfflineException*: 
> [host=cache.alpha.us-west-1.bobdev.com(10.89.0.37):9160,
> latency=137(137), attempts=1]UnavailableException()
>
>
>
> Caused by: UnavailableException()
>
>at
> org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result.read(Cassandra.java:41876)
>
>at
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>
>at
> org.apache.cassandra.thrift.Cassandra$Client.recv_execute_prepared_cql3_query(Cassandra.java:1689)
>
>at
> org.apache.cassandra.thrift.Cassandra$Client.execute_prepared_cql3_query(Cassandra.java:1674)
>
>at
> com.netflix.astyanax.thrift.ThriftCql3Query.execute_prepared_cql_query(ThriftCql3Query.java:29)
>
>at
> com.netflix.astyanax.thrift.AbstractThriftCqlQuery$3$1.internalExecute(AbstractThriftCqlQuery.java:93)
>
>at
> com.netflix.astyanax.thrift.AbstractThriftCqlQuery$3$1.internalExecute(AbstractThriftCqlQuery.java:83)
>
>at
> com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
>
>... 35 more
>
>
>


Logging of triggers

2014-06-03 Thread Joel Samuelsson
I'm testing triggers as part of a project and would like to add some
logging to it. I'm using the same log structure as in the trigger example
InvertedIndex but can't seem to find any logs. Where would I find the
logging? In the system logs or somewhere else?

/Joel


Re: migration to a new model

2014-06-03 Thread Laing, Michael
Hi Marcelo,

I could create a fast copy program by repurposing some python apps that I
am using for benchmarking the python driver - do you still need this?

With high levels of concurrency and multiple subprocess workers, based on
my current actual benchmarks, I think I can get well over 1,000 rows/second
on my mac and significantly more in AWS. I'm using variable size rows
averaging 5kb.

This would be the initial version of a piece of the benchmark suite we will
release as part of our nyt⨍aбrik project on 21 June for my Cassandra Day
NYC talk re the python driver.

ml


On Mon, Jun 2, 2014 at 2:15 PM, Marcelo Elias Del Valle <
marc...@s1mbi0se.com.br> wrote:

> Hi Jens,
>
> Thanks for trying to help.
>
> Indeed, I know I can't do it using just CQL. But what would you use to
> migrate data manually? I tried to create a python program using auto
> paging, but I am getting timeouts. I also tried Hive, but no success.
> I only have two nodes and less than 200Gb in this cluster, any simple way
> to extract the data quickly would be good enough for me.
>
> Best regards,
> Marcelo.
>
>
>
> 2014-06-02 15:08 GMT-03:00 Jens Rantil :
>
> Hi Marcelo,
>>
>> Looks like you can't do this without migrating your data manually:
>> https://stackoverflow.com/questions/18421668/alter-cassandra-column-family-primary-key-using-cassandra-cli-or-cql
>>
>> Cheers,
>> Jens
>>
>>
>> On Mon, Jun 2, 2014 at 7:48 PM, Marcelo Elias Del Valle <
>> marc...@s1mbi0se.com.br> wrote:
>>
>>> Hi,
>>>
>>> I have some cql CFs in a 2 node Cassandra 2.0.8 cluster.
>>>
>>> I realized I created my column family with the wrong partition. Instead
>>> of:
>>>
>>> CREATE TABLE IF NOT EXISTS entity_lookup (
>>>   name varchar,
>>>   value varchar,
>>>   entity_id uuid,
>>>   PRIMARY KEY ((name, value), entity_id))
>>> WITH
>>> caching=all;
>>>
>>> I used:
>>>
>>> CREATE TABLE IF NOT EXISTS entitylookup (
>>>   name varchar,
>>>   value varchar,
>>>   entity_id uuid,
>>>   PRIMARY KEY (name, value, entity_id))
>>> WITH
>>> caching=all;
>>>
>>>
>>> Now I need to migrate the data from the second CF to the first one.
>>> I am using Data Stax Community Edition.
>>>
>>> What would be the best way to convert data from one CF to the other?
>>>
>>> Best regards,
>>> Marcelo.
>>>
>>
>>
>


Re: Logging of triggers

2014-06-03 Thread Joel Samuelsson
I found that I was logging at too low a log level, so the messages were
filtered out of the system log. Logging at a more severe log level made the
log messages appear in the system log.

/Joel


2014-06-03 16:30 GMT+02:00 Joel Samuelsson :

> I'm testing triggers as part of a project and would like to add some
> logging to it. I'm using the same log structure as in the trigger example
> InvertedIndex but can't seem to find any logs. Where would I find the
> logging? In the system logs or somewhere else?
>
> /Joel
>


problem removing dead node from ring

2014-06-03 Thread Curious Patient
One of the nodes in a cassandra cluster has died.

I'm using cassandra 2.0.7 throughout.

When I do a nodetool status this is what I see (real addresses have been
replaced with fake 10 nets)

[root@beta-new:/opt] #nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address Load   Tokens  Owns   Host ID
Rack
UN  10.10.1.94  171.02 KB  256 49.4%
 fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
DN  10.10.1.98 ?  256 50.6%
 f2a48fc7-a362-43f5-9061-4bb3739fdeaf  rack1

I tried to get the token ID for the down node by doing a nodetool ring
command, grepping for the IP and doing a head -1 to get the initial one.

[root@beta-new:/opt] #nodetool ring | grep 10.10.1.98 | head -1
10.10.1.98 rack1   Down   Normal  ?   50.59%
   -9042969066862165996

I then started following this documentation on how to replace the node:

http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?scroll=task_ds_aks_15q_gk

So I installed cassandra on a new node but did not start it.

Set the following options:

cluster_name: 'Test Cluster'
seed_provider:
  - seeds: "10.10.1.94"
listen_address: 10.10.1.94
endpoint_snitch: SimpleSnitch

And set the initial token of the new install as the token -1 of the node
I'm trying to replace in cassandra.yaml:

initial_token: -9042969066862165995

And after making sure there was no data yet in:
  /var/lib/cassandra

I started up the database:

[root@web2:/etc/alternatives/cassandrahome] #./bin/cassandra -f
-Dcassandra.replace_address=10.10.1.98


The documentation I link to above says to use the replace_address directive
on the command line rather than cassandra-env.sh if you have a tarball
install (which we do) as opposed to a package install.

After I start it up, cassandra fails with the following message:

Exception encountered during startup: Cannot replace_address /
10.10.10.98 because it doesn't exist in gossip



So I'm wondering at this point if I've missed any steps or if there is
anything else I can try to replace this dead cassandra node?


Question about replacing a dead node

2014-06-03 Thread Prem Yadav
Hi,

in the last week, we saw at least two emails about dead node
replacement. Though I saw the documentation about how to do this, I am not
sure I understand why this is required.

Assuming the replication factor is >2, if a node dies, why does it matter? If
a new node is added, shouldn't it just take the chunk of data it serves
as the "primary" node from the other existing nodes?
Why do we need to worry about replacing the dead node?

Thanks


Re: Question about replacing a dead node

2014-06-03 Thread Jeremy Jongsma
A dead node is still allocated key ranges, and Cassandra will wait for it
to come back online rather than redistributing its data. It needs to be
decommissioned or replaced by a new node for it to be truly dead as far as
the cluster is concerned.


On Tue, Jun 3, 2014 at 11:12 AM, Prem Yadav  wrote:

> Hi,
>
> in the last week, we saw at least two emails about dead node
> replacement. Though I saw the documentation about how to do this, I am not
> sure I understand why this is required.
>
> Assuming the replication factor is >2, if a node dies, why does it matter? If
> a new node is added, shouldn't it just take the chunk of data it
> serves as the "primary" node from the other existing nodes?
> Why do we need to worry about replacing the dead node?
>
> Thanks
>


Re: Question about replacing a dead node

2014-06-03 Thread Curious Patient
>
> Assuming the replication factor is >2, if a node dies, why does it matter? If
> a new node is added, shouldn't it just take the chunk of data it
> serves as the "primary" node from the other existing nodes?
> Why do we need to worry about replacing the dead node?


The reason this matters is because I am unable to do a nodetool repair on
my keyspace with the dead node still being listed in nodetool status.  It
fails complaining that it can't reach the dead node.


On Tue, Jun 3, 2014 at 12:18 PM, Jeremy Jongsma  wrote:

> A dead node is still allocated key ranges, and Cassandra will wait for it
> to come back online rather than redistributing its data. It needs to be
> decommissioned or replaced by a new node for it to be truly dead as far as
> the cluster is concerned.
>
>
> On Tue, Jun 3, 2014 at 11:12 AM, Prem Yadav  wrote:
>
>> Hi,
>>
> in the last week, we saw at least two emails about dead node
> replacement. Though I saw the documentation about how to do this, I am not
>> sure I understand why this is required.
>>
> Assuming the replication factor is >2, if a node dies, why does it matter? If
> a new node is added, shouldn't it just take the chunk of data it
> serves as the "primary" node from the other existing nodes?
>> Why do we need to worry about replacing the dead node?
>>
>> Thanks
>>
>
>


Re: Question about replacing a dead node

2014-06-03 Thread Ipremyadav
Thanks Mongo maven:)
I understand why you need to do this.
My question was more from the architecture point of view. Why doesn't Cassandra
just redistribute the data? Is it because of the gossip protocol? 

Thanks,
Prem

On 3 Jun 2014, at 17:35, Curious Patient  wrote:

>> Assuming the replication factor is >2, if a node dies, why does it matter? If
>> a new node is added, shouldn't it just take the chunk of data it serves
>> as the "primary" node from the other existing nodes?
>> Why do we need to worry about replacing the dead node?
> 
> The reason this matters is because I am unable to do a nodetool repair on my 
> keyspace with the dead node still being listed in nodetool status.  It fails 
> complaining that it can't reach the dead node.
> 
> 
>> On Tue, Jun 3, 2014 at 12:18 PM, Jeremy Jongsma  wrote:
>> A dead node is still allocated key ranges, and Cassandra will wait for it to 
>> come back online rather than redistributing its data. It needs to be 
>> decommissioned or replaced by a new node for it to be truly dead as far as 
>> the cluster is concerned.
>> 
>> 
>>> On Tue, Jun 3, 2014 at 11:12 AM, Prem Yadav  wrote:
>>> Hi,
>>> 
>>> in the last week, we saw at least two emails about dead node
>>> replacement. Though I saw the documentation about how to do this, I am not
>>> sure I understand why this is required.
>>> 
>>> Assuming the replication factor is >2, if a node dies, why does it matter? If
>>> a new node is added, shouldn't it just take the chunk of data it
>>> serves as the "primary" node from the other existing nodes?
>>> Why do we need to worry about replacing the dead node?
>>> 
>>> Thanks
> 


Re: Question about replacing a dead node

2014-06-03 Thread Curious Patient
>
> Thanks Mongo maven:)
> I understand why you need to do this.

My question was more from the architecture point of view. Why doesn't
> Cassandra just redistribute the data? Is it because of the gossip protocol?


Sure.. well, I've attempted to launch new nodes to redistribute the data on
a temporary basis. I can't really afford to run more than a couple of nodes
at a time at the moment, as we're just beginning to develop our cassandra
based app. But when I do launch new nodes, the new one doesn't seem to own
more than 1% of the ring, and nodetool repair on our keyspace still fails.
I think that's because the dead node is still showing up in the ring.

I realize we would need to have a ring of 3 nodes for the replication
factor to be set right. But we were going to worry about distributing the data
to more nodes once our application firms up and becomes more usable.

Thanks


On Tue, Jun 3, 2014 at 12:45 PM, Ipremyadav  wrote:

> Thanks Mongo maven:)
> I understand why you need to do this.
> My question was more from the architecture point of view. Why doesn't
> Cassandra just redistribute the data? Is it because of the gossip protocol?
>
> Thanks,
> Prem
>
> On 3 Jun 2014, at 17:35, Curious Patient  wrote:
>
> Assuming the replication factor is >2, if a node dies, why does it matter? If
>> a new node is added, shouldn't it just take the chunk of data it
>> serves as the "primary" node from the other existing nodes?
>> Why do we need to worry about replacing the dead node?
>
>
> The reason this matters is because I am unable to do a nodetool repair on
> my keyspace with the dead node still being listed in nodetool status.  It
> fails complaining that it can't reach the dead node.
>
>
> On Tue, Jun 3, 2014 at 12:18 PM, Jeremy Jongsma 
> wrote:
>
>> A dead node is still allocated key ranges, and Cassandra will wait for it
>> to come back online rather than redistributing its data. It needs to be
>> decommissioned or replaced by a new node for it to be truly dead as far as
>> the cluster is concerned.
>>
>>
>> On Tue, Jun 3, 2014 at 11:12 AM, Prem Yadav  wrote:
>>
>>> Hi,
>>>
>>> in the last week, we saw at least two emails about dead node
>>> replacement. Though I saw the documentation about how to do this, I am not
>>> sure I understand why this is required.
>>>
>>> Assuming the replication factor is >2, if a node dies, why does it matter?
>>> If a new node is added, shouldn't it just take the chunk of data it
>>> serves as the "primary" node from the other existing nodes?
>>> Why do we need to worry about replacing the dead node?
>>>
>>> Thanks
>>>
>>
>>
>


Re: problem removing dead node from ring

2014-06-03 Thread Robert Coli
On Tue, Jun 3, 2014 at 8:41 AM, Curious Patient 
wrote:

> I then started following this documentation on how to replace the node:
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?scroll=task_ds_aks_15q_gk
>
 ...

> And set the initial token of the new install as the token -1 of the node
> I'm trying to replace in cassandra.yaml:
> ...
> [root@web2:/etc/alternatives/cassandrahome] #./bin/cassandra -f
> -Dcassandra.replace_address=10.10.1.98
> ...
> The documentation I link to above says to use the replace_address
> directive on the command line rather than cassandra-env.sh if you have a
> tarball install (which we do) as opposed to a package install.
>

If you are replacing an address, you need to use the identical
initial_token to the node you are replacing, not the token -1.
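
Concretely, given the ring output earlier in this thread, cassandra.yaml on
the replacement node should read:

initial_token: -9042969066862165996

rather than -9042969066862165995.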

=Rob


Re: I have a deaf node?

2014-06-03 Thread Kevin Burton
To be fair, it might be best to represent hex as 0xdeaf or 0xDEAF instead
of just 'deaf'


On Sun, Jun 1, 2014 at 8:37 PM, David Daeschler 
wrote:

> I wouldn't worry unless it changes from deaf to deadbeef
>
>
> On Sun, Jun 1, 2014 at 11:34 PM, Tim Dunphy  wrote:
>
>> This post should definitely make to the hall of fame!! :)
>>
>>
>> My proudest accomplishment on the list. heh
>>
>>
>> On Sun, Jun 1, 2014 at 11:24 PM, Paulo Ricardo Motta Gomes <
>> paulo.mo...@chaordicsystems.com> wrote:
>>
>>> This post should definitely make to the hall of fame!! :)
>>>
>>>
>>> On Mon, Jun 2, 2014 at 12:05 AM, Tim Dunphy 
>>> wrote:
>>>
 That made my day. Not to worry though unless you start seeing the
> number 23 in your host ids.


 Yeah man, glad to provide some comic relief to the list! ;)


 On Sun, Jun 1, 2014 at 11:01 PM, Apostolis Xekoukoulotakis <
 xekou...@gmail.com> wrote:

> That made my day. Not to worry though unless you start seeing the
> number 23 in your host ids.
>  On Jun 2, 2014 12:40 AM, "Kevin Burton"  wrote:
>
>> could be worse… it could be under caffeinated and say decafbad …
>>
>>
>> On Sat, May 31, 2014 at 10:45 AM, Tim Dunphy 
>> wrote:
>>
>>> I think the "deaf" thing is just the ending of the host ID in
 hexadecimal. It's an extraordinary coincidence that it ends with DEAF 
 :D
>>>
>>>
>>> Hah.. yeah that thought did cross my mind.  :)
>>>
>>>
>>>
>>> On Sat, May 31, 2014 at 1:35 PM, DuyHai Doan 
>>> wrote:
>>>
 I think the "deaf" thing is just the ending of the host ID in
 hexadecimal. It's an extraordinary coincidence that it ends with DEAF 
 :D


 On Sat, May 31, 2014 at 6:38 PM, Tim Dunphy 
 wrote:

> I didn't realize cassandra nodes could develop hearing problems. :)
>
>
> But I have a dead node in my cluster I would like to get rid of.
>
> [root@beta:~] #nodetool status
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address Load   Tokens  Owns   Host ID
>   Rack
> UN  10.10.1.94  199.6 KB   256 49.4%
>  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
> DN  10.10.1.64  ?  256 50.6%
>  f2a48fc7-a362-43f5-9061-4bb3739f*deaf * rack1
>
> I was just wondering what this could indicate and if that might
> mean that I will have some more trouble than I would be bargaining 
> for in
> getting rid of it.
>
> I've made a couple of attempts to get rid of this so far. I'm
> about to try again.
>
> Thanks
> Tim
>
> --
> GPG me!!
>
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>
>

>>>
>>>
>>> --
>>> GPG me!!
>>>
>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>
>>>
>>
>>
>> --
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> Skype: *burtonator*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> 
>> 
>> War is peace. Freedom is slavery. Ignorance is strength. Corporations
>> are people.
>>
>>


 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


>>>
>>>
>>> --
>>> *Paulo Motta*
>>>
>>> Chaordic | *Platform*
>>> *www.chaordic.com.br *
>>> +55 48 3232.3200
>>>
>>
>>
>>
>> --
>> GPG me!!
>>
>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>
>>
>
>
> --
> David Daeschler
>



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile


War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Re: Multi-DC Environment Question

2014-06-03 Thread Robert Coli
On Fri, May 30, 2014 at 4:08 AM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

> Basically you sort of confirmed that if down_time > max_hint_window_in_ms
> the only way to bring DC1 up-to-date is anti-entropy repair.
>

Also, read repair does not help either as we assumed that down_time >
> max_hint_window_in_ms. Please correct me if I am wrong.
>

My understanding is that if you:

1) set read repair chance to 100%
2) read all keys in the keyspace with a client

You would accomplish the same increase in consistency as you would by
running repair.
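
For what it's worth, a minimal sketch of that approach with the python
driver (the contact point, keyspace, table, and key names below are
placeholders, not anything from this thread):

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_keyspace')

# 1) force read repair on every read
session.execute("ALTER TABLE my_table WITH read_repair_chance = 1.0")

# 2) touch every row so each read compares and repairs its replicas
for row in session.execute("SELECT key FROM my_table"):
    session.execute("SELECT * FROM my_table WHERE key = %s", (row.key,))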

In cases where this may matter, and your system can handle delivering the
hints, increasing the current default of 3 hours (itself already increased
from the old default of 1 hour) to 6 or more hours gives operators more time
to work in the case of partition or failure. Note that hints are only an
optimization; only repair (and read repair at 100%, I think..) asserts any
guarantee of consistency.

=Rob


Re: problem removing dead node from ring

2014-06-03 Thread Curious Patient
Hi Rob,

If you are replacing an address, you need to use the identical
> initial_token to the node you are replacing, not the token -1.


Thanks, I hope that does the trick. Btw.. was my idea of how to get at the
initial token of the missing/dead node correct?

i.e.


nodetool ring | grep 10.10.1.98 | head -1

I want to be sure I'm using the right token.

Thanks
Tim

On Tue, Jun 3, 2014 at 1:45 PM, Robert Coli  wrote:

> On Tue, Jun 3, 2014 at 8:41 AM, Curious Patient 
> wrote:
>
>> I then started following this documentation on how to replace the node:
>> [
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?scroll=task_ds_aks_15q_gk][1
>> ]
>>
>  ...
>
>> And set the initial token of the new install as the token -1 of the node
>> I'm trying to replace in cssandra.yaml:
>>  ...
>> [root@web2:/etc/alternatives/cassandrahome] #./bin/cassandra -f
>> -Dcassandra.replace_address=10.10.1.98
>> ...
>> The documentation I link to above says to use the replace_address
>> directive on the command line rather than cassandra-env.sh if you have a
>> tarball install (which we do) as opposed to a package install.
>>
>
> If you are replacing an address, you need to use the identical
> initial_token to the node you are replacing, not the token -1.
>
> =Rob
>


Re: problem removing dead node from ring

2014-06-03 Thread Robert Coli
On Tue, Jun 3, 2014 at 10:53 AM, Curious Patient 
wrote:

> I want to be sure I'm using the right token.
>

In nodetool ring, if you're not using vnodes, only one token should be
listed with both the IP of the old node and the status Down.

If you are using vnodes, it's a comma delimited list in initial_token,
which you can get from :

nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,

=Rob


Re: Question about replacing a dead node

2014-06-03 Thread Redmumba
Repairing the range is an expensive operation and don't forget--just
because a node is down does not mean it's dead.  I take nodes down for
maintenance all the time--maybe there was a security update that needed to
be applied, for example, or perhaps a kernel update.  There are a multitude
of reasons why a node would be down, but not replaced.

If you really, really wanted this to be automated, it'd be trivial to set up
a cron job that looked for dead nodes and removed them from the
cluster--then ran a repair on all of the nodes in your cluster.  This will
cause spikes, especially if you have a large cluster.

Andrew


On Tue, Jun 3, 2014 at 9:45 AM, Ipremyadav  wrote:

> Thanks Mongo maven:)
> I understand why you need to do this.
> My question was more from the architecture point of view. Why doesn't
> Cassandra just redistribute the data? Is it because of the gossip protocol?
>
> Thanks,
> Prem
>
> On 3 Jun 2014, at 17:35, Curious Patient  wrote:
>
> Assuming the replication factor is >2, if a node dies, why does it matter? If
>> a new node is added, shouldn't it just take the chunk of data it
>> serves as the "primary" node from the other existing nodes?
>> Why do we need to worry about replacing the dead node?
>
>
> The reason this matters is because I am unable to do a nodetool repair on
> my keyspace with the dead node still being listed in nodetool status.  It
> fails complaining that it can't reach the dead node.
>
>
> On Tue, Jun 3, 2014 at 12:18 PM, Jeremy Jongsma 
> wrote:
>
>> A dead node is still allocated key ranges, and Cassandra will wait for it
>> to come back online rather than redistributing its data. It needs to be
>> decommissioned or replaced by a new node for it to be truly dead as far as
>> the cluster is concerned.
>>
>>
>> On Tue, Jun 3, 2014 at 11:12 AM, Prem Yadav  wrote:
>>
>>> Hi,
>>>
>>> in the last week, we saw at least two emails about dead node
>>> replacement. Though I saw the documentation about how to do this, I am not
>>> sure I understand why this is required.
>>>
>>> Assuming the replication factor is >2, if a node dies, why does it matter?
>>> If a new node is added, shouldn't it just take the chunk of data it
>>> serves as the "primary" node from the other existing nodes?
>>> Why do we need to worry about replacing the dead node?
>>>
>>> Thanks
>>>
>>
>>
>


Re: problem removing dead node from ring

2014-06-03 Thread Curious Patient
>
> In nodetool ring, if you're not using vnodes, only one token should be
> listed with both the IP of the old node and the status Down.
> If you are using vnodes, it's a comma delimited list in initial_token,
> which you can get from :
> nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,


Oh, I'm seeing a lot of tokens for each node. So that means I am using
vnodes, I guess? Really sorry, I am just starting to really learn
cassandra. Still new at the game.



On Tue, Jun 3, 2014 at 2:01 PM, Robert Coli  wrote:

> On Tue, Jun 3, 2014 at 10:53 AM, Curious Patient 
> wrote:
>
>> I want to be sure I'm using the right token.
>>
>
> In nodetool ring, if you're not using vnodes, only one token should be
> listed with both the IP of the old node and the status Down.
>
> If you are using vnodes, it's a comma delimited list in initial_token,
> which you can get from :
>
> nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,
>
> =Rob
>
>


Re: problem removing dead node from ring

2014-06-03 Thread Robert Coli
On Tue, Jun 3, 2014 at 11:03 AM, Curious Patient 
wrote:

> In nodetool ring, if you're not using vnodes, only one token should be
>> listed with both the IP of the old node and the status Down.
>> If you are using vnodes, it's a comma delimited list in initial_token,
>> which you can get from :
>> nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,
>
>
> Oh, I'm seeing a lot of tokens for each node. So that means I am using
> vnodes, I guess? Really sorry, I am just starting to really learn
> cassandra. Still new at the game.
>

The default in current configs is to use 256 vnodes per physical node.

I always recommend specifying the full list of tokens in use in the
initial_token in the conf file, as a comma delimited list in the case of
vnodes. It helps in an assortment of operational cases.

Then, once that is done, wipe the data dirs again and use replace_address!

=Rob


java.lang.AssertionError: Added column does not sort as the last column

2014-06-03 Thread Leena Ghatpande
Has anyone seen this error on Cassandra 1.2.9? We have not done any upgrades or 
changes to column families since we went live in Feb 2014.

 

We are getting the following error when we run nodetool cleanup or nodetool
repair on one of our production nodes.

We have 2 data centers with 2 nodes in each data center and with replication
factor 2. We only see this error on one of the nodes.

 

java.lang.AssertionError: Added column does not sort as the last column
at 
org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:131)
at 
org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:119)
at 
org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:219)
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
at 
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
at 
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
at 
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
at 
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
at 
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

  

Consolidating records and TTL

2014-06-03 Thread Charlie Mason
Hi All.

I have a system that's going to make possibly several concurrent changes to
a running total. I know I could use a counter for this. However, I have
extra metadata I can store with the changes which would allow me to replay
the changes. If I use a counter and it loses some writes, I can't recover
it, as I will only have its current total, not the extra metadata to know
where to replay from.

What I was planning to do was write each change of the value to a CQL table
with a Time UUID as a row level primary key as well as a partition key.
Then when I need to read the running total back I will do a query for all
the changes and add them up to get the total.
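
A minimal sketch of that change-log table and read path, using the python
driver (the schema and names are placeholders, not Charlie's actual model):

import uuid
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_keyspace')  # hypothetical cluster

session.execute("""
    CREATE TABLE IF NOT EXISTS total_changes (
      account_id uuid,
      change_id timeuuid,
      delta bigint,
      meta text,
      PRIMARY KEY (account_id, change_id))
""")

account = uuid.uuid4()

# record one change; now() generates the TimeUUID server-side
session.execute(
    "INSERT INTO total_changes (account_id, change_id, delta, meta) "
    "VALUES (%s, now(), %s, %s)", (account, 10, 'example change'))

# read the running total by summing every change in the partition
total = sum(r.delta for r in session.execute(
    "SELECT delta FROM total_changes WHERE account_id = %s", (account,)))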

As there could be tens of thousands of these, I want to have a period after
which they are consolidated. Most won't be anywhere near that, but a few
will, and I need to be able to support them. So I was also going to have a
consolidated total table which holds the UUID of the values consolidated up
to. Since I can bound the query for the recent updates by the UUID, I should
be able to avoid all the tombstones. So if the read encounters any changes
that can be consolidated, it inserts a new consolidated value and deletes
the newly consolidated changes.
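
Continuing that sketch, the bounded read against a consolidation marker is
what lets the query skip over the already-deleted changes (again, the
totals table below is a placeholder: account_id uuid PRIMARY KEY,
total bigint, consolidated_up_to timeuuid):

# fetch the consolidated total and the TimeUUID it covers
# (assumes a consolidated row already exists for this account)
marker = list(session.execute(
    "SELECT total, consolidated_up_to FROM totals WHERE account_id = %s",
    (account,)))[0]

# only changes newer than the marker are scanned, avoiding their tombstones
recent = session.execute(
    "SELECT delta FROM total_changes "
    "WHERE account_id = %s AND change_id > %s",
    (account, marker.consolidated_up_to))

running_total = marker.total + sum(r.delta for r in recent)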

What I am slightly worried about is what happens if the consolidated value
insert fails but the deletes to the change records succeed. I would be left
with an inconsistent total indefinitely. I have come up with a couple of
ideas:


1. I could make it require all nodes to acknowledge it before deleting the
difference records.

2. Maybe I could have another period after it's consolidated but before it's
deleted?

3. Is there any way I could use the TTL to allow it to be deleted after a
period of time? Chances are another read would come in and fix the value.


Anyone got any other suggestions on how I could implement this?


Thanks,

Charlie M


RE: INSERT ... IF NOT EXISTS with some nodes unavailable

2014-06-03 Thread Ackerman, Mitchell
Thanks Haebin, I scaled up to a 3 node system and it now behaves as expected.  
Was trying to simplify the test case but shot myself in the foot instead.

Mitchell

From: monster@gmail.com [mailto:monster@gmail.com] On Behalf Of 
Frederick Haebin Na
Sent: Tuesday, June 03, 2014 2:17 AM
To: user@cassandra.apache.org
Subject: Re: INSERT ... IF NOT EXISTS with some nodes unavailable

Hello Mitchell,

I think it is due to your replication factor, which, I assume, is 2 since you 
have only 2 nodes in the cluster.
With RF 2, QUORUM requires both replicas to respond, so Cassandra cannot run
queries that require QUORUM participants while one of your two nodes is down.

So, I think you have to expand your cluster to 3 nodes and make the replication
factor 3.
HTH

Haebin

2014-06-03 12:09 GMT+09:00 Ackerman, Mitchell 
mailto:mitchell.acker...@pgi.com>>:
Hi, I’m trying to get a query using

   INSERT ... IF NOT EXISTS

working when not all of the nodes are available.  As a test case I have 2 
nodes, one in AWS us-west-1, another in AWS eu-west-1.  The keyspace settings 
are described below.  When I only have one of the nodes available, the insert 
fails with an UnavailableException (via TokenRangeOfflineException, see below
for details).

From reading about the lightweight transactions 
(http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0) 
and the CQL expression, it looks like this scenario should be supported. Does 
anyone have any idea why it is not working?

You will notice that I’m using Astyanax, could this be a source of problems for 
this use case?

I presume that a Consistency Level of Serial is in use and that a quorum of 1 
should suffice for a 2 node cache.

Thanks, Mitchell

conferenceassignmentcache

replica_placement_strategy 
org.apache.cassandra.locator.NetworkTopologyStrategy
Replication Strategy Options
us-west   1
eu-west   1

CREATE TABLE conferenceassignmentcache.conferenceassignmentcache_cf (
  id text PRIMARY KEY,
  value text
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

When I try to INSERT a row with only one node available I get the following 
exception:

2014-06-02 20:57:59,947 [eventTaskExecutor-15] DEBUG 
[CassandraConferenceAssignmentManager.getInsertStatement() 53] - INSERT INTO 
conferenceassignmentcache_cf (id, value) VALUES (?, ?) IF NOT EXISTS USING TTL 
60;
2014-06-02 20:58:00,084 [eventTaskExecutor-15] DEBUG 
[ThriftConverter.ToConnectionPoolException() 157] -
2014-06-02 20:58:00,086 [eventTaskExecutor-15] ERROR 
[CountingConnectionPoolMonitor.trackError() 94] - 
com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: 
TokenRangeOfflineException: 
[host=cache.alpha.us-west-1.bobdev.com(10.89.0.37):9160,
 latency=137(137), attempts=1]UnavailableException()
com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: 
TokenRangeOfflineException: 
[host=cache.alpha.us-west-1.bobdev.com(10.89.0.37):9160,
 latency=137(137), attempts=1]UnavailableException()

Caused by: UnavailableException()
   at 
org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result.read(Cassandra.java:41876)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
   at 
org.apache.cassandra.thrift.Cassandra$Client.recv_execute_prepared_cql3_query(Cassandra.java:1689)
   at 
org.apache.cassandra.thrift.Cassandra$Client.execute_prepared_cql3_query(Cassandra.java:1674)
   at 
com.netflix.astyanax.thrift.ThriftCql3Query.execute_prepared_cql_query(ThriftCql3Query.java:29)
   at 
com.netflix.astyanax.thrift.AbstractThriftCqlQuery$3$1.internalExecute(AbstractThriftCqlQuery.java:93)
   at 
com.netflix.astyanax.thrift.AbstractThriftCqlQuery$3$1.internalExecute(AbstractThriftCqlQuery.java:83)
   at 
com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
   ... 35 more




Re: Multi-DC Environment Question

2014-06-03 Thread Vasileios Vlachos

Thanks for your responses!

Matt, I did a test with 4 nodes, 2 in each DC and the answer appears to 
be yes. The tokens seem to be unique across the entire cluster, not just 
on a per DC basis. I don't know if the number of nodes deployed is 
enough to reassure me, but this is my conclusion for now. Please, 
correct me if you know I'm wrong.


Rob, this is the plan of attack I have in mind now. Although, in case of 
a catastrophic failure of a DC, the downtime is usually longer than 
that. So it's either less than the default value (when testing that the 
DR works for example) or more (actually using the DR as primary DC). 
Based on that, the default seems reasonable to me.


I also found that nodetool repair can be performed on one DC only by 
specifying the --in-local-dc option. So, presumably the classic nodetool 
repair applies to the entire cluster (sounds obvious, but is that 
actually correct?).


Question 3 in my previous email still remains unanswered... I
cannot find out whether only one hint is stored on the coordinator
irrespective of the number of replicas being down, and also whether the hint
is 100% of the size of the original write request.


Thanks,

Vasilis

On 03/06/14 18:52, Robert Coli wrote:
On Fri, May 30, 2014 at 4:08 AM, Vasileios Vlachos 
mailto:vasileiosvlac...@gmail.com>> wrote:


Basically you sort of confirmed that if down_time >
max_hint_window_in_ms the only way to bring DC1 up-to-date is
anti-entropy repair.


Also, read repair does not help either as we assumed that
down_time > max_hint_window_in_ms. Please correct me if I am wrong.


My understanding is that if you :

1) set read repair chance to 100%
2) read all keys in the keyspace with a client

You would accomplish the same increase in consistency as you would by 
running repair.


In cases where this may matter, and your system can handle delivering 
the hints, increasing the current default of 3 hours (itself already 
increased from the old default of 1 hour) to 6 or more hours gives 
operators more time to work in the case of partition or failure. Note 
that hints are only an optimization; only repair (and read repair at 
100%, I think..) asserts any guarantee of consistency.


=Rob



--
Kind Regards,

Vasileios Vlachos



Re: Multi-DC Environment Question

2014-06-03 Thread Vasileios Vlachos
I should have said that earlier really... I am using 1.2.16 and Vnodes 
are enabled.


Thanks,

Vasilis

--
Kind Regards,

Vasileios Vlachos



Re: problem removing dead node from ring

2014-06-03 Thread Matthew Allen
Just out of curiosity, for a dead node, would it be possible to just

 - replace the node (no data in data/commit dirs), same IP Address, same
hostname.
 - restore the cassandra.yaml (initial_token etc)
 - set auto_bootstrap:false
 - start it up and then run a nodetool rebuild ?

Or would the Host ID value change with the new node ?

Thanks

Matt


On Wed, Jun 4, 2014 at 4:09 AM, Robert Coli  wrote:

> On Tue, Jun 3, 2014 at 11:03 AM, Curious Patient 
> wrote:
>
>> In nodetool ring, if you're not using vnodes, only one token should be
>>> listed with both the IP of the old node and the status Down.
>>> If you are using vnodes, it's a comma delimited list in initial_token,
>>> which you can get from :
>>> nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,
>>
>>
>> Oh, I'm seeing a lot of tokens for each node. So that means I am using
>> vnodes, I guess? Really sorry, I am just starting to really learn
>> cassandra. Still new at the game.
>>
>
> The default in current configs is to use 256 vnodes per physical node.
>
> I always recommend specifying the full list of tokens in use in the
> initial_token in the conf file, as a comma delimited list in the case of
> vnodes. It helps in an assortment of operational cases.
>
> Then, once that is done, wipe the data dirs again and use replace_address!
>
> =Rob
>
>


Re: Multi-DC Environment Question

2014-06-03 Thread Matthew Allen
Thanks Vasileios.  I think I need to make a call as to whether to switch to
vnodes or stick with tokens for my Multi-DC cluster.

Would you be able to show a nodetool ring/status from your cluster to see
what the token assignment looks like ?

Thanks

Matt


On Wed, Jun 4, 2014 at 8:31 AM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

>  I should have said that earlier really... I am using 1.2.16 and Vnodes
> are enabled.
>
> Thanks,
>
> Vasilis
>
> --
> Kind Regards,
>
> Vasileios Vlachos
>
>


Re: python cql driver - cassandra.ReadTimeout - “Operation timed out - received only 1 responses.”

2014-06-03 Thread Marcelo Elias Del Valle
Indeed Alex, the problem was in the RPC timeouts on the server...
Thanks a lot, it's simple, but I was losing time thinking my client config
was wrong!
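
For reference, the server-side knobs involved live in cassandra.yaml; the
relevant entries (shown with what I believe are the 2.0 defaults, so verify
against your own file) are:

read_request_timeout_in_ms: 5000    # single-partition reads
range_request_timeout_in_ms: 10000  # range scans, e.g. SELECT count(*)
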
[]s


2014-06-02 18:15 GMT-03:00 Alex Popescu :

> If I'm reading this correctly, what you are seeing is the read_timeout on
> the Cassandra side and not the client-side timeout. Even if you set the client
> side timeouts, the C* read & write timeouts are still respected on that
> side.
>
>
> On Mon, Jun 2, 2014 at 10:55 AM, Marcelo Elias Del Valle <
> marc...@s1mbi0se.com.br> wrote:
>
>> I am using Cassandra 2.0 with python CQL.
>>
>> I have created a column family as follows:
>>
>> CREATE KEYSPACE IF NOT EXISTS Identification
>>   WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
>>   'DC1' : 1 };
>>
>> USE Identification;
>>
>> CREATE TABLE IF NOT EXISTS entitylookup (
>>   name varchar,
>>   value varchar,
>>   entity_id uuid,
>>   PRIMARY KEY ((name, value), entity_id))
>> WITH
>> caching=all;
>>
>> I then try to count the number of records in this CF as follows:
>>
>> #!/usr/bin/env python
>> import argparse
>> import sys
>> import traceback
>> from cassandra import ConsistencyLevel
>> from cassandra.cluster import Cluster
>> from cassandra.query import SimpleStatement
>> def count(host, cf):
>> keyspace = "identification"
>> cluster = Cluster([host], port=9042, 
>> control_connection_timeout=6)
>> session = cluster.connect(keyspace)
>> session.default_timeout=6
>>
>> st = SimpleStatement("SELECT count(*) FROM %s" % cf, 
>> consistency_level=ConsistencyLevel.ALL)
>> for row in session.execute(st, timeout=6):
>> print "count for cf %s = %s " % (cf, str(row))
>> dump_pool.close()
>> dump_pool.join()
>> if __name__ == "__main__":
>> parser = argparse.ArgumentParser()
>> parser.add_argument("-cf", "--column-family", default="entitylookup", 
>> help="Column Family to query")
>> parser.add_argument("-H", "--host", default="localhost", help="Cassandra 
>> host")
>> args = parser.parse_args()
>>
>> count(args.host, args.column_family)
>>
>> print "fim"
>>
>>  The count is not that useful to me, it's just a test with an operation
>> that takes long to complete.
>>
>> Although I have defined timeout as 6 seconds, after less than 30
>> seconds I get the following error:
>>
>> ./count_entity_lookup.py  -H localhost -cf entitylookup
>> Traceback (most recent call last):
>>   File "./count_entity_lookup.py", line 27, in <module>
>> count(args.host, args.column_family)
>>   File "./count_entity_lookup.py", line 16, in count
>> for row in session.execute(st, timeout=None):
>>   File 
>> "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py",
>>  line 1026, in execute
>> result = future.result(timeout)
>>   File 
>> "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py",
>>  line 2300, in result
>> raise self._final_exception
>> cassandra.ReadTimeout: code=1200 [Timeout during read request] 
>> message="Operation timed out - received only 1 responses." 
>> info={'received_responses': 1, 'data_retrieved': True, 'required_responses': 
>> 2, 'consistency': 5}
>>
>>  It seems the answer was found on just one replica, but this really doesn't
>> make sense to me. Shouldn't cassandra be able to query it anyway?
>>
>> These tests are running in a two node cluster, with RF = 2, write and
>> read consistency = ALL (but same results using QUORUM).
>>
>> Thanks in advance.
>>
>> Best regards,
>>
>> Marcelo.
>>
>
>
>
> --
>
> :- a)
>
>
> Alex Popescu
> Sen. Product Manager @ DataStax
> @al3xandru
>


Re: migration to a new model

2014-06-03 Thread Marcelo Elias Del Valle
Hi Michael,

For sure I would be interested in this program!

I am new both to Python and to CQL. I started creating this copier, but
was having problems with timeouts. Alex solved my problem here on the list,
but I think I will still have a lot of trouble making the copy work well.

I open sourced my version here:
https://github.com/s1mbi0se/cql_record_processor

Just in case it's useful for anything.

However, I saw the driver has support for concurrency itself, and having
something made by someone who knows the Python CQL driver better would be
very helpful.
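
As an example, here is a rough sketch of that driver-side concurrency. It
assumes the cassandra.concurrent module shipped with recent versions of the
python driver; the contact point and fetch size are placeholders:

from cassandra.cluster import Cluster
from cassandra.concurrent import execute_concurrent_with_args
from cassandra.query import SimpleStatement

session = Cluster(['127.0.0.1']).connect('identification')

insert = session.prepare(
    "INSERT INTO entity_lookup (name, value, entity_id) VALUES (?, ?, ?)")

# page through the old table and rewrite every row into the new one,
# keeping up to 100 inserts in flight at a time
scan = SimpleStatement("SELECT name, value, entity_id FROM entitylookup",
                       fetch_size=1000)
rows = [(r.name, r.value, r.entity_id) for r in session.execute(scan)]
execute_concurrent_with_args(session, insert, rows, concurrency=100)

For a 200Gb table you would feed the inserts in chunks instead of building
one big list, but the shape of the copy is the same.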

My two servers today are at OVH (ovh.com); we have servers at AWS, but in
several cases we prefer other hosts. Both servers have SSDs and 64 GB RAM; I
could use the script as a benchmark for you if you want. Besides, we have
some bigger clusters; I could run it on them just to test the speed if that
is going to help.

Regards
Marcelo.


2014-06-03 11:40 GMT-03:00 Laing, Michael :

> Hi Marcelo,
>
> I could create a fast copy program by repurposing some python apps that I
> am using for benchmarking the python driver - do you still need this?
>
> With high levels of concurrency and multiple subprocess workers, based on
> my current actual benchmarks, I think I can get well over 1,000 rows/second
> on my mac and significantly more in AWS. I'm using variable size rows
> averaging 5kb.
>
> This would be the initial version of a piece of the benchmark suite we
> will release as part of our nyt⨍aбrik project on 21 June for my Cassandra
> Day NYC talk re the python driver.
>
> ml
>
>
> On Mon, Jun 2, 2014 at 2:15 PM, Marcelo Elias Del Valle <
> marc...@s1mbi0se.com.br> wrote:
>
>> Hi Jens,
>>
>> Thanks for trying to help.
>>
>> Indeed, I know I can't do it using just CQL. But what would you use to
>> migrate data manually? I tried to create a python program using auto
>> paging, but I am getting timeouts. I also tried Hive, but no success.
>> I only have two nodes and less than 200Gb in this cluster, any simple way
>> to extract the data quickly would be good enough for me.
>>
>> Best regards,
>> Marcelo.
>>
>>
>>
>> 2014-06-02 15:08 GMT-03:00 Jens Rantil :
>>
>> Hi Marcelo,
>>>
>>> Looks like you can't do this without migrating your data manually:
>>> https://stackoverflow.com/questions/18421668/alter-cassandra-column-family-primary-key-using-cassandra-cli-or-cql
>>>
>>> Cheers,
>>> Jens
>>>
>>>
>>> On Mon, Jun 2, 2014 at 7:48 PM, Marcelo Elias Del Valle <
>>> marc...@s1mbi0se.com.br> wrote:
>>>
 Hi,

 I have some cql CFs in a 2 node Cassandra 2.0.8 cluster.

 I realized I created my column family with the wrong partition. Instead
 of:

 CREATE TABLE IF NOT EXISTS entity_lookup (
   name varchar,
   value varchar,
   entity_id uuid,
   PRIMARY KEY ((name, value), entity_id))
 WITH
 caching=all;

 I used:

 CREATE TABLE IF NOT EXISTS entitylookup (
   name varchar,
   value varchar,
   entity_id uuid,
   PRIMARY KEY (name, value, entity_id))
 WITH
 caching=all;


 Now I need to migrate the data from the second CF to the first one.
 I am using Data Stax Community Edition.

 What would be the best way to convert data from one CF to the other?

 Best regards,
 Marcelo.

>>>
>>>
>>
>


Re: problem removing dead node from ring

2014-06-03 Thread Robert Coli
On Tue, Jun 3, 2014 at 3:48 PM, Matthew Allen 
wrote:

> Just out of curiosity, for a dead node, would it be possible to just
>
>  - replace the node (no data in data/commit dirs), same IP Address, same
> hostname.
>  - restore the cassandra.yaml (initial_token etc)
>  - set auto_bootstrap:false
>  - start it up and then run a nodetool rebuild ?
>
> Or would the Host ID value change with the new node ?
>

That would work, but until CASSANDRA-6961 [1] there is no way to prevent
this node from having a long window where it may serve stale reads at CLs
below QUORUM, until the rebuild completes.

"rebuild" gets you exactly one replica's worth of data, just like bootstrap
does. If you want to actually sync a node with all of its replicas and
RF>2, you want "repair" and not "rebuild." I wish "rebuild" had been named
something else, because people seem to think it does something it doesn't
do. This property of decreasing what I call "unique replica count" is why
people like me prefer to back up their nodes with something like tablesnap
[2], so that losing a node does not decrease the "unique replica count." A
simpler solution if you want to avoid the chance of inconsistency is to
operate with CL.QUORUM instead of CL.ONE.

You'd be better off leaving auto_bootstrap set to true and setting
-Dcassandra.replace_address, which bootstraps you (from a single-replica
source per range) to the token owned by the dead node. This is exactly like
your process above, except that you don't serve stale reads while doing so.

That said, the single-replica source thing is why people want to first
bootstrap (which does the same single-replica source thing as "rebuild" but
does not serve writes while it does so) and then repair and then, finally,
join the ring. Note that if writes are incoming, this does not actually
*close* the race window for stale reads at ONE, it just makes it much
shorter.

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-6961
[2] https://github.com/JeremyGrosser/tablesnap


Re: problem removing dead node from ring

2014-06-03 Thread Matthew Allen
| That would work, but until CASSANDRA-6961 [1] there is no way to prevent
this node from having a long window where it may serve stale
| reads at CLs below QUORUM, until the rebuild completes.

Thanks Robert, this makes perfect sense.  Do you know if CASSANDRA-6961
will be ported to 1.2.x ?

And apologies if these appear to be dumb questions, but is a repair more
suitable than a rebuild because the rebuild only contacts 1 replica (per
range), which may itself contain stale data ?

Thanks

Matt




On Wed, Jun 4, 2014 at 11:03 AM, Robert Coli  wrote:

> On Tue, Jun 3, 2014 at 3:48 PM, Matthew Allen 
> wrote:
>
>> Just out of curiosity, for a dead node, would it be possible to just
>>
>>  - replace the node (no data in data/commit dirs), same IP Address, same
>> hostname.
>>  - restore the cassandra.yaml (initial_token etc)
>>  - set auto_bootstrap:false
>>  - start it up and then run a nodetool rebuild ?
>>
>> Or would the Host ID value change with the new node ?
>>
>
> That would work, but until CASSANDRA-6961 [1] there is no way to prevent
> this node from having a long window where it may serve stale reads at CLs
> below QUORUM, until the rebuild completes.
>
> "rebuild" gets you exactly one replica's worth of data, just like
> bootstrap does. If you want to actually sync a node with all of its
> replicas and RF>2, you want "repair" and not "rebuild." I wish "rebuild"
> had been named something else, because people seem to think it does
> something it doesn't do. This property of decreasing what I call "unique
> replica count" is why people like me prefer to back up their nodes with
> something like tablesnap [2], so that losing a node does not decrease the
> "unique replica count." A simpler solution if you want to avoid the chance
> of inconsistency is to operate with CL.QUORUM instead of CL.ONE.
>
> You'd be better off leaving auto_bootstrap set to true and setting
> -Dcassandra.replace_address, which bootstraps you (from a single-replica
> source per range) to the token owned by the dead node. This is exactly like
> your process above, except that you don't serve stale reads while doing so.
>
> That said, the single-replica source thing is why people want to first
> bootstrap (which does the same single-replica source thing as "rebuild" but
> does not serve writes while it does so) and then repair and then, finally,
> join the ring. Note that if writes are incoming, this does not actually
> *close* the race window for stale reads at ONE, it just makes it much
> shorter.
>
> =Rob
> [1] https://issues.apache.org/jira/browse/CASSANDRA-6961
> [2] https://github.com/JeremyGrosser/tablesnap
>