Change primary key from int to bigint

2017-01-11 Thread Benjamin Roth
Hi there,

Does anyone know if there is a hack to change an "int" to a "bigint" in a
primary key?
I realized very late that I chose the wrong type, and our production DB
already contains billions of records :(
Is there maybe a hack for it, since int and bigint are similar types, or do
the SSTable serialization and maybe the token generation require the
tables to be completely re-read and rewritten?

-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Point in time restore

2017-01-11 Thread Stefan Podkowinski
Hi Hannu

It should be as simple as copying the archived commit logs to the recovery
directory, specifying the point in time you would like to restore from the
logs using the 'restore_point_in_time' setting, and afterwards starting the
node.
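
For reference, a minimal sketch of what conf/commitlog_archiving.properties
might look like for this flow (the paths and the timestamp are placeholders,
and the archive_command must already have been active while the commit logs
you want to replay were written):

cat > /etc/cassandra/commitlog_archiving.properties <<'EOF'
# archive each commit log segment as it is closed
archive_command=/bin/cp %path /backup/commitlog_archive/%name
# on restart, replay archived segments from this directory
restore_command=/bin/cp -f %from %to
restore_directories=/backup/commitlog_archive
# stop replaying mutations after this point in time (GMT)
restore_point_in_time=2017:01:10 12:00:00
EOF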

On Tue, Jan 10, 2017 at 7:45 PM, Hannu Kröger  wrote:

> Hello,
>
> Are there any guides how to do a point-in-time restore for Cassandra?
>
> All I have seen is this:
> http://docs.datastax.com/en/archived/cassandra/2.0/
> cassandra/configuration/configLogArchive_t.html
>
> That gives an idea how to store the data for restore but how to do an
> actual restore is still a mystery to me.
>
> Any pointers?
>
> Cheers,
> Hannu
>


Re: Change primary key from int to bigint

2017-01-11 Thread Tom van der Woerdt
Hi Benjamin,

bigint and int have incompatible serialization types, so that won't work.
However, changing to 'varint' will work fine.

Hope that helps.

Tom




Re: Change primary key from int to bigint

2017-01-11 Thread Benjamin Roth
Phew! You saved my life, thanks!

For my understanding:
When creating a new table, is bigint or varint a better choice for storing
(up to) 64-bit ints? Is there a difference in performance?


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Change primary key from int to bigint

2017-01-11 Thread Tom van der Woerdt
Actually, come to think of it, there's a subtle serialization difference
between varint and int that will break token generation (see bottom of
mail). I think it's a bug that Cassandra will allow this, so don't do this
in production.

You can think of varint encoding as regular bigints with all the leading
zero bytes stripped off. This means the varint decoder will happily decode
the tinyint, smallint, int, and bigint types, but the encoder won't
necessarily re-encode to the same thing. Specifically, any int below
8388608 will have a different encoding in a varint.
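
To make that concrete with the raw big-endian bytes: 65536 serializes as
00 01 00 00 as an int but as 01 00 00 as a varint, so its token changes,
while 16777215 is 00 FF FF FF in both encodings (the leading zero byte
cannot be dropped without flipping the sign bit) and keeps its token, which
is exactly what the session below shows.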

There's a small performance impact with the varint encoding and decoding
scheme, but likely insignificant for any reasonable use case.

Tom






cqlsh> select * from foo where id in (1, 128, 256, 65535, 65536, 16777215, 16777216, 2147483647);

 id         | value
------------+-------
          1 |  test
        128 |  test
        256 |  test
      65535 |  test
      65536 |  test
   16777215 |  test
   16777216 |  test
 2147483647 |  test

(8 rows)
cqlsh> alter table foo alter id TYPE varint;
cqlsh> select * from foo where id in (1, 128, 256, 65535, 65536, 16777215, 16777216, 2147483647);

 id         | value
------------+-------
   16777215 |  test
   16777216 |  test
 2147483647 |  test

(3 rows)
cqlsh> select * from foo;

 id         | value
------------+-------
        128 |  test
   16777216 |  test
          1 |  test
 2147483647 |  test
   16777215 |  test
        256 |  test
      65535 |  test
      65536 |  test





Re: Change primary key from int to bigint

2017-01-11 Thread Benjamin Roth
Wow, okay! Fortunately I have not changed the types yet!

So there is no other way than reading the whole table and re-inserting all
the data?
Is there a faster way than doing all this with CQL? Like importing existing
SSTables directly into a new CF with the new column types?



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Change primary key from int to bigint

2017-01-11 Thread Benjamin Roth
But it is safe to change non-primary-key columns from int to varint, right?



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Change primary key from int to bigint

2017-01-11 Thread Tom van der Woerdt
My understanding is that it's safe... but considering "alter type" is going
to be removed completely (
https://issues.apache.org/jira/browse/CASSANDRA-12443), maybe not.

As for faster ways to do this: no idea :-(
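
(For the record, the slow-but-safe route can at least be scripted; a rough
sketch using cqlsh's COPY, with made-up keyspace/table/column names, and
noting that CSV round-trips get painful at billions of rows, so a Spark job
or a driver-based copy may scale better:)

cqlsh -e "CREATE TABLE ks.foo_big (id bigint PRIMARY KEY, value text);"
cqlsh -e "COPY ks.foo TO '/tmp/foo.csv';"
cqlsh -e "COPY ks.foo_big FROM '/tmp/foo.csv';"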

Tom




Re: Change primary key from int to bigint

2017-01-11 Thread DuyHai Doan
I don't understand why ALTER TYPE was even allowed initially. Apart from
very few corner cases, changing data type on existing data will lead to
disaster in many cases.



Re: Backups eating up disk space

2017-01-11 Thread Kunal Gangakhedkar
Thanks for the reply, Razi.

As I mentioned earlier, we're not currently using snapshots - it's only the
backups that are bothering me right now.

So my next question pertains to this statement of yours:

> As far as I am aware, using rm is perfectly safe to delete the
> directories for snapshots/backups as long as you are careful not to delete
> your actively used sstable files and directories.


How do I find out which are the actively used sstables?
If by that you mean the main data files, does that mean I can safely remove
all files ONLY under the "backups/" directory?
Or can removing files that are currently hard-linked inside backups
potentially cause any issues?

Thanks,
Kunal

On 11 January 2017 at 01:06, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
raziuddin.kh...@nih.gov> wrote:

> Hello Kunal,
>
>
>
> I would take a look at the following configuration options in the
> Cassandra.yaml
>
>
>
> Common automatic backup settings
>
> incremental_backups:
>
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__
> incremental_backups
>
>
>
> (Default: false) Backs up data updated since the last snapshot was taken.
> When enabled, Cassandra creates a hard link to each SSTable flushed or
> streamed locally in a backups subdirectory of the keyspace data. Removing
> these links is the operator's responsibility.
>
>
>
> snapshot_before_compaction:
>
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__
> snapshot_before_compaction
>
>
>
> (Default: false) Enables or disables taking a snapshot before each
> compaction. A snapshot is useful to back up data when there is a data
> format change. Be careful using this option: Cassandra does not clean up
> older snapshots automatically.
>
>
>
>
>
> Advanced automatic backup setting
>
> auto_snapshot:
>
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/configuration/configCassandra_yaml.html#
> configCassandra_yaml__auto_snapshot
>
>
>
> (Default: true) Enables or disables whether Cassandra takes a snapshot of
> the data before truncating a keyspace or dropping a table. To prevent data
> loss, Datastax strongly advises using the default setting. If you
> set auto_snapshot to false, you lose data on truncation or drop.
>
>
>
>
>
> nodetool also provides methods to manage snapshots.
> http://docs.datastax.com/en/archived/cassandra/3.x/
> cassandra/tools/toolsNodetool.html
>
> See the specific commands:
>
>    - nodetool clearsnapshot: Removes one or more snapshots.
>    - nodetool listsnapshots: Lists snapshot names, size on disk, and true
>      size.
>    - nodetool snapshot: Take a snapshot of one or more keyspaces, or of a
>      table, to backup data.
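>
> A quick way to see how a node is currently configured (the config path
> varies by install):
>
>    grep -E '^(incremental_backups|snapshot_before_compaction|auto_snapshot)' /etc/cassandra/cassandra.yaml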
>
>
>
> As far as I am aware, using rm is perfectly safe to delete the
> directories for snapshots/backups as long as you are careful not to delete
> your actively used sstable files and directories.  I think the nodetool
> clearsnapshot command is provided so that you don’t accidentally delete
> actively used files.  Last I used clearsnapshot (a very long time
> ago), I thought it left behind the directory, but this could have been
> fixed in newer versions (so you might want to check that).
>
>
>
> HTH
>
> -Razi
>
>
>
>
>
> From: Jonathan Haddad
> Reply-To: "user@cassandra.apache.org"
> Date: Tuesday, January 10, 2017 at 12:26 PM
> To: "user@cassandra.apache.org"
> Subject: Re: Backups eating up disk space
>
>
>
> If you remove the files from the backup directory, you would not have data
> loss in the case of a node going down.  They're hard links to the same
> files that are in your data directory, and are created when an sstable is
> written to disk.  At the time, they take up (almost) no space, so they
> aren't a big deal, but when the sstable gets compacted, they stick around,
> so they end up not freeing space up.
>
>
>
> Usually you use incremental backups as a means of moving the sstables off
> the node to a backup location.  If you're not doing anything with them,
> they're just wasting space and you should disable incremental backups.
>
>
>
> Some people take snapshots then rely on incremental backups.  Others use
> the tablesnap utility which does sort of the same thing.
>
>
>
> On Tue, Jan 10, 2017 at 9:18 AM Kunal Gangakhedkar <
> kgangakhed...@gmail.com> wrote:
>
> Thanks for quick reply, Jon.
>
>
>
> But, what about in case of node/cluster going down? Would there be data
> loss if I remove these files manually?
>
>
>
> How is it typically managed in production setups?
>
> What are the best-practices for the same

Is this normal!?

2017-01-11 Thread Cogumelos Maravilha
Cassandra 3.9.

nodetool status
Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load         Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.120.145  1.21 MiB     256     49.5%             da6683cd-c3cf-4c14-b3cc-e7af4080c24f  rack1
UN  10.0.120.179  1020.51 KiB  256     48.1%             fb695bea-d5e8-4bde-99db-9f756456a035  rack1
UN  10.0.120.55   1.02 MiB     256     53.3%             eb911989-3555-4aef-b11c-4a684a89a8c4  rack1
UN  10.0.120.46   1.01 MiB     256     49.1%             8034c30a-c1bc-44d4-bf84-36742e0ec21c  rack1

nodetool repair
[2017-01-11 13:58:27,274] Replication factor is 1. No repair is needed
for keyspace 'system_auth'
[2017-01-11 13:58:27,284] Starting repair command #4, repairing keyspace
system_traces with repair options (parallelism: parallel, primary range:
false, incremental: true, job threads: 1, ColumnFamilies: [],
dataCenters: [], hosts: [], # of ranges: 515)
[2017-01-11 14:01:55,628] Repair session
82a25960-d806-11e6-8ac4-73b93fe4986d for range
[(-1278992819359672027,-1209509957304098060],
(-2593749995021251600,-2592266543457887959],
(-6451044457481580778,-6438233936014720969],
(-1917989291840804877,-1912580903456869648],
(-3693090304802198257,-3681923561719364766],
(-380426998894740867,-350094836653869552],
(1890591246410309420,1899294587910578387],
(6561031217224224632,6580230317350171440],
... 4 pages of data
, (6033828815719998292,6079920177089043443]] finished (progress: 1%)
[2017-01-11 13:58:27,986] Repair completed successfully
[2017-01-11 13:58:27,988] Repair command #4 finished in 0 seconds

nodetool gcstats
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
360134         23                   23                     0                      333975216          1            -1

(wait)
nodetool gcstats
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
60016          0                    0                      NaN                    0                  0            -1

nodetool repair
[2017-01-11 14:00:45,888] Replication factor is 1. No repair is needed
for keyspace 'system_auth'
[2017-01-11 14:00:45,896] Starting repair command #5, repairing keyspace
system_traces with repair options (parallelism: parallel, primary range:
false, incremental: true, job threads: 1, ColumnFamilies: [],
dataCenters: [], hosts: [], # of ranges: 515)
... 4 pages of data
, (94613607632078948,219237792837906432],
(6033828815719998292,6079920177089043443]] finished (progress: 1%)
[2017-01-11 14:00:46,567] Repair completed successfully
[2017-01-11 14:00:46,576] Repair command #5 finished in 0 seconds

nodetool gcstats
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
9169           25                   25                     0                      330518688          1            -1


Always in loop, I think!

Thanks in advance.



Re: Is this normal!?

2017-01-11 Thread Hannu Kröger
Just to understand:

What exactly is the problem?

Cheers,
Hannu

> On 11 Jan 2017, at 16.07, Cogumelos Maravilha wrote:
> 
> Cassandra 3.9.
> 
> Always in loop, I think!
> 
> Thanks in advance.
> 



Re: Is this normal!?

2017-01-11 Thread Cogumelos Maravilha
Nodetool repair always lists lots of data and never stays repaired, I think.
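
(One way to check whether incremental repair is actually marking data as
repaired is sstablemetadata; a sketch, with a placeholder data path and the
3.x "mc" sstable naming, where "Repaired at: 0" means still unrepaired:)

sstablemetadata /var/lib/cassandra/data/system_traces/*/mc-*-big-Data.db | grep 'Repaired at'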

Cheers


On 01/11/2017 02:15 PM, Hannu Kröger wrote:
> Just to understand:
>
> What exactly is the problem?
>
> Cheers,
> Hannu
>
>> On 11 Jan 2017, at 16.07, Cogumelos Maravilha wrote:
>>
>> Cassandra 3.9.
>>
>> Always in loop, I think!
>>
>> Thanks in advance.
>>



Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Ajay Garg
Tried everything.
Every other cron job/script I try works, just the cassandra-service does
not.

On Wed, Jan 11, 2017 at 8:51 AM, Edward Capriolo 
wrote:

>
>
> On Tuesday, January 10, 2017, Jonathan Haddad  wrote:
>
>> Last I checked, cron doesn't load the same, full environment you see when
>> you log in. Also, why put Cassandra on a cron?
>> On Mon, Jan 9, 2017 at 9:47 PM Bhuvan Rawal  wrote:
>>
>>> Hi Ajay,
>>>
>>> Have you had a look at cron logs? - mine is in path /var/log/cron
>>>
>>> Thanks & Regards,
>>>
>>> On Tue, Jan 10, 2017 at 9:45 AM, Ajay Garg 
>>> wrote:
>>>
 Hi All.

 Facing a very weird issue, wherein the command

 /etc/init.d/cassandra start

 causes cassandra to start when the command is run from command-line.


 However, if I put the above as a cron job



 * * * * * /etc/init.d/cassandra start

 cassandra never starts.


 I have checked, and "cron" service is running.


 Any ideas what might be wrong?
 I am pasting the cassandra script for reference.


 Thanks and Regards,
 Ajay


 
 
 #! /bin/sh
 ### BEGIN INIT INFO
 # Provides:  cassandra
 # Required-Start:$remote_fs $network $named $time
 # Required-Stop: $remote_fs $network $named $time
 # Should-Start:  ntp mdadm
 # Should-Stop:   ntp mdadm
 # Default-Start: 2 3 4 5
 # Default-Stop:  0 1 6
 # Short-Description: distributed storage system for structured data
 # Description:   Cassandra is a distributed (peer-to-peer) system
 for
 #the management and storage of structured data.
 ### END INIT INFO

 # Author: Eric Evans 

 DESC="Cassandra"
 NAME=cassandra
 PIDFILE=/var/run/$NAME/$NAME.pid
 SCRIPTNAME=/etc/init.d/$NAME
 CONFDIR=/etc/cassandra
 WAIT_FOR_START=10
 CASSANDRA_HOME=/usr/share/cassandra
 FD_LIMIT=10

 [ -e /usr/share/cassandra/apache-cassandra.jar ] || exit 0
 [ -e /etc/cassandra/cassandra.yaml ] || exit 0
 [ -e /etc/cassandra/cassandra-env.sh ] || exit 0

 # Read configuration variable file if it is present
 [ -r /etc/default/$NAME ] && . /etc/default/$NAME

 # Read Cassandra environment file.
 . /etc/cassandra/cassandra-env.sh

 if [ -z "$JVM_OPTS" ]; then
 echo "Initialization failed; \$JVM_OPTS not set!" >&2
 exit 3
 fi

 export JVM_OPTS

 # Export JAVA_HOME, if set.
 [ -n "$JAVA_HOME" ] && export JAVA_HOME

 # Load the VERBOSE setting and other rcS variables
 . /lib/init/vars.sh

 # Define LSB log_* functions.
 # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
 . /lib/lsb/init-functions

 #
 # Function that returns 0 if process is running, or nonzero if not.
 #
 # The nonzero value is 3 if the process is simply not running, and 1 if
 the
 # process is not running but the pidfile exists (to match the exit
 codes for
 # the "status" command; see LSB core spec 3.1, section 20.2)
 #
 CMD_PATT="cassandra.+CassandraDaemon"
 is_running()
 {
 if [ -f $PIDFILE ]; then
 pid=`cat $PIDFILE`
 grep -Eq "$CMD_PATT" "/proc/$pid/cmdline" 2>/dev/null && return 0
 return 1
 fi
 return 3
 }
 #
 # Function that starts the daemon/service
 #
 do_start()
 {
 # Return
 #   0 if daemon has been started
 #   1 if daemon was already running
 #   2 if daemon could not be started

 ulimit -l unlimited
 ulimit -n "$FD_LIMIT"

 cassandra_home=`getent passwd cassandra | awk -F ':' '{ print $6;
 }'`
 heap_dump_f="$cassandra_home/java_`date +%s`.hprof"
 error_log_f="$cassandra_home/hs_err_`date +%s`.log"

 [ -e `dirname "$PIDFILE"` ] || \
 install -d -ocassandra -gcassandra -m755 `dirname $PIDFILE`



 start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p
 "$PIDFILE" -t >/dev/null || return 1

 start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -b -p
 "$PIDFILE" -- \
 -p "$PIDFILE" -H "$heap_dump_f" -E "$error_log_f" >/dev/null ||
 return 2

 }

 #
 # Function that stops the daemon/service
 #
 do_stop()
 {
 # Return
 #   0 if daemon has been stopped
 #   1 if daemon was already stopped
 #   2 if daemon could not be stopped
 #   other if a failure occurred
 start-stop-daemon -K -p "$PIDFILE" -R TERM/30/KILL/5 >/dev/null
 RET=$?
 rm -f "$PIDFILE"
 return $RET
 }

 case "$1" in
   start)
 [ "$VERBOSE" != 

System Keyspace replication for multi DC clusters

2017-01-11 Thread John Hughes
Hi All,

I have been looking for definitive information on this, and either it
doesn't seem to exist, or I cannot find the correct combination of
keywords to find it (entirely possible, maybe even likely).

When setting up multi-rack/multi-DC clusters (currently I am deploying in
AWS across multi-AZ/multi-region with the AWS multi-region snitch), which
system keyspaces need to be moved to NetworkTopologyStrategy?

Currently I have been doing system_auth and system_distributed, but other
than system_auth (needed to make sure all my users are everywhere,
especially a replacement admin), I cannot find an authoritative document.
This feels like something that should be in the FAQ, or just in a section
about deploying to multi-DC.
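
(For what it's worth, the switch itself is one statement per keyspace,
followed by a repair of that keyspace; a sketch with made-up data center
names and replication factors:)

cqlsh -e "ALTER KEYSPACE system_auth WITH replication =
  {'class': 'NetworkTopologyStrategy', 'us-east': 3, 'us-west': 3};"
nodetool repair system_auth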

Does anyone have a link to authoritative info?

Thanks in advance!

John Hughes


Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Martin Schröder
2017-01-11 15:42 GMT+01:00 Ajay Garg :
> Tried everything.

Then try
   service cassandra start
or
   systemctl start cassandra

You still haven't explained to us why you want to start cassandra every minute.

Best
   Martin


Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Hannu Kröger
One possible reason is that the cassandra process gets a different user when
run differently. Check who owns the data files, and also check what gets
written into /var/log/cassandra/system.log (or whatever that was).

Hannu


Re: Backups eating up disk space

2017-01-11 Thread Khaja, Raziuddin (NIH/NLM/NCBI) [C]
Hello Kunal,

Caveat: I am not a super-expert on Cassandra, but it helps to explain to 
others, in order to eventually become an expert, so if my explanation is wrong, 
I would hope others would correct me. ☺

The active sstables/data files are all the files located in the directory
for the table.
You can safely remove all files under the backups/ directory and the directory 
itself.
Removing files that are currently hard-linked inside backups won’t cause any
issues, and I will explain why.

Have you looked at your Cassandra.yaml file and checked the setting for 
incremental_backups?  If it is set to true, and you don’t want to make new 
backups, you can set it to false, so that after you clean up, you will not have 
to clean up the backups again.

Explanation:
Let’s look at the definition of incremental backups again: “Cassandra 
creates a hard link to each SSTable flushed or streamed locally in a backups 
subdirectory of the keyspace data.”

Suppose we have a directory path: my_keyspace/my_table-some-uuid/backups/
In the rest of the discussion, when I refer to “table directory”, I explicitly 
mean the directory: my_keyspace/my_table-some-uuid/
When I refer to backups/ directory, I explicitly mean: 
my_keyspace/my_table-some-uuid/backups/

Suppose that you have an sstable-A that was either flushed from a memtable or 
streamed from another node.
At this point, you have a hardlink to sstable-A in your table directory, and a 
hardlink to sstable-A in your backups/ directory.
Suppose that you have another sstable-B that was also either flushed from a 
memtable or streamed from another node.
At this point, you have a hardlink to sstable-B in your table directory, and a 
hardlink to sstable-B in your backups/ directory.

Next, suppose compaction were to occur, where say sstable-A and sstable-B would 
be compacted to produce sstable-C, representing all the data from A and B.
Now, sstable-C will live in your main table directory, and the hardlinks to 
sstable-A and sstable-B will be deleted in the main table directory, but 
sstable-A and sstable-B will continue to exist in /backups.
At this point, in your main table directory, you will have a hardlink to 
sstable-C. In your backups/ directory you will have hardlinks to sstable-A, and 
sstable-B.

Thus, your main table directory is not cluttered with old un-compacted 
sstables, and only has the sstables along with other files that are actively 
being used.
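
(You can watch this on disk via the hard-link count; a sketch with a
placeholder data path, using GNU stat: a Data.db file reported with 2 links
still has its twin in the table directory, while a count of 1 means the
backups/ copy is the last one left:)

stat -c '%h %n' /var/lib/cassandra/data/my_keyspace/my_table-*/backups/*-Data.db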

To drive the point home, …
Suppose that you have another sstable-D that was either flushed from a memtable 
or streamed from another node.
At this point, in your main table directory, you will have sstable-C and 
sstable-D. In your backups/ directory you will have hardlinks to sstable-A, 
sstable-B, and sstable-D.

Next, suppose compaction were to occur where say sstable-C and sstable-D would 
be compacted to produce sstable-E, representing all the data from C and D.
Now, sstable-E will live in your main table directory, and the hardlinks to 
sstable-C and sstable-D will be deleted in the main table directory, but 
sstable-D will continue to exist in /backups.
At this point, in your main table directory, you will have a hardlink to 
sstable-E. In your backups/ directory you will have hardlinks to sstable-A, 
sstable-B and sstable-D.

As you can see, the backups/ directory quickly accumulates all of the
un-compacted sstables, and you can see how it progressively uses up more and
more space.
Also, note that the /backups directory does not contain sstables generated from 
compaction, such as sstable-C and sstable-E.
It is safe to delete the entire backups/ directory because all the data is 
represented in the compacted sstable-E.
I hope this explanation was clear and gives you confidence in using rm to 
delete the directory for backups/.
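
(As for the cleanup itself, a sketch with a placeholder data path; nodetool
disablebackup exists on recent 3.x releases, otherwise set
incremental_backups: false in cassandra.yaml and restart:)

nodetool disablebackup
find /var/lib/cassandra/data -type d -name backups -prune -exec rm -r {} +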

Best regards,
-Razi




Re: incremental repairs with -pr flag?

2017-01-11 Thread Paulo Motta
The objective of non-incremental primary-range repair is to avoid redoing
work, but with incremental repair anticompaction will segregate repaired
data so no extra work is done on the next repair.

You should run nodetool repair [ks] [table] on all nodes sequentially. The
more often you run it, the less time each repair will take, so just choose
the periodicity that suits you best, provided it's below gc_grace_seconds.
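
A minimal sketch of the sequential run, assuming SSH access, made-up
hostnames, and placeholder keyspace/table arguments:

for host in cass1 cass2 cass3; do
    ssh "$host" nodetool repair my_keyspace my_table || break
done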


2017-01-10 13:40 GMT-02:00 Bruno Lavoie :

>
>
> On 2016-10-24 13:39 (-0500), Alexander Dejanovski 
> wrote:
> > Hi Sean,
> >
> > In order to mitigate its impact, anticompaction is not fully executed
> when
> > incremental repair is run with -pr. What you'll observe is that running
> > repair on all nodes with -pr will leave sstables marked as unrepaired on
> > all of them.
> >
> > Then, if you think about it you realize it's no big deal as -pr is
> useless
> > with incremental repair : data is repaired only once with incremental
> > repair, which is what -pr intended to fix on full repair, by repairing
> all
> > token ranges only once instead of times the replication factor.
> >
> > Cheers,
> >
> > Le lun. 24 oct. 2016 18:05, Sean Bridges 
> a
> > écrit :
> >
> > > Hey,
> > >
> > > In the datastax documentation on repair [1], it says,
> > >
> > > "The partitioner range option is recommended for routine maintenance.
> Do
> > > not use it to repair a downed node. Do not use with incremental repair
> > > (default for Cassandra 3.0 and later)."
> > >
> > > Why is it not recommended to use -pr with incremental repairs?
> > >
> > > Thanks,
> > >
> > > Sean
> > >
> > > [1]
> > > https://docs.datastax.com/en/cassandra/3.x/cassandra/operations/
> opsRepairNodesManualRepair.html
> > > --
> > >
> > > Sean Bridges
> > >
> > > senior systems architect
> > > Global Relay
> > >
> > > sean.brid...@globalrelay.net
> > >
> > > 866.484.6630
> > > New York | Chicago | Vancouver | London (+44.0800.032.9829) |
> Singapore
> > > (+65.3158.1301)
> > >
> > >
> > > --
> > -
> > Alexander Dejanovski
> > France
> > @alexanderdeja
> >
> > Consultant
> > Apache Cassandra Consulting
> > http://www.thelastpickle.com
> >
>
> Hello,
>
> Was looking for exactly the same detail about the Datastax documentation,
> and not sure to understand everything from your response. I looked at my
> Cassandra: The Definitive Guide and nothing about this detail too.
>
> IIRC:
> - with incremental repair, it's safe to simply run 'nodetool repair' on
> each node, without any overhead or wasted resources (merkle trees building,
> compaction, etc)?
> - I've read that we must manually run anti-entropy repair on each
> node weekly, or before gc_grace_seconds (default 10 days)? Or only on a
> returning dead node?
>
> What's bad about running incremental repair on primary ranges only, node
> by node? It looks like a stepwise method to keep data consistent.
>
> In many sources I'm looking at, all examples are as «nodetool repair -pr»
> and no metion about using -full with -pr like here:
> http://www.datastax.com/dev/blog/repair-in-cassandra
>
>
> So, to keep a system healthy, with less impact:
> - what command to run nighly?
> - what command to run weekly?
>
> We're using C* 3.x
>
> Thanks
> Bruno Lavoie
>
>


Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Ajay Garg
On Wed, Jan 11, 2017 at 8:29 PM, Martin Schröder  wrote:

> 2017-01-11 15:42 GMT+01:00 Ajay Garg :
> > Tried everything.
>
> Then try
>service cassandra start
> or
>systemctl start cassandra
>
> You still haven't explained to us why you want to start cassandra every
> minute.
>

Hi Martin.

Sometimes, the cassandra process gets killed (reason unknown as of now).
Doing a manual "service cassandra start" works then.

Adding this to cron would at least ensure that the maximum downtime is 59
seconds (until the root cause of the crashes is known).
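
(Since cron runs jobs with a minimal environment, as Jonathan noted, an
/etc/cron.d entry that sets PATH explicitly and only starts the service when
the process is actually gone tends to behave better; a sketch, with flock
guarding against overlapping attempts:)

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
* * * * * root flock -n /var/lock/cassandra-watchdog.lock sh -c 'pgrep -f CassandraDaemon >/dev/null || /etc/init.d/cassandra start' >>/var/log/cassandra-watchdog.log 2>&1

(Longer term, a process supervisor such as systemd with Restart=on-failure
is the more robust way to get automatic restarts.)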



>
> Best
>Martin
>



-- 
Regards,
Ajay


Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Ajay Garg
Hi Hannu.

On Wed, Jan 11, 2017 at 8:31 PM, Hannu Kröger  wrote:

> One possible reason is that the cassandra process gets a different user
> when run differently. Check who owns the data files, and also check what
> gets written into /var/log/cassandra/system.log (or whatever that was).
>

Absolutely nothing gets written to /var/log/cassandra/system.log (when
trying to invoke cassandra via cron).


>
> Hannu
>
>
> On 11 Jan 2017, at 16.42, Ajay Garg  wrote:
>
> Tried everything.
> Every other cron job/script I try works, just the cassandra-service does
> not.
>
> On Wed, Jan 11, 2017 at 8:51 AM, Edward Capriolo 
> wrote:
>
>>
>>
>> On Tuesday, January 10, 2017, Jonathan Haddad  wrote:
>>
>>> Last I checked, cron doesn't load the same, full environment you see
>>> when you log in. Also, why put Cassandra on a cron?
>>> On Mon, Jan 9, 2017 at 9:47 PM Bhuvan Rawal  wrote:
>>>
 Hi Ajay,

 Have you had a look at cron logs? - mine is in path /var/log/cron

 Thanks & Regards,

 On Tue, Jan 10, 2017 at 9:45 AM, Ajay Garg 
 wrote:

> Hi All.
>
> Facing a very weird issue, wherein the command
>
> /etc/init.d/cassandra start
>
> causes cassandra to start when run from the command line.
>
> However, if I put the above in as a cron job
>
> * * * * * /etc/init.d/cassandra start
>
> cassandra never starts.
>
>
> I have checked, and the "cron" service is running.
>
>
> Any ideas what might be wrong?
> I am pasting the cassandra script for brevity.
>
>
> Thanks and Regards,
> Ajay
>
>
> 
> 
> #! /bin/sh
> ### BEGIN INIT INFO
> # Provides:  cassandra
> # Required-Start:$remote_fs $network $named $time
> # Required-Stop: $remote_fs $network $named $time
> # Should-Start:  ntp mdadm
> # Should-Stop:   ntp mdadm
> # Default-Start: 2 3 4 5
> # Default-Stop:  0 1 6
> # Short-Description: distributed storage system for structured data
> # Description:   Cassandra is a distributed (peer-to-peer) system
> for
> #the management and storage of structured data.
> ### END INIT INFO
>
> # Author: Eric Evans 
>
> DESC="Cassandra"
> NAME=cassandra
> PIDFILE=/var/run/$NAME/$NAME.pid
> SCRIPTNAME=/etc/init.d/$NAME
> CONFDIR=/etc/cassandra
> WAIT_FOR_START=10
> CASSANDRA_HOME=/usr/share/cassandra
> FD_LIMIT=100000
>
> [ -e /usr/share/cassandra/apache-cassandra.jar ] || exit 0
> [ -e /etc/cassandra/cassandra.yaml ] || exit 0
> [ -e /etc/cassandra/cassandra-env.sh ] || exit 0
>
> # Read configuration variable file if it is present
> [ -r /etc/default/$NAME ] && . /etc/default/$NAME
>
> # Read Cassandra environment file.
> . /etc/cassandra/cassandra-env.sh
>
> if [ -z "$JVM_OPTS" ]; then
> echo "Initialization failed; \$JVM_OPTS not set!" >&2
> exit 3
> fi
>
> export JVM_OPTS
>
> # Export JAVA_HOME, if set.
> [ -n "$JAVA_HOME" ] && export JAVA_HOME
>
> # Load the VERBOSE setting and other rcS variables
> . /lib/init/vars.sh
>
> # Define LSB log_* functions.
> # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
> . /lib/lsb/init-functions
>
> #
> # Function that returns 0 if process is running, or nonzero if not.
> #
> # The nonzero value is 3 if the process is simply not running, and 1
> if the
> # process is not running but the pidfile exists (to match the exit
> codes for
> # the "status" command; see LSB core spec 3.1, section 20.2)
> #
> CMD_PATT="cassandra.+CassandraDaemon"
> is_running()
> {
> if [ -f $PIDFILE ]; then
> pid=`cat $PIDFILE`
> grep -Eq "$CMD_PATT" "/proc/$pid/cmdline" 2>/dev/null &&
> return 0
> return 1
> fi
> return 3
> }
> #
> # Function that starts the daemon/service
> #
> do_start()
> {
> # Return
> #   0 if daemon has been started
> #   1 if daemon was already running
> #   2 if daemon could not be started
>
> ulimit -l unlimited
> ulimit -n "$FD_LIMIT"
>
> cassandra_home=`getent passwd cassandra | awk -F ':' '{ print $6;
> }'`
> heap_dump_f="$cassandra_home/java_`date +%s`.hprof"
> error_log_f="$cassandra_home/hs_err_`date +%s`.log"
>
> [ -e `dirname "$PIDFILE"` ] || \
> install -d -ocassandra -gcassandra -m755 `dirname $PIDFILE`
>
>
>
> start-stop-daemon -S -c cassandra -a /usr/sbin/cassandra -q -p
> "$PIDFILE" -t >/dev/null || return 1
>
> start-stop-daemon -S -c cassandra -a /usr/sbin/cassand

Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Benjamin Roth
I think you should take a look at supervisord or something similar. It is a
much more reliable solution for this than cron.
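
For illustration, a minimal supervisord program section could look like
this (paths and options are assumptions, untested; -f keeps Cassandra in
the foreground so supervisord can watch it):

    [program:cassandra]
    command=/usr/sbin/cassandra -f
    user=cassandra
    autostart=true
    autorestart=true

With autorestart=true, supervisord restarts the process within seconds of a
crash and records the exit in its log, which also helps find the root cause.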

On 12.01.2017 06:12, "Ajay Garg" wrote:



On Wed, Jan 11, 2017 at 8:29 PM, Martin Schröder  wrote:

> 2017-01-11 15:42 GMT+01:00 Ajay Garg :
> > Tried everything.
>
> Then try
>service cassandra start
> or
>systemctl start cassandra
>
> You still haven't explained to us why you want to start cassandra every
> minute.
>

Hi Martin.

Sometimes the cassandra process gets killed (reason unknown as of now).
Doing a manual "service cassandra start" then brings it back up.

Adding this to cron would at least ensure that the maximum downtime is 59
seconds (until the root cause of the crashes is found).



>
> Best
>Martin
>



-- 
Regards,
Ajay


Re: Backups eating up disk space

2017-01-11 Thread Prasenjit Sarkar
Hi Kunal,

Razi's post does give a very lucid description of how cassandra manages the
hard links inside the backup directory.

Where it needs clarification is the following:
--> incremental backups is a system-wide setting, so it's an all-or-nothing
approach

--> as multiple people have stated, incremental backups do not create hard
links to compacted sstables; however, this can bloat the size of your
backups

--> again, as stated, it is general industry practice to place backups in a
secondary storage location separate from the main production site, so it is
best to move them to secondary storage before applying rm to the backups
folder

In my experience with production clusters, managing the backups folder
across multiple nodes can be painful if the objective is to ever recover
data. With the usual disclaimers, it is better to rely on third-party
vendors for this rather than scripts/tablesnap.

Regards
Prasenjit

On Wed, Jan 11, 2017 at 7:49 AM, Khaja, Raziuddin (NIH/NLM/NCBI) [C] <
raziuddin.kh...@nih.gov> wrote:

> Hello Kunal,
>
>
>
> Caveat: I am not a super-expert on Cassandra, but explaining to others
> helps on the way to eventually becoming one, so if my explanation is
> wrong, I would hope others correct me. :)
>
>
>
> The active sstables/data files are all the files located in the
> directory for the table.
>
> You can safely remove all files under the backups/ directory and the
> directory itself.
>
> Removing files that currently have hard links inside backups/ won’t cause
> any issues, and I will explain why.
>
>
>
> Have you looked at your Cassandra.yaml file and checked the setting for
> incremental_backups?  If it is set to true, and you don’t want to make new
> backups, you can set it to false, so that after you clean up, you will not
> have to clean up the backups again.
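
For reference, the corresponding cassandra.yaml line would be:

    incremental_backups: false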
>
>
>
> Explanation:
>
> Let’s look at the definition of incremental backups again: “Cassandra
> creates a hard link to each SSTable flushed or streamed locally in
> a backups subdirectory of the keyspace data.”
>
>
>
> Suppose we have a directory path: my_keyspace/my_table-some-uuid/backups/
>
> In the rest of the discussion, when I refer to “table directory”, I
> explicitly mean the directory: my_keyspace/my_table-some-uuid/
>
> When I refer to backups/ directory, I explicitly mean:
> my_keyspace/my_table-some-uuid/backups/
>
>
>
> Suppose that you have an sstable-A that was either flushed from a memtable
> or streamed from another node.
>
> At this point, you have a hardlink to sstable-A in your table directory,
> and a hardlink to sstable-A in your backups/ directory.
>
> Suppose that you have another sstable-B that was also either flushed from
> a memtable or streamed from another node.
>
> At this point, you have a hardlink to sstable-B in your table directory,
> and a hardlink to sstable-B in your backups/ directory.
>
>
>
> Next, suppose compaction were to occur, where say sstable-A and sstable-B
> would be compacted to produce sstable-C, representing all the data from A
> and B.
>
> Now, sstable-C will live in your main table directory, and the hardlinks
> to sstable-A and sstable-B will be deleted in the main table directory, but
> sstable-A and sstable-B will continue to exist in /backups.
>
> At this point, in your main table directory, you will have a hardlink to
> sstable-C. In your backups/ directory you will have hardlinks to sstable-A,
> and sstable-B.
>
>
>
> Thus, your main table directory is not cluttered with old un-compacted
> sstables, and only has the sstables along with other files that are
> actively being used.
>
>
>
> To drive the point home, …
>
> Suppose that you have another sstable-D that was either flushed from a
> memtable or streamed from another node.
>
> At this point, in your main table directory, you will have sstable-C and
> sstable-D. In your backups/ directory you will have hardlinks to sstable-A,
> sstable-B, and sstable-D.
>
>
>
> Next, suppose compaction were to occur where say sstable-C and sstable-D
> would be compacted to produce sstable-E, representing all the data from C
> and D.
>
> Now, sstable-E will live in your main table directory, and the hardlinks
> to sstable-C and sstable-D will be deleted in the main table directory, but
> sstable-D will continue to exist in /backups.
>
> At this point, in your main table directory, you will have a hardlink to
> sstable-E. In your backups/ directory you will have hardlinks to sstable-A,
> sstable-B and sstable-D.
>
>
>
> As you can see, the backups/ directory quickly accumulates all the
> un-compacted sstables, progressively using up more and more space.
>
> Also, note that the /backups directory does not contain sstables generated
> from compaction, such as sstable-C and sstable-E.
>
> It is safe to delete the entire backups/ directory because all the data is
> represented in the compacted sstable-E.
>
> I hope this explanation was clear and gives you confidence in using rm to
> delete 
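
The hard-link behaviour Razi describes is easy to verify with ls -li, which
prints the inode and link count (file names, sizes, and dates below are
hypothetical):

    $ ls -li my_keyspace/my_table-some-uuid/mc-1-big-Data.db
    1234567 -rw-r--r-- 2 cassandra cassandra 52428800 Jan 11 10:00 mc-1-big-Data.db

The link count of 2 shows the same inode is also referenced from backups/,
so removing the backups/ directory only drops one reference and leaves the
live sstable intact.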

Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Martin Schröder
2017-01-12 6:12 GMT+01:00 Ajay Garg :
> Sometimes, the cassandra-process gets killed (reason unknown as of now).

That's why you have a cluster of them.

Best
   Martin


Re: Strange issue wherein cassandra not being started from cron

2017-01-11 Thread Benjamin Roth
Yes, but it is legitimate to supervise and monitor nodes. I only doubt that
cron is the best tool for it.

2017-01-12 7:42 GMT+01:00 Martin Schröder :

> 2017-01-12 6:12 GMT+01:00 Ajay Garg :
> > Sometimes, the cassandra-process gets killed (reason unknown as of now).
>
> That's why you have a cluster of them.
>
> Best
>Martin
>



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


WriteTimeoutException When only One Node is Down

2017-01-11 Thread Shalom Sagges
Hi Everyone,

I'm using C* v3.0.9 for a cluster of 3 DCs with RF 3 in each DC. All
read/write queries are set to consistency LOCAL_QUORUM.
The relevant keyspace is built as follows:

CREATE KEYSPACE mykeyspace WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'} AND
durable_writes = true;

I use Datastax driver 3.0.1.


When I performed a resiliency test for the application, each time I dropped
one node, the client got the following error:


com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
timeout during write query at consistency TWO (2 replica were required but
only 1 acknowledged the write)
at
com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:73)
at
com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:26)
at
com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at
com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
at
humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.updateJprunDomains(CassandraSiteDataDaoSpring.java:121)
at
humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.createOrUpdate(CassandraSiteDataDaoSpring.java:97)
at
humanclick.ldapAdapter.dataUpdater.impl.SiteDataToLdapUpdater.update(SiteDataToLdapUpdater.java:280)


After a few seconds the error no longer recurs. I have no idea why there's
a timeout, since there are additional replicas that satisfy the consistency
level, and I'm even more baffled that the error says "Cassandra timeout
during write query at consistency TWO (2 replica were required but only 1
acknowledged the write)" when the queries use LOCAL_QUORUM.

Any ideas?  I'm quite at a loss here.

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
 
 We Create Meaningful Connections

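
For reference, a sketch of pinning the consistency level per statement with
the DataStax Java driver 3.x (table, columns, bind values and session setup
are hypothetical):

    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    // 'session' is an already-built com.datastax.driver.core.Session
    Statement stmt = new SimpleStatement(
            "INSERT INTO mykeyspace.mytable (k, v) VALUES (?, ?)", key, value)
            .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
    session.execute(stmt);

Note that a WriteTimeoutException reporting consistency TWO can also
originate from the batchlog write of a logged batch, which Cassandra
performs at consistency TWO internally, regardless of the statement's own
consistency level.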