Check snapshot / sstable integrity

2017-01-12 Thread Jérôme Mainaud
Hello,

Is there any tool to test the integrity of a snapshot?

Suppose I have a snapshot based backup stored in an external low cost
storage system that I want to restore to a database after someone deleted
important data by mistake.

Before restoring the files, I will truncate the table to remove the
problematic tombstones.

But my Op holds back my arm and asks: "Are you sure that the snapshot is safe
and will restore correctly before we truncate the data we have?"

Even if this scenario is theoretical, the question is a good one: how can I
verify that a snapshot is clean?

Thank you,

-- 
Jérôme Mainaud
jer...@mainaud.com


Re: Check snapshot / sstable integrity

2017-01-12 Thread Alain RODRIGUEZ
Hi Jérôme,

About this concern:

> But my Op holds back my arm and asks: "Are you sure that the snapshot is safe
> and will restore correctly before we truncate the data we have?"


Make sure to enable snapshot on truncate (cassandra.yaml) or do it
manually. This way, if the restored dataset is worse than the current one
(the one you plan to truncate), you can always roll back this truncate /
restore action. This way you can tell your "Op" that this is perfectly safe
anyway; no data would be lost, even in the worst-case scenario (not
considering the downtime that would be induced). Plus this snapshot is
cheap (hard links) and does not need to be moved around or kept once you are
sure the restored backup fits your needs.
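
For reference, a minimal sketch of the setting and command meant here:
auto_snapshot is the cassandra.yaml flag that makes Cassandra snapshot a table
before it is truncated or dropped, and nodetool snapshot takes one manually
(the tag and keyspace names below are only illustrative):

    # cassandra.yaml: take a snapshot automatically before TRUNCATE / DROP
    auto_snapshot: true

    # or take one manually right before the truncate (tag / keyspace are examples)
    nodetool snapshot -t before_restore my_keyspace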

Truncate is definitely the way to go before restoring a backup. Parsing the
data to delete it all is not really an option imho.

Then, about the technical question "how do I know that a snapshot is clean",
it would be good to define "clean". You can make sure the backup is readable,
consistent enough, and corresponds to what you want by loading all the
sstables into a testing cluster and performing some reads there before
doing it in production. You can use, for example, AWS EC2 machines with big
EBS volumes attached, and use sstableloader to load the data into them.
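
As a rough sketch of that sstableloader approach (the host addresses and the
path are placeholders; the path has to point at a keyspace/table directory
containing the sstables to load):

    # stream the backup's sstables into a testing cluster
    sstableloader -d 10.0.0.1,10.0.0.2 /path/to/my_keyspace/my_table/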

If you are just worried about SSTable format validity, there is no tool I
am aware of that checks whether sstables are well formed, but one might exist
or be doable. Another option might be to compute a checksum for each sstable
before uploading it elsewhere and make sure it matches when the file is
downloaded back. Those are the first things that come to my mind.
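
A minimal sketch of the checksum idea, assuming a POSIX shell with sha256sum
available (any strong checksum tool would do; the directory names are
illustrative and follow the my_table-some-uuid layout discussed later in this
digest):

    # before upload: record a checksum for every file in the snapshot
    cd my_keyspace/my_table-some-uuid/snapshots/my_snapshot
    sha256sum * > /tmp/manifest.sha256

    # after download: run this from inside the restored snapshot directory
    sha256sum -c /tmp/manifest.sha256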

Hope that is helpful. Hopefully, someone else will be able to point you to
an existing tool to do this work.

Cheers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



Re: Is this normal!?

2017-01-12 Thread Alain RODRIGUEZ
 Hi,

> Nodetool repair always list lots of data and never stays repaired. I think.
>

This might be the reason:

"incremental: true"


Incremental repair is the default in your version. It marks data as repaired
in order to repair each piece of data only once. It is a clever feature, but
it comes with some caveats. I would read up on it, as its impacts are not
trivial to understand, and in some cases it can create issues, making
incremental repairs not such a good idea to use. Make sure to run a full
repair instead when a node goes down, for example.
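
For reference, a hedged sketch of what a full (non-incremental) repair looks
like from nodetool; the keyspace name is illustrative:

    # run a full repair instead of the default incremental one
    nodetool repair --full my_keyspace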

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2017-01-11 15:21 GMT+01:00 Cogumelos Maravilha :

> Nodetool repair always list lots of data and never stays repaired. I think.
>
> Cheers
>
>
> On 01/11/2017 02:15 PM, Hannu Kröger wrote:
> > Just to understand:
> >
> > What exactly is the problem?
> >
> > Cheers,
> > Hannu
> >
> >> On 11 Jan 2017, at 16.07, Cogumelos Maravilha <
> cogumelosmaravi...@sapo.pt> wrote:
> >>
> >> Cassandra 3.9.
> >>
> >> nodetool status
> >> Datacenter: dc1
> >> ===
> >> Status=Up/Down
> >> |/ State=Normal/Leaving/Joining/Moving
> >> --  Address   Load   Tokens   Owns (effective)  Host
> >> ID   Rack
> >> UN  10.0.120.145  1.21 MiB   256  49.5%
> >> da6683cd-c3cf-4c14-b3cc-e7af4080c24f  rack1
> >> UN  10.0.120.179  1020.51 KiB  256  48.1%
> >> fb695bea-d5e8-4bde-99db-9f756456a035  rack1
> >> UN  10.0.120.55   1.02 MiB   256  53.3%
> >> eb911989-3555-4aef-b11c-4a684a89a8c4  rack1
> >> UN  10.0.120.46   1.01 MiB   256  49.1%
> >> 8034c30a-c1bc-44d4-bf84-36742e0ec21c  rack1
> >>
> >> nodetool repair
> >> [2017-01-11 13:58:27,274] Replication factor is 1. No repair is needed
> >> for keyspace 'system_auth'
> >> [2017-01-11 13:58:27,284] Starting repair command #4, repairing keyspace
> >> system_traces with repair options (parallelism: parallel, primary range:
> >> false, incremental: true, job threads: 1, ColumnFamilies: [],
> >> dataCenters: [], hosts: [], # of ranges: 515)
> >> [2017-01-11 14:01:55,628] Repair session
> >> 82a25960-d806-11e6-8ac4-73b93fe4986d for range
> >> [(-1278992819359672027,-1209509957304098060],
> >> (-2593749995021251600,-2592266543457887959],
> >> (-6451044457481580778,-6438233936014720969],
> >> (-1917989291840804877,-1912580903456869648],
> >> (-3693090304802198257,-3681923561719364766],
> >> (-380426998894740867,-350094836653869552],
> >> (1890591246410309420,1899294587910578387],
> >> (6561031217224224632,6580230317350171440],
> >> ... 4 pages of data
> >> , (6033828815719998292,6079920177089043443]] finished (progress: 1%)
> >> [2017-01-11 13:58:27,986] Repair completed successfully
> >> [2017-01-11 13:58:27,988] Repair command #4 finished in 0 seconds
> >>
> >> nodetool gcstats
> >> Interval (ms): 360134,  Max GC Elapsed (ms): 23,  Total GC Elapsed (ms): 23,
> >> Stdev GC Elapsed (ms): 0,  GC Reclaimed (MB): 333975216,  Collections: 1,
> >> Direct Memory Bytes: -1
> >>
> >> (wait)
> >> nodetool gcstats
> >> Interval (ms): 60016,  Max GC Elapsed (ms): 0,  Total GC Elapsed (ms): 0,
> >> Stdev GC Elapsed (ms): NaN,  GC Reclaimed (MB): 0,  Collections: 0,
> >> Direct Memory Bytes: -1
> >>
> >> nodetool repair
> >> [2017-01-11 14:00:45,888] Replication factor is 1. No repair is needed
> >> for keyspace 'system_auth'
> >> [2017-01-11 14:00:45,896] Starting repair command #5, repairing keyspace
> >> system_traces with repair options (parallelism: parallel, primary range:
> >> false, incremental: true, job threads: 1, ColumnFamilies: [],
> >> dataCenters: [], hosts: [], # of ranges: 515)
> >> ... 4 pages of data
> >> , (94613607632078948,219237792837906432],
> >> (6033828815719998292,6079920177089043443]] finished (progress: 1%)
> >> [2017-01-11 14:00:46,567] Repair completed successfully
> >> [2017-01-11 14:00:46,576] Repair command #5 finished in 0 seconds
> >>
> >> nodetool gcstats
> >> Interval (ms): 9169,  Max GC Elapsed (ms): 25,  Total GC Elapsed (ms): 25,
> >> Stdev GC Elapsed (ms): 0,  GC Reclaimed (MB): 330518688,  Collections: 1,
> >> Direct Memory Bytes: -1
> >>
> >>
> >> Always in loop, I think!
> >>
> >> Thanks in advance.
> >>
>
>


Re: Strange issue wherein cassandra not being started from cron

2017-01-12 Thread Alain RODRIGUEZ
Hi Ajay, honestly I would try to fix the main issue:

> Sometimes, the cassandra-process gets killed (reason unknown as of now).


Focusing on how to restart Apache Cassandra every minute sounds like the
wrong approach to me:

> Adding this in cron would at least ensure that the maximum downtime is 59
> seconds (till the root cause of the Cassandra crashes is known).


What happens if 2, 3, or 10 nodes go down at once? Also, hints caused by this
issue, read repairs, and other anti-entropy mechanisms will continuously be
triggered. It doesn't sound healthy, predictable, or even like a working
solution.

When Cassandra stops, it can be due to a heap OOM (see system.log or gc.log),
a native OOM (see kernel / system logs), or some other issue logged in
system.log. "nodetool tpstats" is also often useful. If you want to fix
this, we would probably be able to help you with it.
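
A hedged sketch of the kind of checks meant here, assuming a package install
with logs under /var/log/cassandra (paths vary by installation):

    # heap OOM or other fatal errors in Cassandra's own logs
    grep -iE "OutOfMemory|ERROR" /var/log/cassandra/system.log | tail

    # native OOM: did the kernel OOM killer take the JVM down?
    dmesg -T | grep -iE "killed process|out of memory"

    # thread pool backlog and dropped messages
    nodetool tpstats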

About your cron issue, have you tried using sudo? As it is a quick fix, it
can probably be dirty. Just make sure new sstables are being written by the
proper user ("cassandra" and not "root"). But again, I would not go down
that path; I would rather fix the issue in Cassandra than lose up to a
minute with crontab.
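
If the crontab workaround is kept anyway, a hedged sketch of a slightly less
dirty version (written in /etc/cron.d format; the service name and paths
assume a standard package install, so adjust as needed):

    # restart only when the process is gone; let the init system start it as the cassandra user
    * * * * * root pgrep -f CassandraDaemon > /dev/null || service cassandra start

    # if Cassandra was ever started as root, fix ownership before restarting
    # chown -R cassandra:cassandra /var/lib/cassandra /var/log/cassandra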

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-01-12 7:55 GMT+01:00 Benjamin Roth :

> Yes, but it is legitimate to supervise and monitor nodes. I only doubt
> that cron is the best tool for it.
>
> 2017-01-12 7:42 GMT+01:00 Martin Schröder :
>
>> 2017-01-12 6:12 GMT+01:00 Ajay Garg :
>> > Sometimes, the cassandra-process gets killed (reason unknown as of now).
>>
>> That's why you have a cluster of them.
>>
>> Best
>>Martin
>>
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>


Metric to monitor partition size

2017-01-12 Thread Saumitra S
Is there any metric or way to find out if any partition has grown beyond a
certain size or certain row count?

If a partition reaches a certain size or limit, I want to stop sending
further write requests to it. Is it possible?


Re: Is this normal!?

2017-01-12 Thread Romain Hardouin
Just a side note: increase system_auth keyspace replication factor if you're 
using authentication. 
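
For reference, a hedged sketch of what that could look like in cqlsh, assuming
NetworkTopologyStrategy and a data center named dc1 (both names are
illustrative); after changing the replication factor, system_auth needs to be
repaired on each node:

    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
    -- then, on every node:
    -- nodetool repair system_auth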


Re: Backups eating up disk space

2017-01-12 Thread Alain RODRIGUEZ
My 2 cents,

> As I mentioned earlier, we're not currently using snapshots - it's only the
> backups that are bothering me right now.


I believe the backups folder is just the new name for what was previously
called the snapshots folder. But I could be completely wrong; I haven't
played that much with snapshots in recent versions yet.

Anyway, some operations in Apache Cassandra can trigger a snapshot:

- Repair (when not using parallel option but sequential repairs instead)
- Truncating a table (by default)
- Dropping a table (by default)
- Maybe other I can't think of... ?

If you want to clean up space but still keep a backup, you can run:

"nodetool clearsnapshot"
"nodetool snapshot <keyspace>"

This way, and for a while, the data won't take extra space, as old files will
be cleaned up and new files will only be hardlinks, as detailed above. Then
you might want to work on a proper backup policy, which probably implies
getting data out of the production servers (a lot of people use S3 or similar
services). Or just do that from time to time, meaning you only keep one
backup and disk space behaviour will be hard to predict.
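
For reference, a hedged sketch of those commands (the tag and keyspace names
are illustrative):

    # remove all existing snapshots on this node
    nodetool clearsnapshot

    # take a fresh snapshot, which is cheap because it only creates hard links
    nodetool snapshot -t fresh_backup my_keyspace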

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar :

> Hi Kunal,
>
> Razi's post does give a very lucid description of how cassandra manages
> the hard links inside the backup directory.
>
> Where it needs clarification is the following:
> --> incremental backups is a system-wide setting, so it's an all-or-nothing
> approach
>
> --> as multiple people have stated, incremental backups do not create hard
> links to compacted sstables. however, this can bloat the size of your
> backups
>
> --> again as stated, it is a general industry practice to place backups in
> a different secondary storage location than the main production site. So
> best to move it to the secondary storage before applying rm on the backups
> folder
>
> In my experience with production clusters, managing the backups folder
> across multiple nodes can be painful if the objective is to ever recover
> data. With the usual disclaimers, better to rely on third party vendors to
> accomplish the needful rather than scripts/tablesnap.
>
> Regards
> Prasenjit

Re: Backups eating up disk space

2017-01-12 Thread Khaja, Raziuddin (NIH/NLM/NCBI) [C]
Thanks, Prasenjit, I appreciate the compliment ☺

Kunal, to add to Prasenjit's comment, it doesn't make sense to make backups
unless they are moved to secondary storage. This means that if you don't plan to
move the backups to secondary storage, you should set incremental_backups:
false, and instead rely on replication and full repair in order to rebuild a
node that has had a catastrophic failure.

I assume that you are not moving backups to secondary storage, so to save 
space, I would turn it off.
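
A minimal sketch of the suggested change, in cassandra.yaml (the file is read
at startup, so a restart is needed for it to take effect; recent versions also
have nodetool enablebackup / disablebackup to toggle this at runtime, if I
recall correctly):

    # cassandra.yaml: stop hard-linking newly flushed or streamed sstables into backups/
    incremental_backups: false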

Best regards,
-Razi


On Wed, Jan 11, 2017 at 7:49 AM, Khaja, Raziuddin (NIH/NLM/NCBI) [C]
<raziuddin.kh...@nih.gov> wrote:
Hello Kunal,

Caveat: I am not a super-expert on Cassandra, but it helps to explain to 
others, in order to eventually become an expert, so if my explanation is wrong, 
I would hope others would correct me. ☺

The active sstables/data files are all the files located in the directory
for the table.
You can safely remove all files under the backups/ directory and the directory 
itself.
Removing any files that are current hard-links inside backups won’t cause any 
issues, and I will explain why.

Have you looked at your cassandra.yaml file and checked the setting for
incremental_backups?  If it is set to true, and you don’t want to make new 
backups, you can set it to false, so that after you clean up, you will not have 
to clean up the backups again.

Explanation:
Let's look at the definition of incremental backups again: "Cassandra
creates a hard link to each SSTable flushed or streamed locally in a backups 
subdirectory of the keyspace data.”

Suppose we have a directory path: my_keyspace/my_table-some-uuid/backups/
In the rest of the discussion, when I refer to “table directory”, I explicitly 
mean the directory: my_keyspace/my_table-some-uuid/
When I refer to backups/ directory, I explicitly mean: 
my_keyspace/my_table-some-uuid/backups/

Suppose that you have an sstable-A that was either flushed from a memtable or 
streamed from another node.
At this point, you have a hardlink to sstable-A in your table directory, and a 
hardlink to sstable-A in your backups/ directory.
Suppose that you have another sstable-B that was also either flushed from a 
memtable or streamed from another node.
At this point, you have a hardlink to sstable-B in your table directory, and a 
hardlink to sstable-B in your backups/ directory.

Next, suppose compaction were to occur, where say sstable-A and sstable-B would 
be compacted to produce sstable-C, representing all the data from A and B.
Now, sstable-C will live in your main table directory, and the hardlinks to 
sstable-A and sstable-B will be deleted in the main table directory, but 
sstable-A and sstable-B will continue to exist in /backups.
At this point, in your main table directory, you will have a hardlink to 
sstable-C. In your backups/ directory you will have hardlinks to sstable-A, and 
sstable-B.

Thus, your main table directory is not cluttered with old un-compacted 
sstables, and only has the sstables along with other files that are actively 
being used.

To drive the point home, …
Suppose that you have another sstable-D that was either flushed from a memtable 
or streamed from another node.
At this point, in your main table directory, you will have sstable-C and 
sstable-D. In your backups/ directory you will have hardlinks to sstable-A, 
sstable-B, and sstable-D.

Next, suppose compaction were to occur where say sstable-C and sstable-D would 
be compacted to produce sstable-E, representing all the data from C and D.
Now, sstable-E will live in your main table directory, and the hardlinks to 
sstable-C and sstable-D will be deleted in the main table directory, but 
sstable-D will continue to exist in /backups.
At this point, in your main table directory, you will have a hardlink to
sstable-E. In your backups/ directory, you will have hardlinks to sstable-A,
sstable-B, and sstable-D.
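
To illustrate the hard-link behaviour described above with plain shell
commands (nothing Cassandra-specific; GNU stat assumed, and the scratch
directory is made up):

    mkdir -p demo/backups && cd demo
    echo data > sstable-A
    ln sstable-A backups/sstable-A     # a second name for the same inode
    stat -c '%h' sstable-A             # link count is now 2
    rm backups/sstable-A               # deleting one name...
    stat -c '%h' sstable-A             # ...leaves the data intact, link count back to 1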

Re: Backups eating up disk space

2017-01-12 Thread Khaja, Raziuddin (NIH/NLM/NCBI) [C]
snapshots are slightly different than backups.

In my explanation of the hardlinks created in the backups folder, notice that
compacted sstables never end up in the backups folder.

On the other hand, a snapshot is meant to represent the data at a particular
moment in time. Thus, the snapshots directory contains hardlinks to all active
sstables at the time the snapshot was taken. That includes compacted sstables,
as well as any sstables from memtable flushes or streamed from other nodes
that exist in both the table directory and the backups directory.

So, that would be the difference between snapshots and backups.
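
A hedged sketch of how to see that difference on disk (the data directory path
assumes a default install, and the keyspace/table names are illustrative):

    # list the snapshots known to this node, with their sizes
    nodetool listsnapshots

    # compare the two directories for one table
    du -sh /var/lib/cassandra/data/my_keyspace/my_table-*/snapshots \
           /var/lib/cassandra/data/my_keyspace/my_table-*/backups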

Best regards,
-Razi



Queries execution time

2017-01-12 Thread D. Salvatore
Hi,
Does anyone know if there is a way to record the total or partial execution
time of queries in a log file? I am interested in something similar to the
tracing option, but written to a file.

Thanks
Best Regards
Salvatore


Re: Queries execution time

2017-01-12 Thread Benjamin Roth
Hi Salvatore,

1. Cassandra offers tons of metrics through JMX to monitor performance at the
keyspace and CF level
2. There is a config option to log slow queries; unfortunately JIRA is
currently down, so I can't find the ticket with more details
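
Regarding point 2, and hedging because the ticket cannot be checked right now:
if memory serves, Cassandra around 3.10 added a cassandra.yaml knob for this,
so treat the name below as an assumption to verify:

    # cassandra.yaml (3.10+, as far as I remember): log queries slower than this threshold
    slow_query_log_timeout_in_ms: 500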




-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Queries execution time

2017-01-12 Thread Jonathan Haddad
You're likely to benefit a lot more if you log query times from your
application, as you can customize the metadata that you add around logging
to increase its relevancy.



Re: Queries execution time

2017-01-12 Thread Voytek Jarnot
We use QueryLogger, which is baked into the DataStax Java driver; it gives you
basic query execution times (and bind params) in your logs and can be tweaked
using log levels.
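
For reference, a hedged sketch of wiring up the driver's QueryLogger (DataStax
Java driver 3.x assumed; the contact point and threshold are illustrative).
The loggers under com.datastax.driver.core.QueryLogger also need to be set to
DEBUG in the logging configuration for anything to show up:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.QueryLogger;

    public class QueryLoggerExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")   // illustrative contact point
                    .build();
            // log queries slower than 300 ms as "slow", the rest at the NORMAL level
            QueryLogger queryLogger = QueryLogger.builder()
                    .withConstantThreshold(300)
                    .build();
            cluster.register(queryLogger);          // QueryLogger is a LatencyTracker
            // ... use cluster.connect() as usual, then close when done
            cluster.close();
        }
    }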



Re: WriteTimeoutException When only One Node is Down

2017-01-12 Thread Yuji Ito
Hi Shalom,

I also got WriteTimeoutException in my destructive test like your test.

When did you drop a node?
A coordinator node sends a write request to all replicas.
When one of the nodes goes down while the request is being executed, a
WriteTimeoutException sometimes happens.

cf. http://www.datastax.com/dev/blog/cassandra-error-handling-done-right
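
In the spirit of the error-handling post linked above, a minimal sketch
(DataStax Java driver 3.0.x assumed) of retrying a write once after a
WriteTimeoutException; this is only safe when the statement is idempotent:

    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.exceptions.WriteTimeoutException;

    public final class WriteRetry {
        // The write may already have been applied on some replicas when the
        // timeout is thrown, so only retry statements that are idempotent.
        public static void executeWithOneRetry(Session session, Statement stmt) {
            try {
                session.execute(stmt);
            } catch (WriteTimeoutException e) {
                session.execute(stmt);
            }
        }
    }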

Thanks,
Yuji



On Thu, Jan 12, 2017 at 4:26 PM, Shalom Sagges 
wrote:

> Hi Everyone,
>
> I'm using C* v3.0.9 for a cluster of 3 DCs with RF 3 in each DC. All
> read/write queries are set to consistency LOCAL_QUORUM.
> The relevant keyspace is built as follows:
>
> *CREATE KEYSPACE mykeyspace WITH replication = {'class':
> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'}  AND
> durable_writes = true;*
>
> I use* Datastax driver 3.0.1*
>
>
> When I performed a resiliency test for the application, each time I
> dropped one node, the client got the following error:
>
>
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
> timeout during write query at consistency TWO (2 replica were required but
> only 1 acknowledged the write)
> at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(
> WriteTimeoutException.java:73)
> at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(
> WriteTimeoutException.java:26)
> at com.datastax.driver.core.DriverThrowables.propagateCause(
> DriverThrowables.java:37)
> at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(
> DefaultResultSetFuture.java:245)
> at com.datastax.driver.core.AbstractSession.execute(
> AbstractSession.java:63)
> at humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.
> updateJprunDomains(CassandraSiteDataDaoSpring.java:121)
> at humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.
> createOrUpdate(CassandraSiteDataDaoSpring.java:97)
> at humanclick.ldapAdapter.dataUpdater.impl.SiteDataToLdapUpdater.update(
> SiteDataToLdapUpdater.java:280)
>
>
> After a few seconds the error no longer recurs. I have no idea why there's
> a timeout since there are additional replicas that satisfy the consistency
> level, and I'm more baffled when the error showed *"Cassandra timeout
> during write query at consistency TWO (2 replica were required but only 1
> acknowledged the write)"*
>
> Any ideas?  I'm quite at a loss here.
>
> Thanks!
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
>  We Create Meaningful Connections
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>