JBOD disk failure

2018-08-14 Thread Christian Lorenz
Hi,

Given a cluster with RF=3 and CL=LOCAL_ONE, where the application is deleting data,
what happens if the nodes are set up with JBOD and one disk fails? Do I get
consistent results while the broken drive is replaced and a nodetool repair is
running on the node with the replaced drive?

Kind regards,
Christian


data loss

2018-08-14 Thread onmstester onmstester
I am inserting into Cassandra with a simple insert query plus a counter update
query for every input record. The input rate is very high. I've configured the
update query with idempotent = true (there is no such config on the insert
query; the default is false IMHO). I've seen multiple records that have rows in
the counter table (the idempotent one) but no row in the table written by the
simple insert! I'm using executeAsync, and in the catch block I retry the
insert/update for the whole batch of statements (while-true, so it retries
until all statements have been inserted); with this I assumed everything would
be persisted in Cassandra. If a non-idempotent insert timed out, shouldn't it
throw an exception and be retried by my Java code?
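
For context, a minimal sketch of the pattern described above, assuming the
DataStax Java driver 3.x; the keyspace, table, and column names (ks.events,
ks.event_counts, id, payload, hits) are made up for illustration. The comments
note two points relevant to the question: the idempotent flag only tells the
driver whether a statement may safely be retried or speculatively executed, and
a write timeout does not mean the write was not applied.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.exceptions.WriteTimeoutException;

    public class RetryingWriter {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect();

            // Hypothetical schema: a plain table and a counter table keyed by the same id.
            Statement insert = new SimpleStatement(
                    "INSERT INTO ks.events (id, payload) VALUES (?, ?)", "id-1", "payload-1");
            // Left non-idempotent (the driver default), as described above.

            Statement counterUpdate = new SimpleStatement(
                    "UPDATE ks.event_counts SET hits = hits + 1 WHERE id = ?", "id-1");
            // This flag only tells the driver the statement is safe to retry or
            // speculatively execute; it does not make a counter update idempotent.
            counterUpdate.setIdempotent(true);

            for (Statement stmt : new Statement[] { insert, counterUpdate }) {
                while (true) {
                    try {
                        ResultSetFuture future = session.executeAsync(stmt);
                        future.getUninterruptibly(); // block; throws if the coordinator reports an error
                        break;
                    } catch (WriteTimeoutException e) {
                        // A write timeout does not mean the write failed: it may or may not
                        // have been persisted. Re-sending here is only safe for idempotent
                        // writes; for the counter update it can double-count.
                    }
                }
            }
            cluster.close();
        }
    }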

Cassandra 2.2.7 Compaction after Truncate issue

2018-08-14 Thread David Payne
Scenario: Cassandra 2.2.7, 3 nodes, RF=3 keyspace.


1.   Truncate a table.

2.   More than 24 hours later… FileCacheService is still reporting cold 
readers for sstables of truncated data on nodes 2 and 3, but not node 1.

3.   The output of nodetool compactionstats shows a stuck compaction for the 
truncated table on nodes 2 and 3, but not node 1.

This appears to be a defect that was fixed in 2.1.0. 
https://issues.apache.org/jira/browse/CASSANDRA-7803

Any ideas?

Thanks,
David Payne
c. 303-717-0548
dav...@cqg.com



90 million reads

2018-08-14 Thread Abdul Patel
Currently our Cassandra prod is an 18-node, 3-DC cluster, and the application
does 55 million reads per day; we want to add load and make it 90 million reads
per day. They need a guesstimate of the resources we would need to bump without
testing. Off the top of my head we can increase the heap and the native
transport value. Any other parameters I should be concerned about?
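
For reference, the two knobs mentioned above live in cassandra.yaml and
cassandra-env.sh. A minimal sketch with made-up values (illustrative, not
recommendations), assuming a stock 2.x/3.x-era install; as the reply below
notes, any such change should still be validated on one node before rolling it
out.

    # cassandra.yaml -- illustrative value only (the commented default is 128)
    native_transport_max_threads: 256    # max threads serving CQL client requests

    # cassandra-env.sh -- heap size (set in jvm.options on newer versions)
    MAX_HEAP_SIZE="12G"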


Re: 90 million reads

2018-08-14 Thread kurt greaves
Not a great idea to make config changes without testing. For a lot of changes,
though, you can make the change on one node and measure whether there is an
improvement.

You'd probably be best to add nodes (double should be sufficient), do
tuning and testing afterwards, and then decommission a few nodes if you can.

On Wed., 15 Aug. 2018, 05:00 Abdul Patel,  wrote:

> Currently our Cassandra prod is an 18-node, 3-DC cluster, and the application
> does 55 million reads per day; we want to add load and make it 90 million reads
> per day. They need a guesstimate of the resources we would need to bump without
> testing. Off the top of my head we can increase the heap and the native
> transport value. Any other parameters I should be concerned about?


Improve data load performance

2018-08-14 Thread Abdul Patel
How can we improve data load performance?


Re: JBOD disk failure

2018-08-14 Thread daemeon reiydelle
You have to explain what you mean by "JBOD". All in one large vdisk?
Separate drives?

At the end of the day, if a device fails in a way that leaves the data housed on
that device (or array) no longer available, that HDFS storage is marked down.
HDFS then needs to create a third replica. Various timers control how long HDFS
waits to see if the device comes back online, but for convenience assume it acts
immediately. Remember that a write goes to a (random) copy of the data, and that
datanode then replicates to the next node, and so forth. The third copy that is
in the process of being created will also receive those delete "updates". Have
you read up on how "deleting" a record works?


*Daemeon C.M. Reiydelle*

*email: daeme...@gmail.com *
*San Francisco 1.415.501.0198/London 44 020 8144 9872/Skype
daemeon.c.m.reiydelle*



On Tue, Aug 14, 2018 at 6:10 AM Christian Lorenz <
christian.lor...@webtrekk.com> wrote:

> Hi,
>
>
>
> given a cluster with RF=3 and CL=LOCAL_ONE and application is deleting
> data, what happens if the nodes are setup with JBOD and one disk fails? Do
> I get consistent results while the broken drive is replaced and a nodetool
> repair is running on the node with the replaced drive?
>
>
>
> Kind regards,
>
> Christian
>


Re: Improve data load performance

2018-08-14 Thread @Nandan@
Bro, please explain your question in as much detail as possible.
This is not a single-line Q&A session where we will be able to understand your
in-depth queries from a single line.
For a better and more suitable reply, please ask a question and elaborate on
what steps you have taken, what issue you are getting, and so on.

I hope I am making this clear. Don't take it personally.

Thanks

On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel  wrote:

> How can we improve data load performance?


Re: JBOD disk failure

2018-08-14 Thread Jeff Jirsa
Depends on version

For versions without the fix from CASSANDRA-6696, the only safe option on 
single-disk failure is to stop and replace the whole instance. This is 
important because in older versions of Cassandra you could have data in one 
sstable and a tombstone shadowing it on another disk, possibly well past 
gc_grace_seconds. On disk failure in this scenario, if the disk holding 
the tombstone is lost, repair will propagate the (deleted/resurrected) data to 
the other replicas, which probably isn’t what you want to happen.

With 6696, you should be safe to replace the disk and run repair - 6696 keeps 
the data for a given token range all on the same disk, so the resurrection 
problem is solved. 


-- 
Jeff Jirsa


> On Aug 14, 2018, at 6:10 AM, Christian Lorenz  
> wrote:
> 
> Hi,
>  
> given a cluster with RF=3 and CL=LOCAL_ONE and application is deleting data, 
> what happens if the nodes are setup with JBOD and one disk fails? Do I get 
> consistent results while the broken drive is replaced and a nodetool repair 
> is running on the node with the replaced drive?
>  
> Kind regards,
> Christian
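
For concreteness, "JBOD" in this thread means listing one data directory per
physical disk in cassandra.yaml, with disk_failure_policy controlling how the
node reacts when one of them fails. A minimal sketch with made-up mount points
(the CASSANDRA-6696 behaviour described above, which pins each token range to a
single data directory, shipped in 3.2):

    # cassandra.yaml -- one data directory per physical disk (paths are made up)
    data_file_directories:
        - /mnt/disk1/cassandra/data
        - /mnt/disk2/cassandra/data
        - /mnt/disk3/cassandra/data

    # Reaction to a failed data disk:
    #   stop        - shut down gossip and client transports (default)
    #   best_effort - stop using the failed directory, keep serving from the rest
    #   die         - shut down and kill the JVM
    disk_failure_policy: stop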


Re: JBOD disk failure

2018-08-14 Thread kurt greaves
If that disk held important data in the system tables, however, you might have
some trouble and need to replace the entire instance anyway.

On 15 August 2018 at 12:20, Jeff Jirsa  wrote:

> Depends on version
>
> For versions without the fix from Cassandra-6696, the only safe option on
> single disk failure is to stop and replace the whole instance - this is
> important because in older versions of Cassandra, you could have data in
> one sstable, a tombstone shadowing it in another disk, and it could be very
> far behind gc_grace_seconds. On disk failure in this scenario, if the disk
> holding the tombstone is lost, repair will propagate the
> (deleted/resurrected) data to the other replicas, which probably isn’t what
> you want to happen.
>
> With 6696, you should be safe to replace the disk and run repair - 6696
> will keep data for a given token range all on the same disks, so the
> resurrection problem is solved.
>
>
> --
> Jeff Jirsa
>
>
> On Aug 14, 2018, at 6:10 AM, Christian Lorenz <
> christian.lor...@webtrekk.com> wrote:
>
> Hi,
>
>
>
> given a cluster with RF=3 and CL=LOCAL_ONE and application is deleting
> data, what happens if the nodes are setup with JBOD and one disk fails? Do
> I get consistent results while the broken drive is replaced and a nodetool
> repair is running on the node with the replaced drive?
>
>
>
> Kind regards,
>
> Christian
>
>