Delete System_Traces Table

2018-03-19 Thread shalom sagges
Hi All,

I accidentally created a test table on the system_traces keyspace.

When I tried to drop the table with the Cassandra user, I got the following
error:
*Unauthorized: Error from server: code=2100 [Unauthorized] message="Cannot
DROP "*

Is there a way to drop this table permanently?

Thanks!


Re: Delete System_Traces Table

2018-03-19 Thread shalom sagges
Yes, that's correct.

I'd definitely like to keep the default tables.

On Mon, Mar 19, 2018 at 4:10 PM, Rahul Singh 
wrote:

> I think he just wants to delete the test table not the whole keyspace. Is
> that correct?
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Mar 19, 2018, 9:08 AM -0500, Chris Lohfink , wrote:
>
> No.
>
> Why do you want to? If you don't use tracing they will be empty, and if
> you were able to drop them you would no longer be able to use tracing for
> debugging.
>
> Chris
>
> On Mar 19, 2018, at 7:52 AM, shalom sagges  wrote:
>
> Hi All,
>
> I accidentally created a test table on the system_traces keyspace.
>
> When I tried to drop the table with the Cassandra user, I got the
> following error:
> *Unauthorized: Error from server: code=2100 [Unauthorized] message="Cannot
> DROP "*
>
> Is there a way to drop this table permanently?
>
> Thanks!
>
>
>


Re: Delete System_Traces Table

2018-03-19 Thread shalom sagges
That's weird... I'm using 3.0.12, so I should've still been able to drop
it, no?

Also, if I intend to upgrade to version 3.11.2, will the existence of the
table cause any issues?

Thanks!

On Mon, Mar 19, 2018 at 4:30 PM, Chris Lohfink  wrote:

> Oh I misread original, I see.
>
> With https://issues.apache.org/jira/browse/CASSANDRA-13813 you won't be
> able to drop the table, but it would be worth a ticket to prevent creation
> in those keyspaces, or to allow some sort of override if creation is allowed.
>
> Chris
>
>
> On Mar 19, 2018, at 9:15 AM, shalom sagges  wrote:
>
> Yes, that's correct.
>
> I'd definitely like to keep the default tables.
>
> On Mon, Mar 19, 2018 at 4:10 PM, Rahul Singh  > wrote:
>
>> I think he just wants to delete the test table not the whole keyspace. Is
>> that correct?
>>
>> --
>> Rahul Singh
>> rahul.si...@anant.us
>>
>> Anant Corporation
>>
>> On Mar 19, 2018, 9:08 AM -0500, Chris Lohfink ,
>> wrote:
>>
>> No.
>>
>> Why do you want to? If you don't use tracing they will be empty, and if
>> you were able to drop them you would no longer be able to use tracing for
>> debugging.
>>
>> Chris
>>
>> On Mar 19, 2018, at 7:52 AM, shalom sagges 
>> wrote:
>>
>> Hi All,
>>
>> I accidentally created a test table on the system_traces keyspace.
>>
>> When I tried to drop the table with the Cassandra user, I got the
>> following error:
>> *Unauthorized: Error from server: code=2100 [Unauthorized]
>> message="Cannot DROP "*
>>
>> Is there a way to drop this table permanently?
>>
>> Thanks!
>>
>>
>>
>
>


Re: Delete System_Traces Table

2018-03-19 Thread shalom sagges
Thanks a lot Chris and Rahul!

On Mon, Mar 19, 2018 at 5:54 PM, Chris Lohfink  wrote:

> traces and auth in that version have a whitelist of tables that can be
> dropped (legacy auth tables).
>
> https://github.com/apache/cassandra/blob/cassandra-3.0.12/src/java/org/apache/cassandra/service/ClientState.java#L367
>
> It does make sense to allow CREATEs in the distributed keyspaces, mostly
> because of auth. That way, if the auth tables are changed in a later version,
> you can pre-prime them before an upgrade. It might be a bit of an overstep in
> protecting users from themselves, but it doesn't hurt anything to have the
> table there. Just ignore it and its existence will not cause any issues.
>
> Chris
>
>
> On Mar 19, 2018, at 10:27 AM, shalom sagges 
> wrote:
>
> That's weird... I'm using 3.0.12, so I should've still been able to drop
> it, no?
>
> Also, if I intend to upgrade to version 3.11.2, will the existence of the
> table cause any issues?
>
> Thanks!
>
> On Mon, Mar 19, 2018 at 4:30 PM, Chris Lohfink  wrote:
>
>> Oh I misread original, I see.
>>
>> With https://issues.apache.org/jira/browse/CASSANDRA-13813 you won't be
>> able to drop the table, but it would be worth a ticket to prevent creation
>> in those keyspaces, or to allow some sort of override if creation is allowed.
>>
>> Chris
>>
>>
>> On Mar 19, 2018, at 9:15 AM, shalom sagges 
>> wrote:
>>
>> Yes, that's correct.
>>
>> I'd definitely like to keep the default tables.
>>
>> On Mon, Mar 19, 2018 at 4:10 PM, Rahul Singh <
>> rahul.xavier.si...@gmail.com> wrote:
>>
>>> I think he just wants to delete the test table not the whole keyspace.
>>> Is that correct?
>>>
>>> --
>>> Rahul Singh
>>> rahul.si...@anant.us
>>>
>>> Anant Corporation
>>>
>>> On Mar 19, 2018, 9:08 AM -0500, Chris Lohfink ,
>>> wrote:
>>>
>>> No.
>>>
>>> Why do you want to? If you don't use tracing they will be empty, and if
>>> you were able to drop them you would no longer be able to use tracing for
>>> debugging.
>>>
>>> Chris
>>>
>>> On Mar 19, 2018, at 7:52 AM, shalom sagges 
>>> wrote:
>>>
>>> Hi All,
>>>
>>> I accidentally created a test table on the system_traces keyspace.
>>>
>>> When I tried to drop the table with the Cassandra user, I got the
>>> following error:
>>> *Unauthorized: Error from server: code=2100 [Unauthorized]
>>> message="Cannot DROP "*
>>>
>>> Is there a way to drop this table permanently?
>>>
>>> Thanks!
>>>
>>>
>>>
>>
>>
>
>


Re: compaction stuck at 99.99%

2018-03-21 Thread shalom sagges
If the problem is recurring, then you might have a corrupted SSTable.
Check the system log. If a certain file is corrupted, you'll find it.

grep -i corrupt /system.log*


On Wed, Mar 21, 2018 at 2:18 PM, Jerome Basa  wrote:

> hi,
>
> when i run `nodetool compactionstats` there’s this one compaction
> that’s stuck at 99.99% and the CPU load on the node is high compared
> to other nodes. i tried stopping the compaction but nothing happens.
> aside from restarting cassandra; what else can be done with this
> issue? thanks
>
>
> $ nodetool version
> ReleaseVersion: 3.0.14
>
>
> $ nodetool compactionstats -H
> pending tasks: 1
>  id   compaction type
> keyspacetable   completed  totalunit
> progress
>6fb294d0-264a-11e8-ad75-b98b064c302bCompaction
> some_keyspace   some_table90.42 MB   90.43 MB   bytes
>99.99%
>
>
>
> regards,
> -jerome
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


How to Protect Tracing Requests From Client Side

2018-03-22 Thread shalom sagges
Hi All,

Is there a way to protect C* on the server side from tracing commands that
are executed from clients?

Thanks!


Re: How to Protect Tracing Requests From Client Side

2018-03-22 Thread shalom sagges
Thanks a lot Rahul! :-)

On Thu, Mar 22, 2018 at 8:03 PM, Rahul Singh 
wrote:

> Execute ‘nodetool settraceprobability 0’ on all nodes. It sets the tracing
> probability to zero percent.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Mar 22, 2018, 11:10 AM -0500, shalom sagges ,
> wrote:
>
> Hi All,
>
> Is there a way to protect C* on the server side from tracing commands that
> are executed from clients?
>
> Thanks!
>
>


Re: How to Protect Tracing Requests From Client Side

2018-03-24 Thread shalom sagges
Thanks Guys!

This really helps!



On Fri, Mar 23, 2018 at 7:10 AM, Mick Semb Wever 
wrote:

> Is there a way to protect C* on the server side from tracing commands that
>> are executed from clients?
>>
>
>
> If you really needed a way to completely disable all and any possibility
> of tracing you could start each C* node with tracing switched to a noop
> implementation.
>
> eg, add to the jvm.options file the line
>
> -Dcassandra.custom_tracing_class=somepackage.NoOpTracing
>
>
> while also putting into each $CASSANDRA_HOME/lib/ a jar file containing
> this NoOpTracing class…
>
> ```
> package somepackage;
>
> import java.net.InetAddress;
> import java.nio.ByteBuffer;
> import java.util.Map;
> import java.util.UUID;
> import org.apache.cassandra.tracing.*;
> import org.apache.cassandra.utils.FBUtilities;
>
> /** Starting Cassandra with '-Dcassandra.custom_tracing_class=somepackage.NoOpTracing'
>  * will forcibly disable all tracing.
>  *
>  * This can be useful in defensive environments.
>  */
> public final class NoOpTracing extends Tracing {
>
> @Override
> protected void stopSessionImpl() {}
>
> @Override
> public TraceState begin(String request, InetAddress client,
> Map<String, String> parameters) {
> return NoOpTraceState.INSTANCE;
> }
>
> @Override
> protected TraceState newTraceState(InetAddress coordinator, UUID
> sessionId, TraceType traceType) {
> return NoOpTraceState.INSTANCE;
> }
>
> @Override
> public void trace(ByteBuffer sessionId, String message, int ttl) {}
>
> private static class NoOpTraceState extends TraceState {
> private static final NoOpTraceState INSTANCE = new
> NoOpTraceState();
> private NoOpTraceState() {
> super(FBUtilities.getBroadcastAddress(), UUID.randomUUID(),
> TraceType.NONE);
> }
> @Override
> protected void traceImpl(String message) {}
> }
> }
> ```
>
> regards,
> Mick
>
>
> --
> Mick Semb Wever
> Australia
>
> The Last Pickle
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Large Partitions

2018-04-02 Thread shalom sagges
Hi All,

I ran nodetool cfstats (v2.0.14) on a keyspace and found that there are a
few large partitions. I assume that since "Compacted partition maximum
bytes": 802187438 (~800 MB) and since
"Compacted partition mean bytes": 100465 (~100 KB), it means that most
partitions are in okay size and only a few are large. Am I assuming
correctly?

If so, can anyone suggest how to find those large partitions and how to
deal with them? (cfstats output below)

Thanks!


nodetool cfstats keyspace1;

Table: table1
SSTable count: 16
Space used (live), bytes: 453844035587
Space used (total), bytes: 453844035587
Off heap memory used (total), bytes: 440787635
SSTable Compression Ratio: 0.17417149031966575
Number of keys (estimate): 33651200
Memtable cell count: 27966
Memtable data size, bytes: 41698140
Memtable switch count: 199727
Local read count: 86494530
Local read latency: 2.646 ms
Local write count: 247712138
Local write latency: 0.030 ms
Pending tasks: 0
Bloom filter false positives: 2182242
Bloom filter false ratio: 0.02251
Bloom filter space used, bytes: 53135136
Bloom filter off heap memory used, bytes: 53135008
Index summary off heap memory used, bytes: 11560419
Compression metadata off heap memory used, bytes: 376092208
Compacted partition minimum bytes: 373

Compacted partition maximum bytes: 802187438
Compacted partition mean bytes: 100465
Average live cells per slice (last five minutes): 37.0
Average tombstones per slice (last five minutes): 0.0


Re: Large Partitions

2018-04-02 Thread shalom sagges
Thanks Ali!

I use a 13 months TTL on this table. I guess I need to remodel this table.
And I'll definitely try this tool.



On Tue, Apr 3, 2018 at 1:28 AM, Ali Hubail  wrote:

> system.log should show you some warnings about wide rows. Do a grep on
> system.log for 'Writing large partition'. The message could be different for
> the C* version you're using though. Plus, this doesn't show you all of the
> large partitions.
>
> There is a nice tool that analyzes sstables and can show the large
> partitions:
> https://github.com/tolbertam/sstable-tools
>
>
> By "how to deal with them?" it depends. If you don't need those
> partitions then you can delete them. You can also use TTL if it fits you or
> remodel your table to only hold upto 100k rows or 100mb per partition
> (whichever comes first). If you're going to remodel the table, aim for much
> less than 100k/100mb per partition.
>
> *Ali Hubail*
>
>
>
> On 04/02/2018 03:57 PM, shalom sagges wrote to user@cassandra.apache.org:
>
>
>
>
> Hi All,
>
> I ran nodetool cfstats (v2.0.14) on a keyspace and found that there are a
> few large partitions. I assume that since "Compacted partition maximum
> bytes": 802187438 (~800 MB) and since
> "Compacted partition mean bytes": 100465 (~100 KB), it means that most
> partitions are in okay size and only a few are large. Am I assuming
> correctly?
>
> If so, can anyone suggest how to find those large partitions and how to
> deal with them? (cfstats output below)
>
> Thanks!
>
>
> nodetool cfstats keyspace1;
>
> Table: table1
> SSTable count: 16
> Space used (live), bytes: 453844035587
> Space used (total), bytes: 453844035587
> Off heap memory used (total), bytes: 440787635
> SSTable Compression Ratio: 0.17417149031966575
> Number of keys (estimate): 33651200
> Memtable cell count: 27966
> Memtable data size, bytes: 41698140
> Memtable switch count: 199727
> Local read count: 86494530
> Local read latency: 2.646 ms
> Local write count: 247712138
> Local write latency: 0.030 ms
> Pending tasks: 0
> Bloom filter false positives: 2182242
> Bloom filter false ratio: 0.02251
> Bloom filter space used, bytes: 53135136
> Bloom filter off heap memory used, bytes: 53135008
> Index summary off heap memory used, bytes: 11560419
> Compression metadata off heap memory used, bytes: 376092208
> Compacted partition minimum bytes: 373
>
> Compacted partition maximum bytes: 802187438
> Compacted partition mean bytes: 100465
> Average live cells per slice (last five minutes): 37.0
> Average tombstones per slice (last five minutes): 0.0
>
>


Text or....

2018-04-04 Thread shalom sagges
Hi All,

A certain application is writing ~55,000 characters for a single row. Most
of these characters are written to one column with the "text" data type.

This looks insanely large for one row.
Would you suggest to change the data type from "text" to BLOB or any other
option that might fit this scenario?

Thanks!


Re: Text or....

2018-04-04 Thread shalom sagges
Thanks DuyHai!

I'm using the default table compression. Is there anything else I should
look into?
Regarding the table compression, I understand that for write heavy tables,
it's best to keep the default and not compress it further. Have I
understood correctly?
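
For reference, this is roughly what the default table compression looks like
when spelled out explicitly (a sketch assuming 3.0+ option names; the
keyspace/table names are placeholders):

-- my_keyspace.my_table is a placeholder; these are the usual defaults
ALTER TABLE my_keyspace.my_table
WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 64};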

On Wed, Apr 4, 2018 at 3:28 PM, DuyHai Doan  wrote:

> Compress it and store it as a blob.
> Unless you ever need to index it, but I guess even with SASI, indexing such
> a huge text block is not a good idea.
>
> On Wed, Apr 4, 2018 at 2:25 PM, shalom sagges 
> wrote:
>
>> Hi All,
>>
>> A certain application is writing ~55,000 characters for a single row.
>> Most of these characters are entered to one column with "text" data type.
>>
>> This looks insanely large for one row.
>> Would you suggest to change the data type from "text" to BLOB or any
>> other option that might fit this scenario?
>>
>> Thanks!
>>
>
>


Dropped Mutations

2018-04-18 Thread shalom sagges
Hi All,

I have a 44 node cluster (22 nodes on each DC).
Each node has 24 cores and 130 GB RAM, 3 TB HDDs.
Version 2.0.14 (soon to be upgraded)
~10K writes per second per node.
Heap size: 8 GB max, 2.4 GB newgen

I deployed Reaper and GC started to increase rapidly. I'm not sure if it's
because there was a lot of inconsistency in the data, but I decided to
increase the heap to 16 GB and new gen to 6 GB. I increased the max tenure
from 1 to 5.

I tested on a canary node and everything was fine but when I changed the
entire DC, I suddenly saw a lot of dropped mutations in the logs on most of
the nodes. (Reaper was not running on the cluster yet but a manual repair
was running).

Can the heap increment cause lots of dropped mutations?
When is a mutation considered as dropped? Is it during flush? Is it during
the write to the commit log or memtable?

Thanks!


Re: Dropped Mutations

2018-04-19 Thread Shalom Sagges
Thanks a lot Hitesh!

I'll try to re-tune the heap to a lower level


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Thu, Apr 19, 2018 at 12:42 AM, hitesh dua  wrote:

> Hi ,
>
> I'd recommend tuning your heap size further (preferably lower), as a large
> heap size can lead to long garbage collection pauses, also known as
> stop-the-world events. A pause occurs when a region of memory is full and
> the JVM needs to make space to continue. During a pause, all operations are
> suspended. Because a pause affects networking, the node can appear as down
> to other nodes in the cluster. Additionally, any SELECT and INSERT
> statements will wait, which increases read and write latencies.
>
> Any pause of more than a second, or multiple pauses within a second that
> add up to a large fraction of that second, should be avoided. The basic
> cause of the problem is that the rate of data stored in memory outpaces the
> rate at which data can be removed.
>
> MUTATION: If a write message is processed after its timeout
> (write_request_timeout_in_ms), it either sent a failure to the client or it
> met its requested consistency level and will rely on hinted handoff and
> read repairs to do the mutation if it succeeded.
>
> Another possible cause of the issue could be your HDDs, as they could also
> be a bottleneck.
>
> *MAX_HEAP_SIZE*
> The recommended maximum heap size depends on which GC is used:
>
> Hardware setup                                           | Recommended MAX_HEAP_SIZE
> Older computers                                          | Typically 8 GB
> CMS for newer computers (8+ cores) with up to 256 GB RAM | No more than 14 GB
>
>
> Thanks,
> Hitesh dua
> hiteshd...@gmail.com
>
> On Wed, Apr 18, 2018 at 10:07 PM, shalom sagges 
> wrote:
>
>> Hi All,
>>
>> I have a 44 node cluster (22 nodes on each DC).
>> Each node has 24 cores and 130 GB RAM, 3 TB HDDs.
>> Version 2.0.14 (soon to be upgraded)
>> ~10K writes per second per node.
>> Heap size: 8 GB max, 2.4 GB newgen
>>
>> I deployed Reaper and GC started to increase rapidly. I'm not sure if
>> it's because there was a lot of inconsistency in the data, but I decided to
>> increase the heap to 16 GB and new gen to 6 GB. I increased the max tenure
>> from 1 to 5.
>>
>> I tested on a canary node and everything was fine but when I changed the
>> entire DC, I suddenly saw a lot of dropped mutations in the logs on most of
>> the nodes. (Reaper was not running on the cluster yet but a manual repair
>> was running).
>>
>> Can the heap increment cause lots of dropped mutations?
>> When is a mutation considered as dropped? Is it during flush? Is it
>> during the write to the commit log or memtable?
>>
>> Thanks!
>>
>>
>>
>>
>



Re: Does LOCAL_ONE still replicate data?

2018-05-08 Thread shalom sagges
 It's advisable to set the RF to 3 regardless of the consistency level.

If using RF=1, Read CL=LOCAL_ONE and a node goes down in the local DC, you
will not be able to read data related to this node until it goes back up.
For writes and CL=LOCAL_ONE, the write will fail (if it falls on the token
ranges of the downed node).
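
If you do raise the RF, it would be something along these lines (keyspace and
DC names are placeholders), followed by a full repair so the existing data is
streamed to its new replicas:

-- keyspace and DC names below are placeholders
ALTER KEYSPACE my_keyspace
WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};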



On Tue, May 8, 2018 at 3:05 PM, Lucas Benevides  wrote:

> Yes, but remind that there is Write Consistency and Read Consistency.
> To prevent the reads from reaching the other DC, you should set the Read
> Consistency LOCAL_ONE.
> As Hannu Kroger said, the LOCAL_ONE may be enough to you but maybe not if
> you want to be sure that your data was written also in another DC.
>
> Lucas B. Dias
>
>
> 2018-05-08 7:26 GMT-03:00 Hannu Kröger :
>
>> Writes are always replicated to all nodes (if they are online).
>>
>> LOCAL_ONE in writes just means that the client will get an “OK” for the
>> write only after at least one node in the local datacenter has acknowledged
>> that the write is done.
>>
>> If all local replicas are offline, then the write will fail even if it
>> gets written in your other DC.
>>
>> Hannu
>>
>>
>> On 8 May 2018, at 13:24, Jakub Lida  wrote:
>>
>> Hi,
>>
>> I want to add a new DC to an existing cluster (RF=1 per DC).
>> Will setting consistency to LOCAL_ONE on all machines make it still
>> replicate write requests sent to online DCs to all DCs (including the new
>> one being rebuilt) and only isolate read requests from reaching the new DC?
>> That is basically what I want to accomplish.
>>
>> Thanks in advance, Jakub
>>
>>
>>
>


Re: saving distinct data in cassandra result in many tombstones

2018-06-19 Thread shalom sagges
 1. How to use sharding partition key in a way that partitions end up in
different nodes?
You could, for example, create a table with a bucket column added to the
partition key:
CREATE TABLE distinct_by_hour (    -- "distinct" is a reserved word in CQL, so the table needs another (or a quoted) name
    hourNumber int,
    bucket int,                    -- could be a 5-minute bucket, for example
    key text,
    distinctValue bigint,
    PRIMARY KEY ((hourNumber, bucket), key)
);
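
A read for a given hour would then address each bucket explicitly (iterating
over the buckets on the client side); a sketch, with made-up hour and bucket
values:

-- hour and bucket values are made up for illustration
SELECT key, distinctValue
FROM distinct_by_hour
WHERE hourNumber = 425000 AND bucket = 3;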

2. if i set gc_grace_seconds to 0, would it replace the row at memtable
(not saving repeated rows in sstables) or it would be done at first
compaction?
Overlapping rows in the memtables are merged regardless of the
gc_grace_seconds period. Setting gc_grace_seconds to 0 will immediately
evict tombstones during compaction, but it will effectively disable hint
delivery for that table. You should keep gc_grace_seconds larger than the
hint window (max_hint_window_in_ms).
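
For example, with the default hint window of three hours
(max_hint_window_in_ms: 10800000), that means keeping gc_grace_seconds above
10800. A sketch, using the table from above:

ALTER TABLE distinct_by_hour WITH gc_grace_seconds = 14400;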



On Tue, Jun 19, 2018 at 7:23 AM, onmstester onmstester 
wrote:

> Two other questions:
> 1. How to use sharding partition key in a way that partitions end up in
> different nodes?
> 2. if i set gc_grace_seconds to 0, would it replace the row at memtable
> (not saving repeated rows in sstables) or it would be done at first
> compaction?
>
> Sent using Zoho Mail 
>
>
>  On Tue, 19 Jun 2018 08:16:28 +0430 *onmstester onmstester
> >* wrote 
>
> Can i set gc_grace_seconds to 0 in this case? because reappearing deleted
> data has no impact on my Business Logic, i'm just either creating a new row
> or replacing the exactly same row.
>
> Sent using Zoho Mail 
>
>
>  On Wed, 13 Jun 2018 03:41:51 +0430 *Elliott Sims
> >* wrote 
>
>
>
> If this is data that expires after a certain amount of time, you probably
> want to look into using TWCS and TTLs to minimize the number of tombstones.
> Decreasing gc_grace_seconds then compacting will reduce the number of
> tombstones, but at the cost of potentially resurrecting deleted data if the
> table hasn't been repaired during the grace interval.  You can also just
> increase the tombstone thresholds, but the queries will be pretty
> expensive/wasteful.
>
> On Tue, Jun 12, 2018 at 2:02 AM, onmstester onmstester <
> onmstes...@zoho.com> wrote:
>
>
> Hi,
>
> I needed to save a distinct value for a key in each hour, the problem with
> saving everything and computing distincts in memory is that there
> are too many repeated data.
> Table schema:
> Table distinct(
> hourNumber int,
> key text,
> distinctValue long
> primary key (hourNumber)
> )
>
> I want to retrieve distinct count of all keys in a specific hour and using
> this data model it would be achieved by reading a single partition.
> The problem : i can't read from this table, system.log indicates that more
> than 100K tombstones read and no live data in it. The gc_grace time is
> the default (10 days), so i thought decreasing it to 1 hour and run
> compaction, but is this a right approach at all? i mean the whole idea of
> replacing
> some millions of rows. each  10 times in a partition again and again that
> creates alot of tombstones just to achieve distinct behavior?
>
> Thanks in advance
>
> Sent using Zoho Mail 
>
>
>
>


Re: Cassandra didn't order data according to clustering order

2018-07-15 Thread shalom sagges
The clustering column is ordered per partition key.

So if for example I create the following table:
create table desc_test (
   id text,
   name text,
   PRIMARY KEY (id,name)
) WITH CLUSTERING ORDER BY (name DESC );


I insert a few rows:

insert into desc_test (id , name ) VALUES ( 'abc', 'abc');
insert into desc_test (id , name ) VALUES ( 'abc', 'bcd');
insert into desc_test (id , name ) VALUES ( 'abc', 'aaa');
insert into desc_test (id , name ) VALUES ( 'fgh', 'aaa');
insert into desc_test (id , name ) VALUES ( 'fgh', 'bcd');
insert into desc_test (id , name ) VALUES ( 'fgh', 'abc');


And then read:
select * from desc_test;

 id  | name
-----+------
 fgh |  bcd
 fgh |  abc
 fgh |  aaa
 abc |  bcd
 abc |  abc
 abc |  aaa

(6 rows)


You can see that the data is properly ordered in descending mode, but only
*within each partition key*.
So in order to achieve what you want, you will have to add the relevant
partition key to each select query.
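
For example, with the table above, limiting the query to a single partition
returns that partition's rows in the defined descending order:

select * from desc_test where id = 'abc';

 id  | name
-----+------
 abc |  bcd
 abc |  abc
 abc |  aaa

(3 rows)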

Hope this helps


On Sun, Jul 15, 2018 at 2:16 PM, Soheil Pourbafrani 
wrote:

> I created table using the command:
> CREATE TABLE correlated_data (
> processing_timestamp bigint,
> generating_timestamp bigint,
> data text,
> PRIMARY KEY (processing_timestamp, generating_timestamp)
> ) WITH CLUSTERING ORDER BY (generating_timestamp DESC);
>
>
> When I get data using the command :
> SELECT * FROM correlated_data LIMIT 1 ;
>
> I expect it return the row with the biggest field "generating_timestamp",
> but I got the same row every time I run the query, while row with bigger "
> generating_timestamp" exists. What's the problem?
>


Re: Stumped By Cassandra delays

2018-07-22 Thread shalom sagges
Hi Gareth,

If you're using batches for multiple partitions, this may be the root cause
you've been looking for.

https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/
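
To illustrate the difference (the keyspace, table and values below are made
up, with user_id assumed to be the partition key): a batch whose statements
all target the same partition is cheap, while the same statements spread
across many partition keys force the coordinator to contact many replica sets.

-- hypothetical table: events(user_id text, event_time timestamp, payload text,
--                            PRIMARY KEY (user_id, event_time))
BEGIN BATCH
  INSERT INTO my_keyspace.events (user_id, event_time, payload)
    VALUES ('user1', '2018-07-20 10:00:00', 'a');
  INSERT INTO my_keyspace.events (user_id, event_time, payload)
    VALUES ('user1', '2018-07-20 10:00:01', 'b');
APPLY BATCH;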

If batches are optimally used and only one node is misbehaving, check if
NTP on the node is properly synced.

Hope this helps!


On Sat, Jul 21, 2018 at 9:31 PM, Gareth Collins 
wrote:

> Hello,
>
> We are running Cassandra 2.1.14 in AWS, with c5.4xlarge machines
> (initially these were m4.xlarge) for our cassandra servers and
> m4.xlarge for our application servers. On one of our clusters having
> problems we have 6 C* nodes and 6 AS nodes (two nodes for C*/AS in
> each availability zone).
>
> In the deployed application it seems to be a common use-case to one of
> the following. These use cases are having periodic errors:
> (1) Copy one Cassandra table to another table using the application server.
> (2) Export from a Cassandra table to file using the application server.
>
> The application server is reading from the table via token range, the
> token range queries being calculated to ensure the whole token range
> for a query falls on the same node. i.e. the query looks like this:
>
> select * from  where token(key) > ? and token(key) <= ?
>
> This was probably initially done on the assumption that the driver
> would be able to figure out which nodes contained the data. As we
> realized now the driver only supports routing to the right node if the
> partition key is defined in the where clause.
>
> When we do the read we are doing a lot of queries in parallel to
> maximize performance. I believe when the copy is being run there are
> currently 5 threads per machine doing the copy for a max of 30
> concurrent read requests across the cluster.
>
> Specifically these tasks been periodically having a few of these errors:
>
> INFO  [ScheduledTasks:1] 2018-07-13 20:03:20,124
> MessagingService.java:929 - REQUEST_RESPONSE messages were dropped in
> last 5000 ms: 1 for internal timeout and 0 for cross node timeout
>
> Which are causing errors in the read by token range queries.
>
> Running "nodetool settraceprobability 1" and running the test when
> failing we could see that this timeout would occur when using a
> coordinator on the read query (i.e. the co-ordinator sent the message
> but didn't get a response to the query from the other node within the
> time limit). We were seeing these timeouts periodically even if we set
> the timeouts to 60 seconds.
>
> As I mentioned at the beginning we had initially been using m4.xlarge
> for our Cassandra servers. After discussion with AWS it was suggested
> that we could be hitting performance limits (i.e. either network or
> disk - I believe more likely network as I didn't see the disk getting
> hit very hard) so we upgraded the Cassandra servers and everything was
> fine for a while.
>
> But then the problems started to re-occur recently...pretty
> consistently failing on these copy or export jobs running overnight.
> Having looked at resource usage statistics graphs it appeared that the
> C* servers were not heavily loaded at all (the app servers were being
> maxed out) and I did not see any significant garbage collections in
> the logs that could explain the delays.
>
> As a last resort I decided to turn up the logging on the server and
> client, datastax client set to debug and server set to the following
> logs via nodetool...the goal being to maximize logging while cutting
> out the very verbose stuff (e.g. Message.java appears to print out the
> whole message in 2.1.14 when put into debug -> it looks like that was
> moved to trace in a later 2.1.x release):
>
> bin/nodetool setlogginglevel org.apache.cassandra.tracing.Tracing INFO
> bin/nodetool setlogginglevel org.apache.cassandra.transport.Message INFO
> bin/nodetool setlogginglevel org.apache.cassandra.db.ColumnFamilyStore
> DEBUG
> bin/nodetool setlogginglevel org.apache.cassandra.gms.Gossiper DEBUG
> bin/nodetool setlogginglevel
> org.apache.cassandra.db.filter.SliceQueryFilter DEBUG
> bin/nodetool setlogginglevel
> org.apache.cassandra.service.pager.AbstractQueryPager INFO
> bin/nodetool setlogginglevel org.apache.cassandra TRACE
>
> Of course when we did this (as part of turning on the logging the
> application servers were restarted) the problematic export to file
> jobs which had failed every time for the last week succeeded and ran
> much faster than they had run usually (47 minutes vs 1 1/2 hours) so I
> decided to look for the biggest delay (which turned out to be ~9
> seconds and see what I could find in the log - outside of this time,
> the response times were up to perhaps 20ms). Here is what I found:
>
> (1) Only one Cassandra node had delays at a time.
>
> (2) On the Cassandra node that did had delays there was no significant
> information from the GCInspector (the system stopped processing client
> requests between 05:32:33 - 05:32:43). If anything it confirmed my
> belief that the system was lightly loaded
>
> D

User Defined Types?

2018-08-05 Thread shalom sagges
Hi All,

Are there any known caveats for User Defined Types in Cassandra (version
3.0)?
One of our teams wants to start using them. I wish to assess it and see if
it'd be wise (or not) to refrain from using UDTs.
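
For context, the kind of schema in question would look roughly like this
(keyspace, type and table names are hypothetical); note that in 3.0 the UDT
column has to be declared frozen:

-- hypothetical keyspace/type/table names
CREATE TYPE my_keyspace.address (
    street text,
    city text,
    zip_code text
);

CREATE TABLE my_keyspace.users (
    id uuid PRIMARY KEY,
    name text,
    home_address frozen<address>
);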


Thanks!


Re: User Defined Types?

2018-08-06 Thread shalom sagges
Thanks a lot Anup! :-)



On Mon, Aug 6, 2018 at 5:45 AM, Anup Shirolkar <
anup.shirol...@instaclustr.com> wrote:

> Hi,
>
> Few of the caveats can be found here:
> https://issues.apache.org/jira/browse/CASSANDRA-7423
>
> The JIRA is implemented in version *3.6* and you are on 3.0,
> So you are affected by UDT behaviour (stored as BLOB) mentioned in the
> JIRA.
>
> Cheers,
> Anup
>
> On 5 August 2018 at 23:29, shalom sagges  wrote:
>
>> Hi All,
>>
>> Are there any known caveats for User Defined Types in Cassandra (version
>> 3.0)?
>> One of our teams wants to start using them. I wish to assess it and see
>> if it'd be wise (or not) to refrain from using UDTs.
>>
>>
>> Thanks!
>>
>
>
>
> --
>
> Anup Shirolkar
>
> Consultant
>
> +61 420 602 338
>
> <https://www.instaclustr.com/solutions/managed-apache-kafka/>
>
> <https://www.facebook.com/instaclustr>   <https://twitter.com/instaclustr>
><https://www.linkedin.com/company/instaclustr>
>
> Read our latest technical blog posts here
> <https://www.instaclustr.com/blog/>.
>


Re: Large sstables

2018-09-01 Thread shalom sagges
If there are a lot of droppable tombstones, you could also run User Defined
Compaction on that SSTable (and on others).

This blog post explains it well:
http://thelastpickle.com/blog/2016/10/18/user-defined-compaction.html

On Fri, Aug 31, 2018 at 12:04 AM Mohamadreza Rostami <
mohamadrezarosta...@gmail.com> wrote:

> Hi, Dear Vitali
> The best option for you is to migrate the data to a new table and change the
> partition key pattern for a better distribution of data, so your sstables
> become smaller. But if your data already has a good distribution and is
> really big, you must add a new server to your datacenter. Changing the
> compaction strategy carries some risk.
>
> > On Shahrivar 8, 1397 AP, at 19:54, Jeff Jirsa  wrote:
> >
> > Either of those are options, but there’s also sstablesplit to break it
> up a bit
> >
> > Switching to LCS can be a problem depending on how many sstables
> /overlaps you have
> >
> > --
> > Jeff Jirsa
> >
> >
> >> On Aug 30, 2018, at 8:05 AM, Vitali Dyachuk  wrote:
> >>
> >> Hi,
> >> Some of the sstables got too big 100gb and more so they are not
> compactiong any more so some of the disks are running out of space. I'm
> running C* 3.0.17, RF3 with 10 disks/jbod with STCS.
> >> What are my options? Completely delete all data on this node and rejoin
> it to the cluster, change CS to LCS then run repair?
> >> Vitali.
> >>
> >
> > -
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Read timeouts when performing rolling restart

2018-09-12 Thread shalom sagges
Hi Riccardo,

Does this issue occur when performing a single restart or after several
restarts during a rolling restart (as mentioned in your original post)?
We have a cluster where, when performing a rolling restart, we prefer to wait
~10-15 minutes between restarts because we see an increase in GC for a few
minutes.
If we keep restarting the nodes quickly one after the other, the
applications experience timeouts (probably due to GC and hints).

Hope this helps!

On Thu, Sep 13, 2018 at 2:20 AM Riccardo Ferrari  wrote:

> A little update on the progress.
>
> First:
> Thank you Thomas. I checked the code in the patch and briefly skimmed
> through the 3.0.6 code. Yup it should be fixed.
> Thank you Surbhi. At the moment we don't need authentication as the
> instances are locked down.
>
> Now:
> - Unfortunately the start_native_transport trick does not always work. On
> some nodes it works, on others it doesn't. What do I mean? I still experience
> timeouts and dropped messages during startup.
> - I realized that cutting concurrent_compactors to 1 was not really a
> good idea; the minimum value should be 2, currently testing 4 (that is
> min(n_cores, n_disks)).
> - After raising the compactors to 4 I still see some dropped messages for
> HINT and MUTATIONS. This happens during startup. The reason is "for internal
> timeout". Maybe too many compactors?
>
> Thanks!
>
>
> On Wed, Sep 12, 2018 at 7:09 PM, Surbhi Gupta 
> wrote:
>
>> Another thing to notice is :
>>
>> system_auth WITH replication = {'class': 'SimpleStrategy',
>> 'replication_factor': '1'}
>>
>> system_auth has a replication factor of 1 and even if one node is down it
>> may impact the system because of the replication factor.
>>
>>
>>
>> On Wed, 12 Sep 2018 at 09:46, Steinmaurer, Thomas <
>> thomas.steinmau...@dynatrace.com> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> I remember something that a client using the native protocol gets
>>> notified too early by Cassandra being ready due to the following issue:
>>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-8236
>>>
>>>
>>>
>>> which looks similar, but above was marked as fixed in 2.2.
>>>
>>>
>>>
>>> Thomas
>>>
>>>
>>>
>>> *From:* Riccardo Ferrari 
>>> *Sent:* Mittwoch, 12. September 2018 18:25
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: Read timeouts when performing rolling restart
>>>
>>>
>>>
>>> Hi Alain,
>>>
>>>
>>>
>>> Thank you for chiming in!
>>>
>>>
>>>
>>> I was thinking to perform the 'start_native_transport=false' test as
>>> well and indeed the issue is not showing up. Starting the/a node with
>>> native transport disabled and letting it cool down lead to no timeout
>>> exceptions no dropped messages, simply a crystal clean startup. Agreed it
>>> is a workaround
>>>
>>>
>>>
>>> # About upgrading:
>>>
>>> Yes, I desperately want to upgrade despite is a long and slow task. Just
>>> reviewing all the changes from 3.0.6 to 3.0.17
>>> is going to be a huge pain, top of your head, any breaking change I
>>> should absolutely take care of reviewing ?
>>>
>>>
>>>
>>> # describecluster output: YES they agree on the same schema version
>>>
>>>
>>>
>>> # keyspaces:
>>>
>>> system WITH replication = {'class': 'LocalStrategy'}
>>>
>>> system_schema WITH replication = {'class': 'LocalStrategy'}
>>>
>>> system_auth WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '1'}
>>>
>>> system_distributed WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '3'}
>>>
>>> system_traces WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '2'}
>>>
>>>
>>>
>>>  WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '3'}
>>>
>>>   WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '3'}
>>>
>>>
>>>
>>> # Snitch
>>>
>>> Ec2Snitch
>>>
>>>
>>>
>>> ## About Snitch and replication:
>>>
>>> - We have the default DC and all nodes are in the same RACK
>>>
>>> - We are planning to move to GossipingPropertyFileSnitch configuring the
>>> cassandra-rackdc accortingly.
>>>
>>> -- This should be a transparent change, correct?
>>>
>>>
>>>
>>> - Once switched to GPFS, we plan to move to 'NetworkTopologyStrategy'
>>> with 'us-' DC and replica counts as before
>>>
>>> - Then adding a new DC inside the VPC, but this is another story...
>>>
>>>
>>>
>>> Any concerns here ?
>>>
>>>
>>>
>>> # nodetool status 
>>>
>>> --  Address Load   Tokens   Owns (effective)  Host
>>> ID   Rack
>>> UN  10.x.x.a  177 GB 256  50.3%
>>> d8bfe4ad-8138-41fe-89a4-ee9a043095b5  rr
>>> UN  10.x.x.b152.46 GB  256  51.8%
>>> 7888c077-346b-4e09-96b0-9f6376b8594f  rr
>>> UN  10.x.x.c   159.59 GB  256  49.0%
>>> 329b288e-c5b5-4b55-b75e-fbe9243e75fa  rr
>>> UN  10.x.x.d  162.44 GB  256  49.3%
>>> 07038c11-d200-46a0-9f6a-6e2465580fb1  rr
>>> UN  10.x.x.e174.9 GB   256  50.5%
>>> c35b5d51-2d14-4334-9ffc-726f9dd8a214  rr
>>> UN  10.x.x.f  194.71 GB  256  49.2%
>>> 

Re: Re: High CPU usage on some of the nodes due to message coalesce

2018-10-21 Thread shalom sagges
What takes the most CPU? System or User?
Did you try removing a problematic node and installing a brand new one
(instead of re-adding)?
When you decommissioned these nodes, did the high CPU "move" to other nodes
(probably data model/query issues) or was it completely gone? (server
issues)


On Sun, Oct 21, 2018 at 3:52 PM onmstester onmstester
 wrote:

> I don't think that root cause is related to Cassandra config, because the
> nodes are homogeneous and config for all of them are the same (16GB heap
> with default gc), also mutation counter and Native Transport counter is the
> same in all of the nodes, but only these 3 nodes experiencing 100% CPU
> usage (others have less than 20% CPU usage)
> I even decommissioned these 3 nodes from cluster and re-add them, but
> still the same
> The cluster is OK without these 3 nodes (in a state that these nodes are
> decommissioned)
>
> Sent using Zoho Mail 
>
>
>  Forwarded message 
> From : Chris Lohfink 
> To : 
> Date : Sat, 20 Oct 2018 23:24:03 +0330
> Subject : Re: High CPU usage on some of the nodes due to message coalesce
>  Forwarded message 
>
> 1s young gcs are horrible and likely cause of *some* of your bad metrics.
> How large are your mutations/query results and what gc/heap settings are
> you using?
>
> You can use https://github.com/aragozin/jvm-tools to see the threads
> generating allocation pressure and using the cpu (ttop) and what garbage is
> being created (hh --dead-young).
>
> Just a shot in the dark, I would *guess* you have rather large mutations
> putting pressure on commitlog and heap. G1 with a larger heap might help in
> that scenario to reduce fragmentation and adjust its eden and survivor
> regions to the allocation rate better (but give it a bigger reserve space)
> but theres limits to what can help if you cant change your workload.
> Without more info on schema etc its hard to tell but maybe that can help
> give you some ideas on places to look. It could just as likely be repair
> coordination, wide partition reads, or compactions so need to look more at
> what within the app is causing the pressure to know if its possible to
> improve with settings or if the load your application is producing exceeds
> what your cluster can handle (needs more nodes).
>
> Chris
>
> On Oct 20, 2018, at 5:18 AM, onmstester onmstester <
> onmstes...@zoho.com.INVALID> wrote:
>
> 3 nodes in my cluster have 100% cpu usage and most of it is used by
> org.apache.cassandra.util.coalesceInternal and SepWorker.run?
> The most active threads are the messaging-service-incomming.
> Other nodes are normal, having 30 nodes, using Rack Aware strategy. with
> 10 rack each having 3 nodes. The problematic nodes are configured for one
> rack, on normal write load, system.log reports too many hint message
> dropped (cross node). also there are alot of parNewGc with about 700-1000ms
> and commit log isolated disk, is utilized about 80-90%. on startup of these
> 3 nodes, there are alot of "updateing topology" logs (1000s of them
> pending).
> Using iperf, i'm sure that network is OK
> checking NTPs and mutations on each node, load is balanced among the nodes.
> using apache cassandra 3.11.2
> I can not not figure out the root cause of the problem, although there are
> some obvious symptoms.
>
> Best Regards
>
> Sent using Zoho Mail 
>
>
>
>


Re: Re: High CPU usage on some of the nodes due to message coalesce

2018-10-21 Thread shalom sagges
I guess the code experts could shed more light on
org.apache.cassandra.util.coalesceInternal and SepWorker.run.
I'll just add a few things I can think of:

Any cron or other scheduler running on those nodes?
Lots of Java processes running simultaneously?
Heavy repair continuously running?
Lots of pending compactions?
Is the number of CPU cores the same in all the nodes?
Did you try rebooting one of the nodes?


On Sun, Oct 21, 2018 at 4:55 PM onmstester onmstester
 wrote:

>
> What takes the most CPU? System or User?
>
>
>  most of it is used by org.apache.cassandra.util.coalesceInternal and
> SepWorker.run
>
> Did you try removing a problematic node and installing a brand new one
> (instead of re-adding)?
>
> I did not install a new node, but did remove the problematic node and CPU
> load in all the cluster became normal again
>
> When you decommissioned these nodes, did the high CPU "move" to other
> nodes (probably data model/query issues) or was it completely gone? (server
> issues)
>
> it was completely gone
>
>


Query With Limit Clause

2018-11-05 Thread shalom sagges
Hi All,

If I run for example:
select * from myTable limit 3;

Does Cassandra do a full table scan regardless of the limit?

Thanks!


Re: Query With Limit Clause

2018-11-07 Thread shalom sagges
Thanks a lot for the info :)

On Tue, Nov 6, 2018 at 11:11 AM DuyHai Doan  wrote:

> Cassandra will execute such request using a Partition Range Scan.
>
> See more details here http://www.doanduyhai.com/blog/?p=13191, chapter E
> Cluster Read Path (look at the formula of Concurrency Factor)
>
>
>
> On Tue, Nov 6, 2018 at 8:21 AM shalom sagges 
> wrote:
>
>> Hi All,
>>
>> If I run for example:
>> select * from myTable limit 3;
>>
>> Does Cassandra do a full table scan regardless of the limit?
>>
>> Thanks!
>>
>


Upgrade to v3.11.3

2019-01-16 Thread shalom sagges
Hi All,

I'm about to start a rolling upgrade process from version 2.0.14 to version
3.11.3.
I have a few small questions:

   1. The upgrade process that I know of is from 2.0.14 to 2.1.x (higher
   than 2.1.9 I think) and then from 2.1.x to 3.x. Do I need to upgrade first
   to 3.0.x or can I upgrade directly from 2.1.x to 3.11.3?

   2. Can I run upgradesstables on several nodes in parallel? Is it crucial
   to run it one node at a time?

   3. When running upgradesstables on a node, does that node still serves
   writes and reads?

   4. Can I use open JDK 8 (instead of Oracle JDK) with C* 3.11.3?

   5. Is there a way to speed up the upgradesstables process? (besides
   compaction_throughput)


Thanks!


Re: Upgrade to v3.11.3

2019-01-17 Thread shalom sagges
Thanks a lot Anuj!



On Wed, Jan 16, 2019 at 4:56 PM Anuj Wadehra  wrote:

> Hi Shalom,
>
> Just a suggestion. Before upgrading to 3.11.3 make sure you are not
> impacted by any open crtitical defects especially related to RT which may
> cause data loss e.g.14861.
>
> Please find my response below:
>
> The upgrade process that I know of is from 2.0.14 to 2.1.x (higher than
> 2.1.9 I think) and then from 2.1.x to 3.x. Do I need to upgrade first to
> 3.0.x or can I upgraded directly from 2.1.x to 3.11.3?
>
> Response: Yes, you can upgrade from 2.0.14 to some latest stable version
> of 2.1.x (only 2.1.9+)  and then upgrade to 3.11.3.
>
> Can I run upgradesstables on several nodes in parallel? Is it crucial to
> run it one node at a time?
>
> Response: Yes, you can run in parallel.
>
>
> When running upgradesstables on a node, does that node still serves writes
> and reads?
>
> Response: Yes.
>
>
> Can I use open JDK 8 (instead of Oracle JDK) with C* 3.11.3?
>
> Response: We have not tried but it should be okay. See
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-13916
> .
>
>
> Is there a way to speed up the upgradesstables process? (besides
> compaction_throughput)
>
>
> Response: If clearing the pending compactions caused by rewriting sstables
> is a concern, you can probably also try increasing concurrent compactors.
>
>
>
> Disclaimer: The information provided in above response is my personal
> opinion based on the best of my knowledge and experience. We do
> not take any responsibility and we are not liable for any damage caused by
> actions taken based on above information.
> Thanks
> Anuj
>
>
> On Wed, 16 Jan 2019 at 19:15, shalom sagges
>  wrote:
> Hi All,
>
> I'm about to start a rolling upgrade process from version 2.0.14 to
> version 3.11.3.
> I have a few small questions:
>
>1. The upgrade process that I know of is from 2.0.14 to 2.1.x (higher
>than 2.1.9 I think) and then from 2.1.x to 3.x. Do I need to upgrade first
>to 3.0.x or can I upgraded directly from 2.1.x to 3.11.3?
>
>2. Can I run upgradesstables on several nodes in parallel? Is it
>crucial to run it one node at a time?
>
>3. When running upgradesstables on a node, does that node still serves
>writes and reads?
>
>4. Can I use open JDK 8 (instead of Oracle JDK) with C* 3.11.3?
>
>5. Is there a way to speed up the upgradesstables process? (besides
>compaction_throughput)
>
>
> Thanks!
>
>


Upgrade From 2.0 to 2.1

2019-02-11 Thread shalom sagges
Hi All,

I've successfully upgraded a 2.0 cluster to 2.1 on the way to upgrade to
3.11 (hopefully 3.11.4 if it'd be released very soon).

I have 2 small questions:

   1. Currently the Datastax clients are enforcing Protocol Version 2 to
   prevent mixed cluster issues. Do I now need to enforce Protocol Version 3
   while upgrading from 2.1 to 3.11, or can I still use Protocol Version 2?

   2. After the upgrade, I found that the system table NodeIdInfo has not been
   upgraded, i.e. I still see it in the *-jb-* naming convention. Does this
   mean that this table is obsolete and can be removed?


Thanks!


Re: Upgrade From 2.0 to 2.1

2019-02-11 Thread shalom sagges
Very soon. If not today, it will be up tomorrow. :)
Yayyy, just saw the release of 3.11.4.  :-)

You'll need to go to v3 for 3.11. Congratulations on being aware enough to
do this - advanced upgrade coordination, it's absolutely the right thing to
do, but most people don't know it's possible or useful.
Thanks a lot Jeff for clarifying this.
I really hoped the answer would be different. Now I need to nag our R&D
teams again :-)

Thanks!

On Mon, Feb 11, 2019 at 8:21 PM Michael Shuler 
wrote:

> On 2/11/19 9:24 AM, shalom sagges wrote:
> > I've successfully upgraded a 2.0 cluster to 2.1 on the way to upgrade to
> > 3.11 (hopefully 3.11.4 if it'd be released very soon).
>
> Very soon. If not today, it will be up tomorrow. :)
>
> --
> Michael
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: forgot to run nodetool cleanup

2019-02-14 Thread shalom sagges
Cleanup is a great way to free up disk space.

Just note you might run into
https://issues.apache.org/jira/browse/CASSANDRA-9036 if you use a version
older than 2.0.15.



On Thu, Feb 14, 2019 at 10:20 AM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa  wrote:
>
>> Depending on how bad data resurrection is, you should run it for any host
>> that loses a range. In vnodes, that's usually all hosts.
>>
>> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
>>
>
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
> compaction strategy matter?  Do you mean that after cleanup STCS may pick
> some resulting tables to re-compact them due to the min/max size
> difference, which would not be the case with LCS?
>
>
>> If you're just TTL'ing all data, it may not be worth the effort.
>>
>
> Indeed, but in our case the main reason to scale out is that the nodes are
> running out of disk space, so we really want to get rid of the extra copies.
>
> --
> Alex
>
>


Re: Question on changing node IP address

2019-02-27 Thread shalom sagges
If you're using the PropertyFileSnitch, well... you shouldn't as it's a
rather dangerous and tedious snitch to use

I inherited Cassandra clusters that use the PropertyFileSnitch. It's been
working fine, but you've kinda scared me :-)
Why is it dangerous to use?
If I decide to change the snitch, is it seamless or is there a specific
procedure one must follow?

Thanks!


On Wed, Feb 27, 2019 at 10:08 AM Alexander Dejanovski <
a...@thelastpickle.com> wrote:

> I confirm what Oleksandr said.
> Just stop Cassandra, change the IP, and restart Cassandra.
> If you're using the GossipingPropertyFileSnitch, the node will redeclare
> its new IP through Gossip and that's it.
> If you're using the PropertyFileSnitch, well... you shouldn't as it's a
> rather dangerous and tedious snitch to use. But if you are, it'll require
> to change the file containing all the IP addresses across the cluster.
>
> I've been changing IPs on a whole cluster back in 2.1 this way and it went
> through seamlessly.
>
> Cheers,
>
> On Wed, Feb 27, 2019 at 8:54 AM Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
>> On Wed, Feb 27, 2019 at 4:15 AM wxn...@zjqunshuo.com <
>> wxn...@zjqunshuo.com> wrote:
>>
>>> >After restart with the new address the server will notice it and log a
>>> warning, but it will keep token ownership as long as it keeps the old host
>>> id (meaning it must use the same data directory as before restart).
>>>
>>> Based on my understanding, token range is binded to host id. As long as
>>> host id doesn't change, everything is ok. Besides data directory, any other
>>> thing can lead to host id change? And how host id is caculated? For
>>> example, if I upgrade Cassandra binary to a new version, after restart,
>>> will host id change?
>>>
>>
>> I believe host id is calculated once the new node is initialized and
>> never changes afterwards, even through major upgrades.  It is stored in
>> system keyspace in data directory, and is stable across restarts.
>>
>> --
>> Alex
>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: Question on changing node IP address

2019-02-27 Thread shalom sagges
Thanks for the info Alex!

I read
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSwitchSnitch.html
but still have a few questions:

Our clusters consist of 2 DCs with no rack configuration, RF=3 on
each DC.
In this scenario, if I wish to seamlessly change the snitch with 0
downtime, do I need to add the cassandra-rackdc.properties file, change the
snitch in cassandra.yaml and restart one by one?
Will this method cause problems?

Thanks!


On Wed, Feb 27, 2019 at 12:18 PM Alexander Dejanovski <
a...@thelastpickle.com> wrote:

> You'll be fine with the SimpleSnitch (which shouldn't be used either
> because it doesn't allow a cluster to use multiple datacenters or racks).
> Just change the IP and upon restart the node will redeclare itself in the
> ring. If your node is a seed node, you'll need to update your seed list
> across the cluster.
>
> On Wed, Feb 27, 2019 at 10:52 AM wxn...@zjqunshuo.com <
> wxn...@zjqunshuo.com> wrote:
>
>> I'm using SimpleSnitch. I have only one DC. Is there any problem to
>> follow the below procedure?
>>
>> -Simon
>>
>> *From:* Alexander Dejanovski 
>> *Date:* 2019-02-27 16:07
>> *To:* user 
>> *Subject:* Re: Question on changing node IP address
>>
>> I confirm what Oleksandr said.
>> Just stop Cassandra, change the IP, and restart Cassandra.
>> If you're using the GossipingPropertyFileSnitch, the node will redeclare
>> its new IP through Gossip and that's it.
>> If you're using the PropertyFileSnitch, well... you shouldn't as it's a
>> rather dangerous and tedious snitch to use. But if you are, it'll require
>> to change the file containing all the IP addresses across the cluster.
>>
>> I've been changing IPs on a whole cluster back in 2.1 this way and it
>> went through seamlessly.
>>
>> Cheers,
>>
>> On Wed, Feb 27, 2019 at 8:54 AM Oleksandr Shulgin <
>> oleksandr.shul...@zalando.de> wrote:
>>
>>> On Wed, Feb 27, 2019 at 4:15 AM wxn...@zjqunshuo.com <
>>> wxn...@zjqunshuo.com> wrote:
>>>
 >After restart with the new address the server will notice it and log a
 warning, but it will keep token ownership as long as it keeps the old host
 id (meaning it must use the same data directory as before restart).

 Based on my understanding, token range is binded to host id. As long as
 host id doesn't change, everything is ok. Besides data directory, any other
 thing can lead to host id change? And how host id is caculated? For
 example, if I upgrade Cassandra binary to a new version, after restart,
 will host id change?

>>>
>>> I believe host id is calculated once the new node is initialized and
>>> never changes afterwards, even through major upgrades.  It is stored in
>>> system keyspace in data directory, and is stable across restarts.
>>>
>>> --
>>> Alex
>>>
>>> --
>> -
>> Alexander Dejanovski
>> France
>> @alexanderdeja
>>
>> Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


A Question About Hints

2019-03-04 Thread shalom sagges
Hi All,

Does anyone know what is the most optimal hints configuration (multiple
DCs) in terms of
max_hints_delivery_threads and hinted_handoff_throttle_in_kb?
If it's different for various use cases, is there a rule of thumb I can
work with?

I found this post but it's quite old:
http://www.uberobert.com/bandwidth-cassandra-hinted-handoff/

Thanks!


Re: A Question About Hints

2019-03-04 Thread shalom sagges
Hi Kenneth,

The concern is that in some cases, hints accumulate on nodes, and it takes
a while until they are delivered (multi DCs).
I see that whenever there are a lot of hints in play, like after a rolling
restart, the cluster works harder. That's why I want to decrease the hints
delivery time.
I didn't want to change the configuration blindly and thought the community
might have some experience on this subject.

I went over the cassandra.yaml file but didn't find any information on
optimizing these attributes, just that the max_throttle is divided between
nodes in the cluster and that I should increase the
max_hints_delivery_threads because I have multi-dc deployments.

# Maximum throttle in KBs per second, per delivery thread.  This will be
# reduced proportionally to the number of nodes in the cluster.  (If there
# are two nodes in the cluster, each delivery thread will use the maximum
# rate; if there are three, each will throttle to half of the maximum,
# since we expect two nodes to be delivering hints simultaneously.)
hinted_handoff_throttle_in_kb: 1024

# Number of threads with which to deliver hints;
# Consider increasing this number when you have multi-dc deployments, since
# cross-dc handoff tends to be slower
max_hints_delivery_threads: 2
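
To make the numbers concrete: with the default 1024 KB/s, on one of the 48-node
clusters each delivery thread ends up throttled to roughly 1024 / 47, i.e. about
22 KB/s, which would explain why hints drain so slowly after a rolling restart.
A conservative first change could look like this (the values are only an example,
not a recommendation; max_hints_delivery_threads needs a yaml change and a
restart, while on recent versions the throttle can also be adjusted at runtime):

# cassandra.yaml
hinted_handoff_throttle_in_kb: 2048
max_hints_delivery_threads: 4

# runtime change of the throttle on a node
nodetool sethintedhandoffthrottlekb 2048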


Thanks for your help!


On Mon, Mar 4, 2019 at 6:44 PM Kenneth Brotman 
wrote:

> What is the concern?  Why are you looking there?  The cassandra.yaml file
> has some notes about it.  Did you read them?
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 7:22 AM
> *To:* user@cassandra.apache.org
> *Subject:* A Question About Hints
>
>
>
> Hi All,
>
>
>
> Does anyone know what is the most optimal hints configuration (multiple
> DCs) in terms of
>
> max_hints_delivery_threads and hinted_handoff_throttle_in_kb?
>
> If it's different for various use cases, is there a rule of thumb I can
> work with?
>
>
>
> I found this post but it's quite old:
>
> http://www.uberobert.com/bandwidth-cassandra-hinted-handoff/
>
>
>
> Thanks!
>


Re: A Question About Hints

2019-03-04 Thread shalom sagges
It varies...
Some clusters have 48 nodes, others 24 nodes and some 8 nodes.
Both settings are on default.

I’d try making a single conservative change to one or the other, measure
and reassess.  Then do same to other setting.

That's the plan, but I thought I might first get some valuable information
from someone in the community that has already experienced in this type of
change.


Thanks!


On Mon, Mar 4, 2019 at 8:27 PM Kenneth Brotman 
wrote:

> It sounds like your use case might be appropriate for tuning those two
> settings some.
>
>
>
> How many nodes are in the cluster?
>
> Are both settings definitely on the default values currently?
>
>
>
> I’d try making a single conservative change to one or the other, measure
> and reassess.  Then do same to other setting.
>
>
>
> Then of course share your results with us.
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 9:54 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> Hi Kenneth,
>
>
>
> The concern is that in some cases, hints accumulate on nodes, and it takes
> a while until they are delivered (multi DCs).
>
> I see that whenever there are a lot of hints in play, like after a rolling
> restart, the cluster works harder. That's why I want to decrease the hints
> delivery time.
>
> I didn't want to change the configuration blindly and thought the
> community might have some experience on this subject.
>
>
>
> I went over the cassandra.yaml file but didn't find any information on
> optimizing these attributes, just that the max_throttle is divided between
> nodes in the cluster and that I should increase the
> max_hints_delivery_threads because I have multi-dc deployments.
>
>
>
> # Maximum throttle in KBs per second, per delivery thread.  This will be
> # reduced proportionally to the number of nodes in the cluster.  (If there
> # are two nodes in the cluster, each delivery thread will use the maximum
> # rate; if there are three, each will throttle to half of the maximum,
> # since we expect two nodes to be delivering hints simultaneously.)
> hinted_handoff_throttle_in_kb: 1024
>
> # Number of threads with which to deliver hints;
> # Consider increasing this number when you have multi-dc deployments, since
> # cross-dc handoff tends to be slower
> max_hints_delivery_threads: 2
>
>
>
>
>
> Thanks for your help!
>
>
>
>
>
> On Mon, Mar 4, 2019 at 6:44 PM Kenneth Brotman
>  wrote:
>
> What is the concern?  Why are you looking there?  The cassandra.yaml file
> has some notes about it.  Did you read them?
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 7:22 AM
> *To:* user@cassandra.apache.org
> *Subject:* A Question About Hints
>
>
>
> Hi All,
>
>
>
> Does anyone know what is the most optimal hints configuration (multiple
> DCs) in terms of
>
> max_hints_delivery_threads and hinted_handoff_throttle_in_kb?
>
> If it's different for various use cases, is there a rule of thumb I can
> work with?
>
>
>
> I found this post but it's quite old:
>
> http://www.uberobert.com/bandwidth-cassandra-hinted-handoff/
>
>
>
> Thanks!
>
>


Re: A Question About Hints

2019-03-04 Thread shalom sagges
See my comments inline.

Do the 8 nodes clusters have the problem too?
Yes

To the same extent?

It depends on the throughput, but basically the smaller clusters get low
throughput, so the problem is naturally smaller.


Is it any cluster across multi-DC’s?

Yes


Do all the clusters use nodes with similar specs?

All nodes have similar specs within a cluster but different specs on
different clusters.


The version of Cassandra you are on can make a difference.  What version
are you on?

Currently I'm on various versions, 2.0.14, 2.1.15 and 3.0.12. In the
process of upgrading to 3.11.4


Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video
at: https://www.youtube.com/watch?v=uN4FtAjYmLU where he briefly mentions
you can get into trouble if you go too fast or too slow?

I guess you can say it about almost any parameter you change :)


BTW, I thought the comments at the end of the article you mentioned were
really good.

The entire article is very good, but I wonder if it's still valid since it
was created around 4 years ago.


Thanks!





On Mon, Mar 4, 2019 at 9:37 PM Kenneth Brotman 
wrote:

> Makes sense.  If you have time and don’t mind, could you answer the
> following:
>
> Do the 8 nodes clusters have the problem too?
>
> To the same extent?
>
> Is it just the clusters with the large node count?
>
> Is it any cluster across multi-DC’s?
>
> Do all the clusters use nodes with similar specs?
>
>
>
> The version of Cassandra you are on can make a difference.  What version
> are you on?
>
>
>
> Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video
> at: https://www.youtube.com/watch?v=uN4FtAjYmLU where he briefly mentions
> you can get into trouble if you go too fast or too slow?
>
> BTW, I thought the comments at the end of the article you mentioned were
> really good.
>
>
>
>
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 11:04 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> It varies...
>
> Some clusters have 48 nodes, others 24 nodes and some 8 nodes.
>
> Both settings are on default.
>
>
>
> I’d try making a single conservative change to one or the other, measure
> and reassess.  Then do same to other setting.
>
> That's the plan, but I thought I might first get some valuable information
> from someone in the community that has already experienced in this type of
> change.
>
>
>
> Thanks!
>
>
>
> On Mon, Mar 4, 2019 at 8:27 PM Kenneth Brotman
>  wrote:
>
> It sounds like your use case might be appropriate for tuning those two
> settings some.
>
>
>
> How many nodes are in the cluster?
>
> Are both settings definitely on the default values currently?
>
>
>
> I’d try making a single conservative change to one or the other, measure
> and reassess.  Then do same to other setting.
>
>
>
> Then of course share your results with us.
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 9:54 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> Hi Kenneth,
>
>
>
> The concern is that in some cases, hints accumulate on nodes, and it takes
> a while until they are delivered (multi DCs).
>
> I see that whenever there are a lot of hints in play, like after a rolling
> restart, the cluster works harder. That's why I want to decrease the hints
> delivery time.
>
> I didn't want to change the configuration blindly and thought the
> community might have some experience on this subject.
>
>
>
> I went over the cassandra.yaml file but didn't find any information on
> optimizing these attributes, just that the max_throttle is divided between
> nodes in the cluster and that I should increase the
> max_hints_delivery_threads because I have multi-dc deployments.
>
>
>
> # Maximum throttle in KBs per second, per delivery thread.  This will be
> # reduced proportionally to the number of nodes in the cluster.  (If there
> # are two nodes in the cluster, each delivery thread will use the maximum
> # rate; if there are three, each will throttle to half of the maximum,
> # since we expect two nodes to be delivering hints simultaneously)
> hinted_handoff_throttle_in_kb: 1024
>
> # Number of threads with which to deliver hints;
> # Consider increasing this number when you have multi-dc deployments, since
> # cross-dc handoff tends to be slower
> max_hints_delivery_threads: 2
>
>
>
>
>
> Thanks for your help!
>
>
>
>
>
> On Mon, Mar 4, 2019 at 6:44 PM Kenneth Brotman
>  wrote:
>

Re: A Question About Hints

2019-03-04 Thread shalom sagges
Everyone really should move off of the 2.x versions just like you are doing.

Tell me about it... But since there are a lot of groups involved, these
things take time unfortunately.


Thanks for your assistance Kenneth


On Mon, Mar 4, 2019 at 11:04 PM Kenneth Brotman
 wrote:

> Since you are in the process of upgrading, I’d do nothing on the settings
> right now.  But if you wanted to do something on the settings in the
> meantime, based on my read of the information available, I’d maybe double
> the default settings. The upgrade will help a lot of things as you know.
>
>
>
> Everyone really should move off of the 2.x versions just like you are
> doing.
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 12:34 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> See my comments inline.
>
>
>
> Do the 8 nodes clusters have the problem too?
>
> Yes
>
>
>
> To the same extent?
>
> It depends on the throughput, but basically the smaller clusters get low
> throughput, so the problem is naturally smaller.
>
>
>
> Is it any cluster across multi-DC’s?
>
> Yes
>
>
>
> Do all the clusters use nodes with similar specs?
>
> All nodes have similar specs within a cluster but different specs on
> different clusters.
>
>
>
> The version of Cassandra you are on can make a difference.  What version
> are you on?
>
> Currently I'm on various versions, 2.0.14, 2.1.15 and 3.0.12. In the
> process of upgrading to 3.11.4
>
>
>
> Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video
> at: https://www.youtube.com/watch?v=uN4FtAjYmLU where he briefly mentions
> you can get into trouble if you go too fast or too slow?
>
> I guess you can say it about almost any parameter you change :)
>
>
>
> BTW, I thought the comments at the end of the article you mentioned were
> really good.
>
> The entire article is very good, but I wonder if it's still valid since it
> was created around 4 years ago.
>
>
>
> Thanks!
>
>
>
>
>
>
>
>
>
> On Mon, Mar 4, 2019 at 9:37 PM Kenneth Brotman 
> wrote:
>
> Makes sense.  If you have time and don’t mind, could you answer the
> following:
>
> Do the 8 nodes clusters have the problem too?
>
> To the same extent?
>
> Is it just the clusters with the large node count?
>
> Is it any cluster across multi-DC’s?
>
> Do all the clusters use nodes with similar specs?
>
>
>
> The version of Cassandra you are on can make a difference.  What version
> are you on?
>
>
>
> Did you see Edward Capriolo’s presentation at 26:19 into the YouTube video
> at: https://www.youtube.com/watch?v=uN4FtAjYmLU where he briefly mentions
> you can get into trouble if you go too fast or too slow?
>
> BTW, I thought the comments at the end of the article you mentioned were
> really good.
>
>
>
>
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 11:04 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> It varies...
>
> Some clusters have 48 nodes, others 24 nodes and some 8 nodes.
>
> Both settings are on default.
>
>
>
> I’d try making a single conservative change to one or the other, measure
> and reassess.  Then do same to other setting.
>
> That's the plan, but I thought I might first get some valuable information
> from someone in the community that has already experienced in this type of
> change.
>
>
>
> Thanks!
>
>
>
> On Mon, Mar 4, 2019 at 8:27 PM Kenneth Brotman
>  wrote:
>
> It sounds like your use case might be appropriate for tuning those two
> settings some.
>
>
>
> How many nodes are in the cluster?
>
> Are both settings definitely on the default values currently?
>
>
>
> I’d try making a single conservative change to one or the other, measure
> and reassess.  Then do same to other setting.
>
>
>
> Then of course share your results with us.
>
>
>
> *From:* shalom sagges [mailto:shalomsag...@gmail.com]
> *Sent:* Monday, March 04, 2019 9:54 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: A Question About Hints
>
>
>
> Hi Kenneth,
>
>
>
> The concern is that in some cases, hints accumulate on nodes, and it takes
> a while until they are delivered (multi DCs).
>
> I see that whenever there are a lot of hints in play, like after a rolling
> restart, the cluster works harder. That's why I want to decrease the hints
> delivery time.
>
> I didn't want to change the configurati

Re: Decommissioning a new node when the state is JOINING

2019-04-30 Thread shalom sagges
I would just stop the service of the joining node and then delete the data,
commit logs and saved caches.
After stopping the node while it is still joining, the cluster will remove it
from the list (as seen in nodetool status) without the need to decommission.
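
For example (a rough sketch; the paths are the package defaults and may differ
in your install, so double-check you are on the joining node before deleting
anything):

# on the node that is still JOINING
sudo service cassandra stop          # or: sudo systemctl stop cassandra
sudo rm -rf /var/lib/cassandra/data/*
sudo rm -rf /var/lib/cassandra/commitlog/*
sudo rm -rf /var/lib/cassandra/saved_caches/*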



On Tue, Apr 30, 2019 at 2:44 PM Akshay Bhardwaj <
akshay.bhardwaj1...@gmail.com> wrote:

> Hi Experts,
>
> I have a cassandra cluster running with 5 nodes. For some reason, I was
> creating a new cassandra cluster, but one of the nodes intended for the new
> cluster had the same cassandra.yaml file as the existing cluster. This
> resulted in the new node joining the existing cluster, making total no. of
> nodes as 6.
>
> As of now in "nodetool status" command, I see that the state of the new
> node is JOINING, and also rebalancing data with other nodes.
> What is the best way to decommission the node?
>
>1. Can I execute "nodetool decommission" immediately for the new node?
>2. Should I wait for the new node to finish sync, and decommission
>only after that?
>3. Any other quick approach without data loss for existing cluster?
>
>
> Thanks in advance!
>
> Akshay Bhardwaj
> +91-97111-33849
>


Re: Accidentaly removed SSTables of unneeded data

2019-05-02 Thread shalom sagges
Hi Simon,

If you haven't done that already, try to drain and restart the node you
deleted the data from.
Then run the repair again.

Regards,

On Thu, May 2, 2019 at 5:53 PM Simon ELBAZ  wrote:

> Hi,
>
> I am running Cassandra v2.1 on a 3 node cluster.
>
> *# yum list installed | grep cassa*
> *cassandra21.noarch        2.1.12-1    @datastax*
> *cassandra21-tools.noarch  2.1.12-1    @datastax*
>
> Unfortunately, I accidentally removed the SSTables (using rm) (older than
> 10 days) of a table on the 3 nodes.
>
> Running 'nodetool repair' on one of the 3 nodes returns an error, whereas it
> does not on another.
>
> I don't need to recover the lost data but I would like 'nodetool repair'
> not returning an error.
>
> Thanks for any advice.
>
> Simon
>


Re: nodetool repair failing with "Validation failed in /X.X.X.X

2019-05-05 Thread shalom sagges
Hi Rhys,

I encountered this error after adding new SSTables to a cluster and running
nodetool refresh (v3.0.12).
The refresh worked, but after starting repairs on the cluster, I got the
"Validation failed in /X.X.X.X" error on the remote DC.
A rolling restart solved the issue for me.
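
In case the exact shape helps, it was roughly this (keyspace/table are
placeholders, and the restart command depends on how the service is managed):

nodetool refresh <keyspace> <table>
# then, one node at a time, waiting for the node to be back UN in between:
nodetool drain && sudo systemctl restart cassandra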

Hope this helps!



On Sat, May 4, 2019 at 3:58 PM Rhys Campbell
 wrote:

>
> > Hello,
> >
> > I’m having issues running repair on an Apache Cassandra Cluster. I’m
> getting "Failed creating a merkle tree" errors on the replication partner
> nodes. Anyone have any experience of this? I am running 2.2.13.
> >
> > Further details here…
> https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-15109?filter=allopenissues
> >
> > Best,
> >
> > Rhys
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Python driver concistency problem

2019-05-22 Thread shalom sagges
In a lot of cases, the issue is with the data model.
Can you describe the table?
Can you provide the query you use to retrieve the data?
What's the load on your cluster?
Are there lots of tombstones?

You can set the consistency level to ONE, just to check if you get
responses, although normally I would never use ALL unless I run a DDL
command.
I prefer local_quorum if I want my consistency to be strong while keeping
Cassandra's high availability.
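
As a quick sanity check outside the application, the same read can be tried
from cqlsh with a different consistency level and tracing enabled (a sketch;
keyspace, table and key are placeholders):

cqlsh> CONSISTENCY ONE;
cqlsh> TRACING ON;
cqlsh> SELECT * FROM my_ks.my_table WHERE id = 123;

The trace output shows which replicas answered, how long each step took and
how many tombstone cells were scanned, which usually points back to the data
model.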

Regards,


Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-22 Thread shalom sagges
Hi Vsevolod,

1) Why such behavior? I thought any given SELECT request is handled by a
limited subset of C* nodes and not by all of them, as per connection
consistency/table replication settings, in case.
When you run a query with allow filtering, Cassandra doesn't know where the
data is located, so it has to go node by node, searching for the requested
data.

2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
I'm not familiar with such a flag. In my case, I just try to educate the
R&D teams.

Regards,

On Wed, May 22, 2019 at 5:01 PM Vsevolod Filaretov 
wrote:

> Hello everyone,
>
> We have an 8 node C* cluster with large volume of unbalanced data. Usual
> per-partition selects work somewhat fine, and are processed by limited
> number of nodes, but if user issues SELECT WHERE IN () ALLOW FILTERING,
> such command stalls all 8 nodes to halt and unresponsiveness to external
> requests while disk IO jumps to 100% across whole cluster. In several
minutes all nodes seem to finish processing the request and cluster goes
> back to being responsive. Replication level across whole data is 3.
>
> 1) Why such behavior? I thought any given SELECT request is handled by a
> limited subset of C* nodes and not by all of them, as per connection
> consistency/table replication settings, in case.
>
> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>
> Thank you all very much in advance,
> Vsevolod Filaretov.
>


Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-23 Thread shalom sagges
a) Interesting... But only in case you do not provide partitioning key
right? (so IN() is for partitioning key?)

I think you should ask yourself a different question. Why am I using ALLOW
FILTERING in the first place? What happens if I remove it from the query?
I prefer to denormalize the data to multiple tables or at least create an
index on the requested column (preferably queried together with a known
partition key).
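
A tiny sketch of what I mean (keyspace, table and column names are made up):

-- base table, partitioned by user_id
CREATE TABLE ks.events (
    user_id    text,
    event_time timestamp,
    country    text,
    payload    text,
    PRIMARY KEY (user_id, event_time)
);

-- instead of: SELECT * FROM ks.events WHERE country = 'IL' ALLOW FILTERING;
-- keep a second table keyed by the column you actually query on
-- (in practice you would probably add a bucket to keep partitions bounded)
CREATE TABLE ks.events_by_country (
    country    text,
    event_time timestamp,
    user_id    text,
    payload    text,
    PRIMARY KEY (country, event_time, user_id)
);

SELECT * FROM ks.events_by_country WHERE country = 'IL';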

b) Still does not explain or justify "all 8 nodes to halt and
unresponsiveness to external requests" behavior... Even if servers are busy
with the request seriously becoming non-responsive...?

I think it can justify the unresponsiveness. When using ALLOW FILTERING,
you are doing something like a full table scan in a relational database.

There is a lot of information on the internet regarding this subject such
as
https://www.instaclustr.com/apache-cassandra-scalability-allow-filtering-partition-keys/

Hope this helps.

Regards,

On Thu, May 23, 2019 at 7:33 AM Attila Wind  wrote:

> Hi,
>
> "When you run a query with allow filtering, Cassandra doesn't know where
> the data is located, so it has to go node by node, searching for the
> requested data."
>
> a) Interesting... But only in case you do not provide partitioning key
> right? (so IN() is for partitioning key?)
>
> b) Still does not explain or justify "all 8 nodes to halt and
> unresponsiveness to external requests" behavior... Even if servers are busy
> with the request seriously becoming non-responsive...?
>
> cheers
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +36 31 7811355
>
>
> On 2019. 05. 23. 0:37, shalom sagges wrote:
>
> Hi Vsevolod,
>
> 1) Why such behavior? I thought any given SELECT request is handled by a
> limited subset of C* nodes and not by all of them, as per connection
> consistency/table replication settings, in case.
> When you run a query with allow filtering, Cassandra doesn't know where
> the data is located, so it has to go node by node, searching for the
> requested data.
>
> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
> I'm not familiar with such a flag. In my case, I just try to educate the
> R&D teams.
>
> Regards,
>
> On Wed, May 22, 2019 at 5:01 PM Vsevolod Filaretov 
> wrote:
>
>> Hello everyone,
>>
>> We have an 8 node C* cluster with large volume of unbalanced data. Usual
>> per-partition selects work somewhat fine, and are processed by limited
>> number of nodes, but if user issues SELECT WHERE IN () ALLOW FILTERING,
>> such command stalls all 8 nodes to halt and unresponsiveness to external
>> requests while disk IO jumps to 100% across whole cluster. In several
>> minutes all nodes seem to finish processing the request and cluster goes
>> back to being responsive. Replication level across whole data is 3.
>>
>> 1) Why such behavior? I thought any given SELECT request is handled by a
>> limited subset of C* nodes and not by all of them, as per connection
>> consistency/table replication settings, in case.
>>
>> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>>
>> Thank you all very much in advance,
>> Vsevolod Filaretov.
>>
>


Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-28 Thread shalom sagges
Hi Attila,

I'm definitely no guru, but I've experienced several cases where people at
my company used allow filtering and caused major performance issues.
As data size increases, the impact will be stronger. If you have large
partitions, performance will decrease.
GC can be affected. And if GC stops the world for too long, too many times,
you will feel it.

I sincerely believe the best way would be to educate the users and remodel
the data. Perhaps you need to denormalize your tables or at least use
secondary indices (I prefer to keep it as simple as possible and
denormalize).
If it's a cluster for analytics, perhaps you need to build a dedicated
cluster only for that, so if something does break or gets overloaded,
normal activities won't be affected; but there are pros and cons to
that idea too.

Hope this helps.

Regards,


On Tue, May 28, 2019 at 9:43 AM Attila Wind  wrote:

> Hi Gurus,
>
> Looks like we stopped this thread. However, I would be very curious about
> answers regarding b) ...
>
> Anyone any comments on that?
> I do see this as a potential production outage risk now... Especially as
> we are planning to run analysis queries by hand exactly like that over the
> cluster...
>
> thanks!
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +36 31 7811355
>
>
> On 2019. 05. 23. 11:42, shalom sagges wrote:
>
> a) Interesting... But only in case you do not provide partitioning key
> right? (so IN() is for partitioning key?)
>
> I think you should ask yourself a different question. Why am I using ALLOW
> FILTERING in the first place? What happens if I remove it from the query?
> I prefer to denormalize the data to multiple tables or at least create an
> index on the requested column (preferably queried together with a known
> partition key).
>
> b) Still does not explain or justify "all 8 nodes to halt and
> unresponsiveness to external requests" behavior... Even if servers are busy
> with the request seriously becoming non-responsive...?
>
> I think it can justify the unresponsiveness. When using ALLOW FILTERING,
> you are doing something like a full table scan in a relational database.
>
> There is a lot of information on the internet regarding this subject such
> as
> https://www.instaclustr.com/apache-cassandra-scalability-allow-filtering-partition-keys/
>
> Hope this helps.
>
> Regards,
>
> On Thu, May 23, 2019 at 7:33 AM Attila Wind 
>  wrote:
>
>> Hi,
>>
>> "When you run a query with allow filtering, Cassandra doesn't know where
>> the data is located, so it has to go node by node, searching for the
>> requested data."
>>
>> a) Interesting... But only in case you do not provide partitioning key
>> right? (so IN() is for partitioning key?)
>>
>> b) Still does not explain or justify "all 8 nodes to halt and
>> unresponsiveness to external requests" behavior... Even if servers are busy
>> with the request seriously becoming non-responsive...?
>>
>> cheers
>> Attila Wind
>>
>> http://www.linkedin.com/in/attilaw
>> Mobile: +36 31 7811355
>>
>>
>> On 2019. 05. 23. 0:37, shalom sagges wrote:
>>
>> Hi Vsevolod,
>>
>> 1) Why such behavior? I thought any given SELECT request is handled by a
>> limited subset of C* nodes and not by all of them, as per connection
>> consistency/table replication settings, in case.
>> When you run a query with allow filtering, Cassandra doesn't know where
>> the data is located, so it has to go node by node, searching for the
>> requested data.
>>
>> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>> I'm not familiar with such a flag. In my case, I just try to educate the
>> R&D teams.
>>
>> Regards,
>>
>> On Wed, May 22, 2019 at 5:01 PM Vsevolod Filaretov 
>> wrote:
>>
>>> Hello everyone,
>>>
>>> We have an 8 node C* cluster with large volume of unbalanced data. Usual
>>> per-partition selects work somewhat fine, and are processed by limited
>>> number of nodes, but if user issues SELECT WHERE IN () ALLOW FILTERING,
>>> such command stalls all 8 nodes to halt and unresponsiveness to external
>>> requests while disk IO jumps to 100% across whole cluster. In several
>>> minutes all nodes seem to finish processing the request and cluster goes
>>> back to being responsive. Replication level across whole data is 3.
>>>
>>> 1) Why such behavior? I thought any given SELECT request is handled by a
>>> limited subset of C* nodes and not by all of them, as per connection
>>> consistency/table replication settings, in case.
>>>
>>> 2) Is it possible to forbid ALLOW FILTERING flag for given users/groups?
>>>
>>> Thank you all very much in advance,
>>> Vsevolod Filaretov.
>>>
>>


Collecting Latency Metrics

2019-05-29 Thread shalom sagges
Hi All,

I'm creating a dashboard that should collect read/write latency metrics on
C* 3.x.
In older versions (e.g. 2.0) I used to divide the total read latency in
microseconds by the read count.

Is there a metric attribute that shows read/write latency without the need
to do the math, such as in nodetool tablestats "Local read latency" output?
I saw there's a Mean attribute in org.apache.cassandra.metrics.ReadLatency
but I'm not sure this is the right one.

I'd really appreciate your help on this one.
Thanks!


Re: Collecting Latency Metrics

2019-05-29 Thread shalom sagges
If I only send ReadTotalLatency to Graphite/Grafana, can I run an average
on it and use "scale to seconds=1" ?
Will that do the trick?

Thanks!

On Wed, May 29, 2019 at 5:31 PM shalom sagges 
wrote:

> Hi All,
>
> I'm creating a dashboard that should collect read/write latency metrics on
> C* 3.x.
> In older versions (e.g. 2.0) I used to divide the total read latency in
> microseconds with the read count.
>
> Is there a metric attribute that shows read/write latency without the need
> to do the math, such as in nodetool tablestats "Local read latency" output?
> I saw there's a Mean attribute in org.apache.cassandra.metrics.ReadLatency
> but I'm not sure this is the right one.
>
> I'd really appreciate your help on this one.
> Thanks!
>
>
>


Re: Collecting Latency Metrics

2019-05-30 Thread shalom sagges
Thanks for your replies guys. I really appreciate it.

@Alain, I use Graphite for backend on top of Grafana. But the goal is to
move from Graphite to Prometheus eventually.

I tried to find a direct way of getting a specific latency metric as an
average, and as Chris pointed out, the Mean value isn't that accurate.
I do not wish to use the percentile metrics either, but rather a single latency
metric like the *"Local read latency"* output in nodetool tablestats.
Looking at the code of nodetool tablestats, it seems that C* also divides
*ReadTotalLatency.Count* by *ReadLatency.Count* to get the latency
result.

So I guess I will have no choice but to run the calculation on my own via
Graphite:
divideSeries(averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadTotalLatency.Count))),averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadLatency.Count

Does this seem right to you?
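
For what it's worth, the arithmetic behind it is just the delta of the two
counters between successive samples, e.g. with made-up numbers:

(12,500,000 us - 12,000,000 us) / (3,000 reads - 2,000 reads)
    = 500,000 us / 1,000 reads
    = 500 us per read on average over that interval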

Thanks!

On Thu, May 30, 2019 at 12:34 AM Paul Chandler  wrote:

> There are various attributes under
> org.apache.cassandra.metrics.ClientRequest.Latency.Read; these measure the
> latency in milliseconds.
>
> Thanks
>
> Paul
> www.redshots.com
>
> > On 29 May 2019, at 15:31, shalom sagges  wrote:
> >
> > Hi All,
> >
> > I'm creating a dashboard that should collect read/write latency metrics
> on C* 3.x.
> > In older versions (e.g. 2.0) I used to divide the total read latency in
> microseconds with the read count.
> >
> > Is there a metric attribute that shows read/write latency without the
> need to do the math, such as in nodetool tablestats "Local read latency"
> output?
> > I saw there's a Mean attribute in
> org.apache.cassandra.metrics.ReadLatency but I'm not sure this is the right
> one.
> >
> > I'd really appreciate your help on this one.
> > Thanks!
> >
> >
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Collecting Latency Metrics

2019-05-30 Thread shalom sagges
Sorry for the duplicated emails but I just want to make sure I'm doing
it correctly:
To summarize, are both ways accurate, or is one better than the other?

divideSeries(averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadTotalLatency.Count))),averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadLatency.Count

OR

alias(scaleToSeconds(averageSeriesWithWildcards(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadTotalLatency.Count),7,8,9),1),'test')

WDYT?


On Thu, May 30, 2019 at 2:29 PM shalom sagges 
wrote:

> Thanks for your replies guys. I really appreciate it.
>
> @Alain, I use Graphite for backend on top of Grafana. But the goal is to
> move from Graphite to Prometheus eventually.
>
> I tried to find a direct way of getting a specific Latency metric in
> average and as Chris pointed out, then Mean value isn't that accurate.
> I do not wish to use the percentile metrics either, but a single latency
> metric like the *"Local read latency" *output in nodetool tablestats.
> Looking at the code of nodetool tablestats, it seems that C* also divides
> *ReadTotalLatency.Count* with *ReadLatency.Count *to get the latency
> result.
>
> So I guess I will have no choice but to run the calculation on my own via
> Graphite:
>
> divideSeries(averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadTotalLatency.Count))),averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadLatency.Count
>
> Does this seem right to you?
>
> Thanks!
>
> On Thu, May 30, 2019 at 12:34 AM Paul Chandler  wrote:
>
>> There are various attributes under
>> org.apache.cassandra.metrics.ClientRequest.Latency.Read these measure the
>> latency in milliseconds
>>
>> Thanks
>>
>> Paul
>> www.redshots.com
>>
>> > On 29 May 2019, at 15:31, shalom sagges  wrote:
>> >
>> > Hi All,
>> >
>> > I'm creating a dashboard that should collect read/write latency metrics
>> on C* 3.x.
>> > In older versions (e.g. 2.0) I used to divide the total read latency in
>> microseconds with the read count.
>> >
>> > Is there a metric attribute that shows read/write latency without the
>> need to do the math, such as in nodetool tablestats "Local read latency"
>> output?
>> > I saw there's a Mean attribute in
>> org.apache.cassandra.metrics.ReadLatency but I'm not sure this is the right
>> one.
>> >
>> > I'd really appreciate your help on this one.
>> > Thanks!
>> >
>> >
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>


Re: Collecting Latency Metrics

2019-06-03 Thread shalom sagges
Thanks a lot for your comments.
This mailing list is truly *the* definitive guide to Cassandra.
The knowledge transferred here is invaluable.
So just wanted to give a big shout out to anyone who is helping out here.

Regards,

On Thu, May 30, 2019 at 6:10 PM Jon Haddad  wrote:

> Yep.  I would *never* use mean when it comes to performance to make any
> sort of decisions.  I prefer to graph all the p99 latencies as well as the
> max.
>
> Some good reading on the topic:
> https://bravenewgeek.com/everything-you-know-about-latency-is-wrong/
>
> On Thu, May 30, 2019 at 7:35 AM Chris Lohfink 
> wrote:
>
>> For what it is worth, generally I would recommend just using the mean vs
>> calculating it yourself. It's a lot easier and averages are meaningless for
>> anything besides trending anyway (which is really what this is useful for,
>> finding issues on the larger scale), especially with high volume clusters
>> so the loss in accuracy kinda moot. Your average for local reads/writes
>> will almost always be sub millisecond but you might end up having 500
>> millisecond requests or worse that the mean will hide.
>>
>> Chris
>>
>> On Thu, May 30, 2019 at 6:30 AM shalom sagges 
>> wrote:
>>
>>> Thanks for your replies guys. I really appreciate it.
>>>
>>> @Alain, I use Graphite for backend on top of Grafana. But the goal is to
>>> move from Graphite to Prometheus eventually.
>>>
>>> I tried to find a direct way of getting a specific Latency metric in
>>> average and as Chris pointed out, then Mean value isn't that accurate.
>>> I do not wish to use the percentile metrics either, but a single latency
>>> metric like the *"Local read latency" *output in nodetool tablestats.
>>> Looking at the code of nodetool tablestats, it seems that C* also
>>> divides *ReadTotalLatency.Count* with *ReadLatency.Count *to get the
>>> latency result.
>>>
>>> So I guess I will have no choice but to run the calculation on my own
>>> via Graphite:
>>>
>>> divideSeries(averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadTotalLatency.Count))),averageSeries(keepLastValue(nonNegativeDerivative($env.path.to.host.$host.org_apache_cassandra_metrics.Table.$ks.$cf.ReadLatency.Count
>>>
>>> Does this seem right to you?
>>>
>>> Thanks!
>>>
>>> On Thu, May 30, 2019 at 12:34 AM Paul Chandler 
>>> wrote:
>>>
>>>> There are various attributes under
>>>> org.apache.cassandra.metrics.ClientRequest.Latency.Read these measure the
>>>> latency in milliseconds
>>>>
>>>> Thanks
>>>>
>>>> Paul
>>>> www.redshots.com
>>>>
>>>> > On 29 May 2019, at 15:31, shalom sagges 
>>>> wrote:
>>>> >
>>>> > Hi All,
>>>> >
>>>> > I'm creating a dashboard that should collect read/write latency
>>>> metrics on C* 3.x.
>>>> > In older versions (e.g. 2.0) I used to divide the total read latency
>>>> in microseconds with the read count.
>>>> >
>>>> > Is there a metric attribute that shows read/write latency without the
>>>> need to do the math, such as in nodetool tablestats "Local read latency"
>>>> output?
>>>> > I saw there's a Mean attribute in
>>>> org.apache.cassandra.metrics.ReadLatency but I'm not sure this is the right
>>>> one.
>>>> >
>>>> > I'd really appreciate your help on this one.
>>>> > Thanks!
>>>> >
>>>> >
>>>>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>>
>>>>


AbstractLocalAwareExecutorService Exception During Upgrade

2019-06-05 Thread shalom sagges
Hi All,

I'm having a bad situation where after upgrading 2 nodes (binaries only)
from 2.1.21 to 3.11.4 I'm getting a lot of warnings as follows:

AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread
Thread[ReadStage-5,5,main]: {}
java.lang.ArrayIndexOutOfBoundsException: null


I also see errors on repairs but no repair is running at all. I verified
this with ps -ef command and nodetool compactionstats. The error I see is:
Failed creating a merkle tree for [repair
#a95498f0-8783-11e9-b065-81cdbc6bee08 on system_auth/users, []], /1.2.3.4
(see log for details)

I saw repair errors on data tables as well.
nodetool status shows all are UN and nodetool describecluster shows two
schema versions as expected.


After the warnings appeared, clients started to get timed out read/write
queries.
Restarting the 2 nodes solved the clients' connection issues, but the
warnings are still being generated in the logs.

Did anyone encounter such an issue and knows what this means?

Thanks!


Re: AbstractLocalAwareExecutorService Exception During Upgrade

2019-06-05 Thread shalom sagges
If anyone has any idea on what might cause this issue, it'd be great.

I don't understand what could trigger this exception.
But what I really can't understand is why repairs started to run suddenly
:-\
There's no cron job running, no active repair process, no Validation
compactions, Reaper is turned off... I see repair running only in the
logs.

Thanks!


On Wed, Jun 5, 2019 at 2:32 PM shalom sagges  wrote:

> Hi All,
>
> I'm having a bad situation where after upgrading 2 nodes (binaries only)
> from 2.1.21 to 3.11.4 I'm getting a lot of warnings as follows:
>
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread
> Thread[ReadStage-5,5,main]: {}
> java.lang.ArrayIndexOutOfBoundsException: null
>
>
> I also see errors on repairs but no repair is running at all. I verified
> this with ps -ef command and nodetool compactionstats. The error I see is:
> Failed creating a merkle tree for [repair
> #a95498f0-8783-11e9-b065-81cdbc6bee08 on system_auth/users, []], /1.2.3.4
> (see log for details)
>
> I saw repair errors on data tables as well.
> nodetool status shows all are UN and nodetool describecluster shows two
> schema versions as expected.
>
>
> After the warnings appeared, clients started to get timed out read/write
> queries.
> Restarting the 2 nodes solved the clients' connection issues, but the
> warnings are still being generated in the logs.
>
> Did anyone encounter such an issue and knows what this means?
>
> Thanks!
>
>


Re: AbstractLocalAwareExecutorService Exception During Upgrade

2019-06-19 Thread shalom sagges
Hi Again,

Trying to push this up as I wasn't able to find the root cause of this
issue.
Perhaps I need to upgrade to 3.0 first?
Will be happy to get some ideas.

Opened https://issues.apache.org/jira/browse/CASSANDRA-15172 with more
details.

Thanks!

On Thu, Jun 6, 2019 at 5:31 AM Jonathan Koppenhofer 
wrote:

> Not sure about why repair is running, but we are also seeing the same
> merkle tree issue in a mixed version cluster in which we have intentionally
> started a repair against 2 upgraded DCs. We are currently researching, and
> can post back if we find the issue, but also would appreciate if someone
> has a suggestion. We have also run a local repair in an upgraded DC in this
> same mixed version cluster without issue.
>
> We are going 2.1.x to 3.0.x... and yes, we know you are not supposed to
> run repairs in mixed version clusters, so don't do it :) This is kind of a
> special circumstance where other things have gone wrong.
>
> Thanks
>
> On Wed, Jun 5, 2019, 5:23 PM shalom sagges  wrote:
>
>> If anyone has any idea on what might cause this issue, it'd be great.
>>
>> I don't understand what could trigger this exception.
>> But what I really can't understand is why repairs started to run suddenly
>> :-\
>> There's no cron job running, no active repair process, no Validation
>> compactions, Reaper is turned off  I see repair running only in the
>> logs.
>>
>> Thanks!
>>
>>
>> On Wed, Jun 5, 2019 at 2:32 PM shalom sagges 
>> wrote:
>>
>>> Hi All,
>>>
>>> I'm having a bad situation where after upgrading 2 nodes (binaries only)
>>> from 2.1.21 to 3.11.4 I'm getting a lot of warnings as follows:
>>>
>>> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on
>>> thread Thread[ReadStage-5,5,main]: {}
>>> java.lang.ArrayIndexOutOfBoundsException: null
>>>
>>>
>>> I also see errors on repairs but no repair is running at all. I verified
>>> this with ps -ef command and nodetool compactionstats. The error I see is:
>>> Failed creating a merkle tree for [repair
>>> #a95498f0-8783-11e9-b065-81cdbc6bee08 on system_auth/users, []], /
>>> 1.2.3.4 (see log for details)
>>>
>>> I saw repair errors on data tables as well.
>>> nodetool status shows all are UN and nodetool describecluster shows two
>>> schema versions as expected.
>>>
>>>
>>> After the warnings appeared, clients started to get timed out read/write
>>> queries.
>>> Restarting the 2 nodes solved the clients' connection issues, but the
>>> warnings are still being generated in the logs.
>>>
>>> Did anyone encounter such an issue and knows what this means?
>>>
>>> Thanks!
>>>
>>>


Understanding TRACE logging

2019-09-25 Thread shalom sagges
Hi All,

I've been trying to find which queries are run on a Cassandra node.
I've enabled DEBUG and ran *nodetool setlogginglevel
org.apache.cassandra.transport TRACE*

I did get some queries, but it's definitely not all the queries that are
run on this database.
I've also found a lot of DEBUG [SharedPool-Worker-72] 2019-09-25
06:29:16,674 Message.java:437 - Received: EXECUTE
2a6022010ffaf55229262de917657d0f with 6 values at consistency LOCAL_QUORUM,
v=3 but I don't understand what information I can gain from that and why it
appears many times (a lot more than the queries I wish to track).

Can someone help me understand this type of logging?
Thanks!
DEBUG [SharedPool-Worker-88] 2019-09-25 06:29:16,793 Message.java:437 -
Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-87] 2019-09-25 06:29:16,780 Message.java:437 -
Received: EXECUTE 447fdb9c8dfae53fafd78c7583aeb0f1 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-86] 2019-09-25 06:29:16,770 Message.java:437 -
Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-84] 2019-09-25 06:29:16,761 Message.java:437 -
Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-82] 2019-09-25 06:29:16,759 Message.java:437 -
Received: QUERY UPDATE tbl1 SET col6=?,col7=?,col8=?,col9=? WHERE col1=?
AND col2=? AND col3=? AND col4=? AND col5=?;, v=3
DEBUG [SharedPool-Worker-85] 2019-09-25 06:29:16,751 Message.java:437 -
Received: EXECUTE 2cddc1f6af3c6efbeaf435f9b7ec1c8a with 4 values at
consistency LOCAL_ONE, v=3
DEBUG [SharedPool-Worker-83] 2019-09-25 06:29:16,745 Message.java:437 -
Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-81] 2019-09-25 06:29:16,734 Message.java:437 -
Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-79] 2019-09-25 06:29:16,732 Message.java:437 -
Received: EXECUTE e779e97bc0de5e5e121db71c5cb2b727 with 11 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-80] 2019-09-25 06:29:16,731 Message.java:437 -
Received: EXECUTE 91af551f94a4394b96ef9afff71dfcc1 with 2 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-78] 2019-09-25 06:29:16,731 Message.java:437 -
Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-75] 2019-09-25 06:29:16,720 Message.java:437 -
Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-77] 2019-09-25 06:29:16,715 Message.java:437 -
Received: EXECUTE ce545d85a7ee7c8ad58875afa72d9cf6 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-74] 2019-09-25 06:29:16,703 Message.java:437 -
Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-76] 2019-09-25 06:29:16,686 Message.java:437 -
Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-71] 2019-09-25 06:29:16,682 Message.java:437 -
Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-73] 2019-09-25 06:29:16,675 Message.java:437 -
Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-72] 2019-09-25 06:29:16,674 Message.java:437 -
Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-69] 2019-09-25 06:29:16,644 Message.java:437 -
Received: EXECUTE 2cddc1f6af3c6efbeaf435f9b7ec1c8a with 4 values at
consistency LOCAL_ONE, v=3
DEBUG [SharedPool-Worker-68] 2019-09-25 06:29:16,635 Message.java:437 -
Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-53] 2019-09-25 06:29:16,635 Message.java:437 -
Received: EXECUTE e779e97bc0de5e5e121db71c5cb2b727 with 11 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-66] 2019-09-25 06:29:16,635 Message.java:437 -
Received: EXECUTE 447fdb9c8dfae53fafd78c7583aeb0f1 with 3 values at
consistency LOCAL_QUORUM, v=3
DEBUG [SharedPool-Worker-65] 2019-09-25 06:29:16,623 Message.java:437 -
Received: EXECUTE d67e6a07c24b675f492686078b46c997 with 3 values at
consistency LOCAL_ONE, v=3
DEBUG [SharedPool-Worker-61] 2019-09-25 06:29:16,621 Message.java:437 -
Received: QUERY SELECT column4 FROM ks2.tbl2 WHERE column1='' AND
column2='' AND ts1>1569358692193;, v=3
DEBUG [SharedPool-Worker-62] 2019-09-25 06:29:16,618 Message.java:437 -
Received: EXECUTE d67e6a07c24b675f492686078b46c997 with 3 values at
consistency LOCAL_ONE, v=3
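
Side note: once I'm done I plan to switch the level back the same way and
verify it with the getter:

nodetool setlogginglevel org.apache.cassandra.transport INFO
nodetool getlogginglevels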


Re: Understanding TRACE logging

2019-09-26 Thread shalom sagges
Thanks for the quick response Jeff!

The EXECUTE lines are a prepared statement with the specified number of
parameters.
Is it possible to find out on which keyspace/table these prepared
statements run?
Can I get additional information from the prepared statement's ID? e.g.
EXECUTE *d67e6a07c24b675f492686078b46c9**97*

Thanks!

On Thu, Sep 26, 2019 at 11:14 AM Jeff Jirsa  wrote:

> The EXECUTE lines are a prepared statement with the specified number of
> parameters.
>
>
> On Wed, Sep 25, 2019 at 11:38 PM shalom sagges 
> wrote:
>
>> Hi All,
>>
>> I've been trying to find which queries are run on a Cassandra node.
>> I've enabled DEBUG and ran *nodetool setlogginglevel
>> org.apache.cassandra.transport TRACE*
>>
>> I did get some queries, but it's definitely not all the queries that are
>> run on this database.
>> I've also found a lot of DEBUG [SharedPool-Worker-72] 2019-09-25
>> 06:29:16,674 Message.java:437 - Received: EXECUTE
>> 2a6022010ffaf55229262de917657d0f with 6 values at consistency LOCAL_QUORUM,
>> v=3 but I don't understand what information I can gain from that and why it
>> appears many times (a lot more than the queries I wish to track).
>>
>> Can someone help me understand this type of logging?
>> Thanks!
>> DEBUG [SharedPool-Worker-88] 2019-09-25 06:29:16,793 Message.java:437 -
>> Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-87] 2019-09-25 06:29:16,780 Message.java:437 -
>> Received: EXECUTE 447fdb9c8dfae53fafd78c7583aeb0f1 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-86] 2019-09-25 06:29:16,770 Message.java:437 -
>> Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-84] 2019-09-25 06:29:16,761 Message.java:437 -
>> Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-82] 2019-09-25 06:29:16,759 Message.java:437 -
>> Received: QUERY UPDATE tbl1 SET col6=?,col7=?,col8=?,col9=? WHERE col1=?
>> AND col2=? AND col3=? AND col4=? AND col5=?;, v=3
>> DEBUG [SharedPool-Worker-85] 2019-09-25 06:29:16,751 Message.java:437 -
>> Received: EXECUTE 2cddc1f6af3c6efbeaf435f9b7ec1c8a with 4 values at
>> consistency LOCAL_ONE, v=3
>> DEBUG [SharedPool-Worker-83] 2019-09-25 06:29:16,745 Message.java:437 -
>> Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-81] 2019-09-25 06:29:16,734 Message.java:437 -
>> Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-79] 2019-09-25 06:29:16,732 Message.java:437 -
>> Received: EXECUTE e779e97bc0de5e5e121db71c5cb2b727 with 11 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-80] 2019-09-25 06:29:16,731 Message.java:437 -
>> Received: EXECUTE 91af551f94a4394b96ef9afff71dfcc1 with 2 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-78] 2019-09-25 06:29:16,731 Message.java:437 -
>> Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-75] 2019-09-25 06:29:16,720 Message.java:437 -
>> Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-77] 2019-09-25 06:29:16,715 Message.java:437 -
>> Received: EXECUTE ce545d85a7ee7c8ad58875afa72d9cf6 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-74] 2019-09-25 06:29:16,703 Message.java:437 -
>> Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-76] 2019-09-25 06:29:16,686 Message.java:437 -
>> Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-71] 2019-09-25 06:29:16,682 Message.java:437 -
>> Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-73] 2019-09-25 06:29:16,675 Message.java:437 -
>> Received: EXECUTE b665e5f576dfe70845269d63b485c8ee with 2 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Worker-72] 2019-09-25 06:29:16,674 Message.java:437 -
>> Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
>> consistency LOCAL_QUORUM, v=3
>> DEBUG [SharedPool-Wor

Re: Understanding TRACE logging

2019-10-02 Thread shalom sagges
Thanks Laxmikant and Paul.

@Laxmikant, Unfortunately, this cluster is still on 2.1 so ECAudit won't
support it, but will check it out once it's upgraded to 3.x (should happen
pretty soon).
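
(Another thing that becomes available after the upgrade, if I'm not mistaken,
is the system.prepared_statements table added around 3.10, which maps each
prepared statement id to its keyspace and query string:

SELECT * FROM system.prepared_statements;

That would answer my earlier question directly, but it doesn't exist on 2.1.)
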
@Paul, I will definitely try the Wireshark method.
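
For the capture itself, something like the following should be enough to feed
Wireshark (the interface, filename and the standard native-protocol port 9042
are assumptions on my side, and it only works for unencrypted client traffic):

sudo tcpdump -i eth0 -s 0 -w cql_capture.pcap port 9042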

Thanks a lot guys for your help!

On Thu, Sep 26, 2019 at 11:05 PM Paul Chandler  wrote:

> Hi Shalom,
>
> When tracking down specific queries I have used ngrep and fed the results
> into Wireshark, this will allow you to find out everything about the
> requests coming into the node from the client, as long as the connection is
> not encrypted.
>
> I wrote this up here a few months ago:
> http://www.redshots.com/finding-rogue-cassandra-queries/
>
> I hope this helps.
>
> Paul
>
>
>
>
>
> On 26 Sep 2019, at 10:21, Laxmikant Upadhyay 
> wrote:
>
> One of the ways to figure out what queries have run is to use the audit
> logging plugin supported in 3.x and 2.2
> https://github.com/Ericsson/ecaudit
>
> On Thu, Sep 26, 2019 at 2:19 PM shalom sagges 
> wrote:
>
>> Thanks for the quick response Jeff!
>>
>> The EXECUTE lines are a prepared statement with the specified number of
>> parameters.
>> Is it possible to find out on which keyspace/table these prepared
>> statements run?
>> Can I get additional information from the prepared statement's ID? e.g.
>> EXECUTE *d67e6a07c24b675f492686078b46c9**97*
>>
>> Thanks!
>>
>> On Thu, Sep 26, 2019 at 11:14 AM Jeff Jirsa  wrote:
>>
>>> The EXECUTE lines are a prepared statement with the specified number of
>>> parameters.
>>>
>>>
>>> On Wed, Sep 25, 2019 at 11:38 PM shalom sagges 
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I've been trying to find which queries are run on a Cassandra node.
>>>> I've enabled DEBUG and ran *nodetool setlogginglevel
>>>> org.apache.cassandra.transport TRACE*
>>>>
>>>> I did get some queries, but it's definitely not all the queries that
>>>> are run on this database.
>>>> I've also found a lot of DEBUG [SharedPool-Worker-72] 2019-09-25
>>>> 06:29:16,674 Message.java:437 - Received: EXECUTE
>>>> 2a6022010ffaf55229262de917657d0f with 6 values at consistency LOCAL_QUORUM,
>>>> v=3 but I don't understand what information I can gain from that and why it
>>>> appears many times (a lot more then the queries I wish to track).
>>>>
>>>> Can someone help me understand this type of logging?
>>>> Thanks!
>>>> DEBUG [SharedPool-Worker-88] 2019-09-25 06:29:16,793 Message.java:437 -
>>>> Received: EXECUTE 2a6022010ffaf55229262de917657d0f with 6 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-87] 2019-09-25 06:29:16,780 Message.java:437 -
>>>> Received: EXECUTE 447fdb9c8dfae53fafd78c7583aeb0f1 with 3 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-86] 2019-09-25 06:29:16,770 Message.java:437 -
>>>> Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-84] 2019-09-25 06:29:16,761 Message.java:437 -
>>>> Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-82] 2019-09-25 06:29:16,759 Message.java:437 -
>>>> Received: QUERY UPDATE tbl1 SET col6=?,col7=?,col8=?,col9=? WHERE col1=?
>>>> AND col2=? AND col3=? AND col4=? AND col5=?;, v=3
>>>> DEBUG [SharedPool-Worker-85] 2019-09-25 06:29:16,751 Message.java:437 -
>>>> Received: EXECUTE 2cddc1f6af3c6efbeaf435f9b7ec1c8a with 4 values at
>>>> consistency LOCAL_ONE, v=3
>>>> DEBUG [SharedPool-Worker-83] 2019-09-25 06:29:16,745 Message.java:437 -
>>>> Received: EXECUTE db812ac40b66c326f728452350eb0ab2 with 3 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-81] 2019-09-25 06:29:16,734 Message.java:437 -
>>>> Received: EXECUTE 7119db57e0a2041206f62c6d48fb4329 with 3 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-79] 2019-09-25 06:29:16,732 Message.java:437 -
>>>> Received: EXECUTE e779e97bc0de5e5e121db71c5cb2b727 with 11 values at
>>>> consistency LOCAL_QUORUM, v=3
>>>> DEBUG [SharedPool-Worker-80] 2019-09-25 06:29:16,731 Message.java:437 -
>>>> Received: EXECUTE 91af551f94a4394b96ef9afff71dfcc1 with 2 values at
>>>> consistenc

Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Hi There!

I'm using C* 2.0.14.
I experienced a scenario where a "select count(*)" that ran every minute on
a table with practically no results limit (yes, this should definitely be
avoided) caused a huge increase in Cassandra writes to around 150 thousand
writes per second for that particular table.

Can anyone explain this behavior? Why would a Select query significantly
increase write count in Cassandra?

Thanks!

Shalom Sagges

<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Thanks for the quick reply Vladimir.
Is it really possible that ~12,500 writes per second (per node in a 12
nodes DC) are caused by memory flushes?





Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Nov 10, 2016 at 11:02 AM, Vladimir Yudovin 
wrote:

> Hi Shalom,
>
> so not sure, but probably excessive memory consumption by this SELECT
> causes C* to flush tables to free memory.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
> CassandraLaunch your cluster in minutes.*
>
>
>  On Thu, 10 Nov 2016 03:36:59 -0500*Shalom Sagges
> >* wrote 
>
> Hi There!
>
> I'm using C* 2.0.14.
> I experienced a scenario where a "select count(*)" that ran every minute
> on a table with practically no results limit (yes, this should definitely
> be avoided), caused a huge increase in Cassandra writes to around 150
> thousand writes per second for that particular table.
>
> Can anyone explain this behavior? Why would a Select query significantly
> increase write count in Cassandra?
>
> Thanks!
>
>
> Shalom Sagges
>
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>



Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Yes, I know it's obsolete, but unfortunately this takes time.
We're in the process of upgrading to 2.2.8 and 3.0.9 in our clusters.

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Nov 10, 2016 at 1:31 PM, Vladimir Yudovin 
wrote:

> As I said I'm not sure about it, but it will be interesting to check
> memory heap state with any JMX tool, e.g. https://github.com/
> patric-r/jvmtop
>
> By the way, why Cassandra 2.0.14? It's quite an old and unsupported version.
> Even in the 2.0 branch there is 2.0.17 available.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
> CassandraLaunch your cluster in minutes.*
>
>
>  On Thu, 10 Nov 2016 05:47:37 -0500*Shalom Sagges
> >* wrote 
>
> Thanks for the quick reply Vladimir.
> Is it really possible that ~12,500 writes per second (per node in a 12
> nodes DC) are caused by memory flushes?
>
>
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
>
> On Thu, Nov 10, 2016 at 11:02 AM, Vladimir Yudovin 
> wrote:
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
> Hi Shalom,
>
> so not sure, but probably excessive memory consumption by this SELECT
> causes C* to flush tables to free memory.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
> CassandraLaunch your cluster in minutes.*
>
>
>  On Thu, 10 Nov 2016 03:36:59 -0500*Shalom Sagges
> >* wrote 
>
> Hi There!
>
> I'm using C* 2.0.14.
> I experienced a scenario where a "select count(*)" that ran every minute
> on a table with practically no results limit (yes, this should definitely
> be avoided), caused a huge increase in Cassandra writes to around 150
> thousand writes per second for that particular table.
>
> Can anyone explain this behavior? Why would a Select query significantly
> increase write count in Cassandra?
>
> Thanks!
>
>
> Shalom Sagges
>
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>
>



Re: Can a Select Count(*) Affect Writes in Cassandra?

2016-11-10 Thread Shalom Sagges
Hi Alexander,

I'm referring to Writes Count generated from JMX:
[image: Inline image 1]

The higher curve shows the total write count per second for all nodes in
the cluster and the lower curve is the average write count per second per
node.
The drop in the end is the result of shutting down one application node
that performed this kind of query (we still haven't removed the query
itself in this cluster).


On a different cluster, where we already removed the "select count(*)"
query completely, we can see that the issue was resolved (also verified
this with running nodetool cfstats a few times and checked the write count
difference):
[image: Inline image 2]


Naturally, I asked how a select query could affect the write count of a node,
but weird as it seems, the issue was resolved once the query was removed
from the code.

Another side note: one of our developers who wrote the query in the code
thought it would be nice to limit the query results to 560,000,000. Perhaps
that ridiculously high limit might have caused this?

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Nov 10, 2016 at 3:21 PM, Alexander Dejanovski <
a...@thelastpickle.com> wrote:

> Hi Shalom,
>
> Cassandra writes (mutations) are INSERTs, UPDATEs or DELETEs, it actually
> has nothing to do with flushes. A flush is the operation of moving data
> from memory (memtable) to disk (SSTable).
>
> The Cassandra write path and read path are two different things and, as
> far as I know, I see no way for a select count(*) to increase your write
> count (if you are indeed talking about actual Cassandra writes, and not I/O
> operations).
>
> Cheers,
>
> On Thu, Nov 10, 2016 at 1:21 PM Shalom Sagges 
> wrote:
>
>> Yes, I know it's obsolete, but unfortunately this takes time.
>> We're in the process of upgrading to 2.2.8 and 3.0.9 in our clusters.
>>
>> Thanks!
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>
>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>
>>
>> On Thu, Nov 10, 2016 at 1:31 PM, Vladimir Yudovin 
>> wrote:
>>
>> As I said I'm not sure about it, but it will be interesting to check
>> memory heap state with any JMX tool, e.g. https://github.com/
>> patric-r/jvmtop
>>
>> By the way, why Cassandra 2.0.14? It's quite an old and unsupported version.
>> Even in the 2.0 branch there is 2.0.17 available.
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
>> CassandraLaunch your cluster in minutes.*
>>
>>
>>  On Thu, 10 Nov 2016 05:47:37 -0500*Shalom Sagges
>> >* wrote 
>>
>> Thanks for the quick reply Vladimir.
>> Is it really possible that ~12,500 writes per second (per node in a 12
>> nodes DC) are caused by memory flushes?
>>
>>
>>
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>> <http://www.linkedin.com/company/164748>
>> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc>
>> We Create Meaningful Connections
>>
>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>
>>
>>
>> On Thu, Nov 10, 2016 at 11:02 AM, Vladimir Yudovin 
>> wrote:
>>
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>>
>> Hi Shalom,
>>
>> so not sure, but probably excessive memory consumption by this SELECT
>> causes C* to flush tables to free memory.
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
>> CassandraLaunch your cluster in 

Re: Some questions to updating and tombstone

2016-11-16 Thread Shalom Sagges
Hi Fabrice,

Just a small off-topic question I couldn't find an answer to: what
is a slice in Cassandra? (e.g. "Maximum tombstones per slice")

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Tue, Nov 15, 2016 at 6:38 PM, Fabrice Facorat 
wrote:

> If you don't want tombstones, don't generate them ;)
>
> More seriously, tombstones are generated when:
> - doing a DELETE
> - a TTL expires
> - setting a column to NULL
>
> However, tombstones are an issue only if, for the same value, you have many
> tombstones (i.e. you keep overwriting the same values with data and
> tombstones). Having 1 tombstone for 1 value is not an issue; having 1000
> tombstones for 1 value is a problem. Does your use case really overwrite data
> with DELETE or NULL?
>
> So what you may want to know is how many tombstones you read on
> average per value. This is available in:
> - nodetool cfstats ks.cf : Average tombstones per slice / Maximum
> tombstones per slice
> - JMX : org.apache.cassandra.metrics:keyspace=<keyspace>,name=
> TombstoneScannedHistogram,scope=<table>,type=ColumnFamily
> (attributes: Max/Count/99thPercentile/Mean)
>
>
> 2016-11-15 10:05 GMT+01:00 Lu, Boying :
>
>> Thanks a lot for your help.
>>
>>
>>
>> We are using STCS strategy and not using TTL
>>
>>
>>
>> Is there any API that we can use to query the current number of
>> tombstones in a CF?
>>
>>
>>
>>
>>
>>
>>
>> *From:* Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
>> *Sent:* 2016年11月14日 22:20
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Some questions to updating and tombstone
>>
>>
>>
>> Hi Boying,
>>
>>
>>
>> I agree with Vladimir. If compaction does not compact the two sstables
>> with the updates soon, disk space will be wasted. For example, if the
>> updates are not close in time, the first update might be in a big sstable by the
>> time the second update is being written to a new small sstable. STCS won't compact
>> them together soon.
>>
>>
>>
>> Just adding column values with a new timestamp shouldn't create any
>> tombstones. But if data is not merged for a long time, disk space issues may
>> arise. If you are using STCS, just to get an idea of the extent of the problem
>> you can run a major compaction and see how much disk space is freed by
>> that (don't do this in production, as major compaction has its own side
>> effects).
>>
>>
>>
>> Which compaction strategy are you using?
>>
>> Are these updates done with TTL?
>>
>>
>>
>> Thanks
>> Anuj
>>
>>
>>
>> On Mon, 14 Nov, 2016 at 1:54 PM, Vladimir Yudovin
>>
>>  wrote:
>>
>> Hi Boying,
>>
>>
>>
>> UPDATE writes a new value with a new timestamp. The old value is not a tombstone,
>> but it remains until compaction. gc_grace_period is not related to this.
>>
>>
>>
>> Best regards, Vladimir Yudovin,
>>
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra
>> Launch your cluster in minutes.*
>>
>>
>>
>>
>>
>>  On Mon, 14 Nov 2016 03:02:21 -0500*Lu, Boying > >* wrote 
>>
>>
>>
>> Hi, All,
>>
>>
>>
>> Will Cassandra generate a new tombstone when updating a column by
>> using a CQL UPDATE statement?
>>
>>
>>
>> And is there any way to get the number of tombstones of a column family,
>> since we want to avoid generating
>> too many tombstones within gc_grace_period?
>> too many tombstones within gc_grace_period?
>>
>>
>>
>> Thanks
>>
>>
>>
>> Boying
>>
>>
>>
>>
>
>
> --
> Close the World, Open the Net
> http://www.linux-wizard.net
>
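To make the tombstone sources listed in this thread concrete, here is a small
cqlsh sketch; the keyspace and table names are hypothetical and the TTL value
is arbitrary. The TRACING command at the end is one more quick way, besides
cfstats and JMX, to see how many tombstone cells a single read touches.

CREATE TABLE ks.tombstone_demo (id int PRIMARY KEY, val text);

-- 1. An explicit DELETE writes a row tombstone.
DELETE FROM ks.tombstone_demo WHERE id = 1;

-- 2. A TTL'd write becomes a tombstone once the TTL expires.
INSERT INTO ks.tombstone_demo (id, val) VALUES (2, 'expires') USING TTL 86400;

-- 3. Writing NULL into a column writes a cell tombstone.
INSERT INTO ks.tombstone_demo (id, val) VALUES (3, null);

-- Tracing a read shows how many live rows and tombstone cells were scanned.
TRACING ON;
SELECT val FROM ks.tombstone_demo WHERE id = 3;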



WriteTimeoutExceptions from Storm Topology

2016-11-17 Thread Shalom Sagges
ssageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more Caused by:
com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
timeout during write query at consistency LOCAL_QUORUM (3 replica were
required but only 2 acknowledged the write) at
com.datastax.driver.core.Responses$Error$1.decode(Responses.java:58) at
com.datastax.driver.core.Responses$Error$1.decode(Responses.java:38) at
com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:168)
at
org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
... 21 more



Shalom Sagges


<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Is Centos 7 Supported for Version 3.0

2016-11-20 Thread Shalom Sagges
Hi Guys,

A simple question for which I couldn't find an answer in the docs.
Is Centos 7 supported on DataStax Community Edition v3.0.9?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Re: Is Centos 7 Supported for Version 3.0

2016-11-20 Thread Shalom Sagges
Thanks Vladimir!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Sun, Nov 20, 2016 at 10:24 PM, Vladimir Yudovin 
wrote:

> Hi,
>
> Centos 7 has Java 8 available, so there shouldn't be any problem running
> Cassandra.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
> CassandraLaunch your cluster in minutes.*
>
>
>  On Sun, 20 Nov 2016 11:14:07 -0500*Shalom Sagges
> >* wrote 
>
> Hi Guys,
>
> A simple question for which I couldn't find an answer in the docs.
> Is Centos 7 supported on DataStax Community Edition v3.0.9?
>
> Thanks!
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>



Re: data not replicated on new node

2016-11-20 Thread Shalom Sagges
I believe the logs should show you what the issue is.
Also, can the node "talk" with the others? (i.e. telnet to the other nodes
on port 7000).


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Sun, Nov 20, 2016 at 8:50 PM, Bertrand Brelier <
bertrand.brel...@gmail.com> wrote:

> Hello Jonathan,
>
> No, the new node is not a seed in my cluster.
>
> When I ran nodetool bootstrap resume
> Node is already bootstrapped.
>
> Cheers,
>
> Bertrand
>
> On Sun, Nov 20, 2016 at 1:43 PM, Jonathan Haddad 
> wrote:
>
>> Did you add the new node as a seed? If you did, it wouldn't bootstrap,
>> and you should run repair.
>> On Sun, Nov 20, 2016 at 10:36 AM Bertrand Brelier <
>> bertrand.brel...@gmail.com> wrote:
>>
>>> Hello everybody,
>>>
>>> I am using a 3-node Cassandra cluster with Cassandra 3.0.10.
>>>
>>> I recently added a new node (to make it a 3-node cluster).
>>>
>>> I am using a replication factor of 3 , so I expected to have a copy of
>>> the same data on each node :
>>>
>>> CREATE KEYSPACE mydata WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '3'}  AND durable_writes = true;
>>>
>>> But the new node has  less data that the other 2 :
>>>
>>> Datacenter: datacenter1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address   Load   Tokens   Owns (effective)  Host
>>> ID   Rack
>>> UN  XXX.XXX.XXX.XXX  53.28 GB   256  100.0% xx  rack1
>>> UN  XXX.XXX.XXX.XXX  64.7 GB256  100.0% xx  rack1
>>> UN  XXX.XXX.XXX.XXX  1.28 GB256  100.0% xx  rack1
>>>
>>>
>>> On the new node :
>>>
>>> /XX/data-6d674a40efab11e5b67e6d75503d5d02/:
>>> total 1.2G
>>>
>>> on one of the old nodes :
>>>
>>> /XX/data-6d674a40efab11e5b67e6d75503d5d02/:
>>> total 52G
>>>
>>>
>>> I am monitoring the amount of data on each node, and they grow at the
>>> same rate. So I suspect that my new data are replicated on the 3 nodes
>>> but the old data stored on the first 2 nodes are not replicated on the
>>> new node.
>>>
>>> I ran nodetool repair (on each node, one at a time), but the new node
>>> still does not have a copy of the old data.
>>>
>>> Could you please help me understand why the old data is not replicated
>>> to the new node ? Please let me know if you need further information.
>>>
>>> Thank you,
>>>
>>> Cheers,
>>>
>>> Bertrand
>>>
>>>
>
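Another quick check from cqlsh is to confirm that the keyspace really carries
the replication settings you expect, as seen by the cluster. This sketch uses
the 'mydata' keyspace from the CREATE statement quoted above and assumes a
3.0.x node, where schema metadata lives in system_schema:

SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'mydata';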



Re: data not replicated on new node

2016-11-21 Thread Shalom Sagges
*I took that opportunity to upgrade from 3.1.1 to 3.0.9*

If my guess is right and you meant that you upgraded from 2.1.1 to 3.0.9
directly, then this might cause some issues (not necessarily the issue at
hand though). The proper upgrade process should be to 2.1.9 and from there
upgrade to 3.0.x.

https://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgrdCassandra.html

Hope this helps.


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Tue, Nov 22, 2016 at 2:44 AM, Bertrand Brelier <
bertrand.brel...@gmail.com> wrote:

> Hello Shalom, Vladimir,
>
> Thanks for your help.
>
> I had initially 3 nodes, had a hardware failure and reinstalled Cassandra
> on the node (I took that opportunity to upgrade from 3.1.1 to 3.0.9). I ran
> nodetool upgradesstables and nodetool repair on each node once I updated
> Cassandra.
>
> The 3 nodes are in the same private network, I am using the private IPs
> for the seeds and the listen_address and the public IPs for rpc_address
>
> I am using ssl to encrypt the communication between the nodes, so I am
> using the port 7001 :
>
> telnet PRIVATEIP 7001
> Trying PRIVATEIP...
> Connected to PRIVATEIP.
>
> Each node can connect with any other node.
>
> I selected some old data from the new node :
>
> CONSISTENCY;
> Current consistency level is ONE.
> select count(*) from ;
>
>  count
> ---
>  0
>
> CONSISTENCY ALL;
> Consistency level set to ALL.
> count(*) from ;
>
>  count
> ---
> 64
>
> When I switched to ALL I could get the data while the initial level ONE
> did not have any data. I did not expect to get any data with ALL, am I
> missing something ?
>
> I do not know if this is related, but while I was querying the database,
> I had the following messages in the debug.log:
>
> DEBUG [ReadRepairStage:15292] 2016-11-21 18:15:59,719
> ReadCallback.java:234 - Digest mismatch:
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key
> DecoratedKey(2288259866140251828, 0004002a04421500) (
> d41d8cd98f00b204e9800998ecf8427e vs ce211ac5533e1a146d9fee734fd8de26)
> at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85)
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.service.ReadCallback$
> AsyncRepairRunner.run(ReadCallback.java:225)
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_111]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
>
>
> Thanks for your help,
>
> Cheers,
>
> Bertrand
>
>
> On 16-11-21 01:28 AM, Shalom Sagges wrote:
>
> I believe the logs should show you what the issue is.
> Also, can the node "talk" with the others? (i.e. telnet to the other nodes
> on port 7000).
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
> On Sun, Nov 20, 2016 at 8:50 PM, Bertrand Brelier <
> bertrand.brel...@gmail.com> wrote:
>
>> Hello Jonathan,
>>
>> No, the new node is not a seed in my cluster.
>>
>> When I ran nodetool bootstrap resume
>> Node is already bootstrapped.
>>
>> Cheers,
>>
>> Bertrand
>>
>> On Sun, Nov 20, 2016 at 1:43 PM, Jonathan Haddad 
>> wrote:
>>
>>> Did you add the new node as a seed? If you did, it wouldn't bootstrap,
>>> and you should run repair.
>>> On Sun, Nov 20, 2016 at 10:36 AM Bertrand Brelier <
>>> bertrand.brel...@gmail.com> wrote:
>>>
>>>> Hello everybody,
>>>>
>>>> I am using a 3-node Cassandra cluster with Cassandra 3.0.10.
>>>>
>>>> I recently added a new node (to make it a 3-node cluster).
>>>>
>>>> I am using a replication factor of 3 , so I expected to have a copy of
>>>> the same data on each node :
>>>>
>>>> CREATE KEYSPACE mydata WITH replication = {'class': 'SimpleStr

How to Choose a Version for Upgrade

2016-11-23 Thread Shalom Sagges
Hi Everyone,

I was wondering how to choose the proper, most stable Cassandra version for
a Production environment.
Should I follow the version that's used in Datastax Enterprise (in this
case 3.0.10) or is there a better way of figuring this out?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Re: How to Choose a Version for Upgrade

2016-11-23 Thread Shalom Sagges
Thanks Vladimir!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Wed, Nov 23, 2016 at 10:26 AM, Vladimir Yudovin 
wrote:

> Hi Shalom,
>
> there is a lot of discussion on this topic, but it seems that for now we
> can call the 3.0.x line the most stable. If you don't need a specific feature
> from the 3.x line, take 3.0.10.
>
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
>  On Wed, 23 Nov 2016 03:14:37 -0500*Shalom Sagges
> >* wrote 
>
> Hi Everyone,
>
> I was wondering how to choose the proper, most stable Cassandra version
> for a Production environment.
> Should I follow the version that's used in Datastax Enterprise (in this
> case 3.0.10) or is there a better way of figuring this out?
>
> Thanks!
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>



Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread Shalom Sagges
Do you get this error on specific column families or across the entire
environment?



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Nov 24, 2016 at 1:37 PM, techpyaasa .  wrote:

> I tried , that didnt work out.. :(
>
> On Thu, Nov 24, 2016 at 4:49 PM, Vladimir Yudovin 
> wrote:
>
>> >rpc_address: 0.0.0.0  , broadcast_address: 1.2.3.4
>> Did you try set rpc_address to node IP and not to 0.0.0.0 ?
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>>
>>
>>  On Thu, 24 Nov 2016 04:50:08 -0500*Jeff Jirsa
>> >* wrote 
>>
>> Did you already try doing what the error message indicates you should try?
>>
>>
>>
>> Is there anything in the logs on the 3 cassandra boxes listed
>> (192.168.198.168, 192.168.198.169, 192.168.198.75) that indicates they had
>> problems at that time, perhaps GCInspector or StatusLogger messages about
>> pauses, or any drops in network utilization to indicate a networking
>> problem?
>>
>>
>>
>>
>>
>>
>> *From: *"techpyaasa ." 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Thursday, November 24, 2016 at 1:43 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *OperationTimedOutException (NoHostAvailableException)
>>
>>
>>
>> Hi all,
>>
>> Following exception thrown sometimes though all nodes are up.
>>
>>
>> * SEVERE : This error occurs if there are not enough Cassandra nodes for
>> the required QUORUM to persist data. Please make sure enough nodes are up
>> at this point of time. Error Count is at 150 Exception
>> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
>> tried for query failed (tried: /192.168.198.168:9042
>> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
>> to acquire available connection (you may want to increase the driver number
>> of per-host connections)), /192.168.198.169:9042
>> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
>> to acquire available connection (you may want to increase the driver number
>> of per-host connections)), /192.168.198.75:9042
>> (com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042
>> ]
>> Operation timed out)) at
>> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>> at
>> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
>> at
>> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
>> at
>> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>> at *
>>
>> We are using c*-2.0.17 , datastax java driver -
>> cassandra-driver-core-2.1.8.jar.
>>
>>
>> In cassandra.yaml following were set
>> rpc_address: 0.0.0.0

Max Capacity per Node

2016-11-24 Thread Shalom Sagges
Hi Everyone,

I have a 24 node cluster (12 in each DC) with a capacity of 3.3 TB per node
for the data directory.
I'd like to increase the capacity per node.
Can anyone tell me what the maximum recommended capacity for a node is?
The disks we use are HDD, not SSD.

Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Cassandra Upgrade

2016-11-29 Thread Shalom Sagges
Hi Everyone,

Hypothetically speaking, can I add a new node with version 2.2.8 to a
2.0.14 cluster?
Meaning, instead of upgrading the cluster, I'd like to remove a node, clear
all its data, install 2.2.8 and add it back to the cluster, with the
process eventually performed on all nodes one by one.

Is this possible?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Re: Cassandra Upgrade

2016-11-29 Thread Shalom Sagges
Thanks Ben and Brooke!
@Brooke, I'd like to do that because I want to install Centos 7 on those
machines instead of the current Centos 6. To achieve that, I need to make a
new installation of the OS, meaning taking the server down.
So if that's the case, and I can't perform the upgrade online, why not
install everything anew?
By the way, if I do take the longer way and add a new 2.2.8 node to the
cluster, do I still need to perform upgradesstables on the new node?




Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Tue, Nov 29, 2016 at 12:38 PM, Brooke Jensen 
wrote:

> Hi Shalom.
>
> That seems like the long way around doing it. If you clear the data from
> the node and then add it back in, you will have to restream and recompact
> the data again for each node. Is there a particular reason why you would
> need to do it this way?
>
> The way we do it is to update Cassandra on each node as per the steps Ben
> linked to. Once all nodes are on the newer version you can run
> upgradesstables. If you have a large cluster and are using racks you can do
> the upgrade one rack at a time to speed things up. Either way, this should
> enable you to do the upgrade fairly quickly with no downtime.
>
> Regards,
> *Brooke Jensen*
> VP Technical Operations & Customer Services
> www.instaclustr.com | support.instaclustr.com
> <https://support.instaclustr.com/hc/en-us>
>
> This email has been sent on behalf of Instaclustr Limited (Australia) and
> Instaclustr Inc (USA). This email and any attachments may contain
> confidential and legally privileged information.  If you are not the
> intended recipient, do not copy or disclose its content, but please reply
> to this email immediately and highlight the error to the sender and then
> immediately delete the message.
>
> On 29 November 2016 at 21:12, Ben Dalling  wrote:
>
>> Hi Shalom,
>>
>> There is a pretty good write up of the procedure written up here (
>> https://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/
>> upgrdCassandraDetails.html).  Things to highlight are:
>>
>>
>>- Don't have a repair running while carrying out the upgrade (so that
>>does timebox your upgrade).
>>- When the upgrade is complete.  Run "nodetool upgradesstables" on
>>all the nodes.
>>
>>
>> Pretty much what you suggested.
>>
>> Best wishes,
>>
>> Ben
>>
>> On 29 November 2016 at 09:52, Shalom Sagges 
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> Hypothetically speaking, can I add a new node with version 2.2.8 to a
>>> 2.0.14 cluster?
>>> Meaning, instead of upgrading the cluster, I'd like to remove a node,
>>> clear all its data, install 2.2.8 and add it back to the cluster, with the
>>> process eventually performed on all nodes one by one.
>>>
>>> Is this possible?
>>>
>>> Thanks!
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>>
>>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>>
>>>
>>> This message may contain confidential and/or privileged information.
>>> If you are not the addressee or authorized to receive this on behalf of
>>> the addressee you must not use, copy, disclose or take action based on this
>>> message or any information herein.
>>> If you have received this message in error, please advise the sender
>>> immediately by reply email and delete this message. Thank you.
>>>
>>
>>
>>
>> --
>> *Ben Dalling** MSc, CEng, MBCS CITP*
>> League of Crafty Programmers Ltd
>> Mobile:  +44 (0) 776 981-1900
>> email: b.dall...@locp.co.uk
>> www: http://www.locp.co.uk
>> http://www.linkedin.com/in/bendalling
>>
>
>



Re: Cassandra Upgrade

2016-11-29 Thread Shalom Sagges
Thanks for the info Kurt,

I guess I'd go with the normal upgrade procedure then.

Thanks again for the help everyone.




Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Tue, Nov 29, 2016 at 2:05 PM, kurt Greaves  wrote:

> Why would you remove all the data? That doesn't sound like a good idea.
> Just upgrade the OS and then go through the normal upgrade flow of starting
> C* with the next version and upgrading sstables.
>
> Also, *you will need to go from 2.0.14 -> 2.1.16 -> 2.2.8* and upgrade
> sstables at each stage of the upgrade. You cannot transition from 2.0.14
> straight to 2.2.8.
>



Cassandra 2.x Stability

2016-11-30 Thread Shalom Sagges
Hi Everyone,

I'm about to upgrade our 2.0.14 version to a newer 2.x version.
At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
is, as I understand the 2.2 version was supposed to be a sort of beta
version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
storage modifications (please correct me if I'm wrong).

So my question is, if I need a 2.x version (can't upgrade to 3 due to
client considerations), which one should I choose, 2.1.x or 2.2.x? (I
don't require any new features available in 2.2).

Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



Re: Cassandra 2.x Stability

2016-12-01 Thread Shalom Sagges
Hey Kai,

Thanks for the info. Can you please elaborate on the reasons you'd pick
2.2.6 over 3.0?


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Dec 1, 2016 at 2:26 PM, Kai Wang  wrote:

> I have been running 2.2.6 in production. As of today I would still pick it
> over 3.x for production.
>
> On Nov 30, 2016 5:42 AM, "Shalom Sagges"  wrote:
>
>> Hi Everyone,
>>
>> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
>> At first I thought of upgrading to 2.2.8, but I'm not sure how stable it
>> is, as I understand the 2.2 version was supposed to be a sort of beta
>> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
>> storage modifications (please correct me if I'm wrong).
>>
>> So my question is, if I need a 2.x version (can't upgrade to 3 due to
>> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
>> don't require any new features available in 2.2).
>>
>> Thanks!
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>
>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>



Re: Cassandra 2.x Stability

2016-12-01 Thread Shalom Sagges
Thanks a lot Kai!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>


On Thu, Dec 1, 2016 at 5:18 PM, Kai Wang  wrote:

> Just based on a few observations on this list. Not one week goes by
> without people asking which release is the most stable on 3.x line. Folks
> at instaclustr also provide their own 3.x fork for stability issues. etc
>
> We developers already have enough to think about. I really don't feel like
> spending time researching which release of C* I should choose. So for me,
> 2.2.x is the choice in production.
>
> That being said, I have nothing against 3.x. I do like its new storage
> engine. If I start a brand new project today with zero previous C*
> experience, I probably would choose 3.0.10 as my starting point. However if
> I were to upgrade to 3.x, I would have to test it thoroughly in a dev
> environment with real production load and monitor it very closely on
> performance, compaction, repair, bootstrap, replacing etc. Data is simply
> too important to take chances with.
>
>
> On Thu, Dec 1, 2016 at 9:38 AM, Shalom Sagges 
> wrote:
>
>> Hey Kai,
>>
>> Thanks for the info. Can you please elaborate on the reasons you'd pick
>> 2.2.6 over 3.0?
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>
>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>
>>
>> On Thu, Dec 1, 2016 at 2:26 PM, Kai Wang  wrote:
>>
>>> I have been running 2.2.6 in production. As of today I would still pick
>>> it over 3.x for production.
>>>
>>> On Nov 30, 2016 5:42 AM, "Shalom Sagges"  wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
>>>> At first I thought of upgrading to 2.2.8, but I'm not sure how stable
>>>> it is, as I understand the 2.2 version was supposed to be a sort of beta
>>>> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
>>>> storage modifications (please correct me if I'm wrong).
>>>>
>>>> So my question is, if I need a 2.x version (can't upgrade to 3 due to
>>>> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
>>>> don't require any new features available in 2.2).
>>>>
>>>> Thanks!
>>>>
>>>> Shalom Sagges
>>>> DBA
>>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>>> <http://www.linkedin.com/company/164748>
>>>> <http://twitter.com/liveperson> <http://www.facebook.com/LivePersonInc> We
>>>> Create Meaningful Connections
>>>>
>>>> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>>>>
>>>>
>>>> This message may contain confidential and/or privileged information.
>>>> If you are not the addressee or authorized to receive this on behalf of
>>>> the addressee you must not use, copy, disclose or take action based on this
>>>> message or any information herein.
>>>> If you have received this message in error, please advise the sender
>>>> immediately by reply email and delete this message. Thank you.
>>>>
>>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>
>



Imprecise Repair

2016-12-08 Thread Shalom Sagges
Hi Everyone,

I'm performing a repair as I usually do, but this time I got a weird
notification:
"Requested range intersects a local range but is not fully contained in
one; this would lead to imprecise repair".

I've never encountered this before during a repair.
The repair command that I ran is:
*nodetool repair -par -local mykeyspace mycolumnfamily*

The difference from other repairs I've done is adding *-local* and removing
*-pr*, since I'm adding nodes in the other DC (I have 2 DCs in the cluster)
and don't want the repair to interfere with the bootstrap.

I found CASSANDRA-7317 but saw it was fixed in 2.0.9. The version I'm using
is 2.0.14.
Any ideas?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>



About Tombstones and TTLs

2016-12-19 Thread Shalom Sagges
Hi Everyone,

I was reading a blog on TWCS by Alex Dejanovski from The Last Pickle (
http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html)

When I got to the comments section, I didn't understand why setting
gc_grace_seconds to 0 will disable hints for the associated table:
*"It is a very good point that gc_grace_seconds shouldn't be lowered too
much as its impact on hinted handoff is not a well known fact, and using a
value of 0 will purely disable hints on the table."*

When I tried to read some more about deletes and TTLs, I got to a Datastax
documentation
https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
stating the following:

*Cassandra allows you to set a default_time_to_live property for an entire
table. Columns and rows marked with regular TTLs are processed as described
above; but when a record exceeds the table-level TTL, Cassandra deletes it
immediately, without tombstoning or compaction.*

Which got me a bit more confused.
So I hope someone can shed some light on some questions I've got:


   - Why does setting gc_grace_seconds=0 disable hints for the table?
   - How can an expired TTL record be deleted by Cassandra without
   tombstoning or compaction? Aren't SSTables immutable files, with expired
   records removed through compaction?
   - If I only use TTL for deletion, do I still need gc_grace_seconds to be
   bigger than 0?
   - If I only use TTL for deletion but use updates as well, do I need
   gc_grace_seconds to be bigger than 0?


Sorry for all those questions, I'm just really confused from all the
TTL/tombstones subject (still a newbie).

Thanks a lot!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
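Tying these questions together, here is a hypothetical TTL-only table; the
names and numbers are made up and only show where default_time_to_live and
gc_grace_seconds sit. Keeping gc_grace_seconds above zero preserves hinted
handoff for the table (see the replies below), while the TTLs handle expiry
on their own:

-- Hypothetical TTL-only table; all values are illustrative.
CREATE TABLE ks.ttl_only (
    id  int PRIMARY KEY,
    val text
) WITH default_time_to_live = 86400
  AND gc_grace_seconds = 10800;

-- A per-write TTL overrides the table-level default.
INSERT INTO ks.ttl_only (id, val) VALUES (1, 'short lived') USING TTL 3600;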



Re: About Tombstones and TTLs

2016-12-19 Thread Shalom Sagges
Thanks for the explanation Matija, but fortunately, that I know. Forgot to
mention that I'm using a multi DC cluster.
I'll try to summarize just the questions I have, because my email was
indeed quite long :-)


   - Why does setting gc_grace_seconds=0 disable hints for the table?
   - How can an expired TTL record be deleted by Cassandra without
   tombstoning or compaction? Aren't SSTables immutable files, with expired
   records removed through compaction?
   - If I only use TTL for deletion, do I still need gc_grace_seconds to be
   bigger than 0?
   - If I only use TTL for deletion but use updates as well, do I need
   gc_grace_seconds to be bigger than 0?


Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections



On Mon, Dec 19, 2016 at 2:39 PM, Matija Gobec  wrote:

> Hi,
>
> gc_grace_seconds is used to maintain data consistency in some failure
> scenarios. Manually deleting data creates tombstones, which
> are kept for that defined period before being compacted away. If one of the
> replica nodes is down while data is deleted and it comes back up after the
> gc_grace_seconds period, your previously deleted data will reappear
> (ghost data). As stated in the DataStax documentation, on a single node
> you can set gc_grace_seconds to 0, and you can do the same for tables that
> contain only TTL'd data. In that failure scenario the downed
> node still has the TTL information with the data, so no inconsistency will
> happen.
>
> On Mon, Dec 19, 2016 at 1:00 PM, Shalom Sagges 
> wrote:
>
>> Hi Everyone,
>>
>> I was reading a blog on TWCS by Alex Dejanovski from The Last Pickle (
>> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html)
>>
>> When I got to the comments section, I didn't understand why setting
>> gc_grace_seconds to 0 will disable hints for the associated table:
>> *"It is a very good point that gc_grace_seconds shouldn't be lowered too
>> much as its impact on hinted handoff is not a well known fact, and using a
>> value of 0 will purely disable hints on the table."*
>>
>> When I tried to read some more about deletes and TTLs, I got to a
>> Datastax documentation https://docs.datastax.com/en/cassandra/3.0/cas
>> sandra/dml/dmlAboutDeletes.html
>> stating the following:
>>
>> *Cassandra allows you to set a default_time_to_live property for an
>> entire table. Columns and rows marked with regular TTLs are processed as
>> described above; but when a record exceeds the table-level TTL, Cassandra
>> deletes it immediately, without tombstoning or compaction.*
>>
>> Which got me a bit more confused.
>> So I hope someone can shed some light on some questions I've got:
>>
>>
>>- Why setting gc_grace_seconds=0 will disable hints for the table?
>>- How can an expired TTL record be deleted by Cassandra without
>>tombstoning or compaction? Aren't SSTables immutable files, and expired
>>records are removed through compaction?
>>- If I only use TTL for deletion, do I still need gc_grace_seconds to
>>    be bigger than 0?
>>- If I only use TTL for deletion, but use updates as well, do I need
>>gc_grace_seconds to be bigger than 0?
>>
>>
>> Sorry for all those questions, I'm just really confused from all the
>> TTL/tombstones subject (still a newbie).
>>
>> Thanks a lot!
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>
>



Re: About Tombstones and TTLs

2016-12-19 Thread Shalom Sagges
Thanks a lot Alain!!

This really cleared a lot of things for me.

Thanks again!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections



On Mon, Dec 19, 2016 at 5:58 PM, Alain RODRIGUEZ  wrote:

> From http://www.uberobert.com/cassandra_gc_grace_disables_hinted_handoff/
>
> This is just a quick FYI post as I don't see this documented on the web
>> elsewhere. As of now in all versions of Cassandra a gc_grace_seconds setting
>> of 0 will disable hinted handoff. Basically to avoid an edge case that
>> could cause data to reappear in a cluster (Detailed in Jira
>> CASSANDRA-5314 <https://issues.apache.org/jira/browse/CASSANDRA-5314>)
>> hints are stored with a TTL of gc_grace_seconds for the keyspace in
>> question. A gc_grace_seconds setting of 0 will cause hints to TTL instantly
>> and they will never be streamed off when a node comes back up.
>
>
>  I did not read the ticket yet, but it might bring some enlightening as
> well regarding your question Cody,
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-12-19 16:39 GMT+01:00 Cody Yancey :
>
>> >> Cassandra stores hints for the lowest of gc_grace_seconds and
>> max_hint_window_in_ms
>>
>> Was this a tough design decision or just a bug? It is certainly very
>> surprising behavior. Everything that I've read leads me to believe that
>> gc_grace_seconds was only intended to affect the treatment of *expired*
>>  data.
>>
>> Thanks,
>> Cody
>>
>> On Mon, Dec 19, 2016 at 8:10 AM, Alain RODRIGUEZ 
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>>- Why setting gc_grace_seconds=0 will disable hints for the table?
>>>>
>>>> It was the first time I heard about this as well when Alexander told us
>>> about that. This read might be helpful http://www.uberobert.com/cassa
>>> ndra_gc_grace_disables_hinted_handoff/. Also Alexander I know tested it.
>>>
>>> *tl;dr*:  Cassandra stores hints for the lowest of gc_grace_seconds and
>>> max_hint_window_in_ms
>>>
>>> Still I see no reason not to set gc_grace_seconds to 3 hours as a fix /
>>> workaround. Keeping 3 hours of extra data on disk is something you
>>> definitely want to be able to do.
>>>
>>>
>>>>- How can an expired TTL record be deleted by Cassandra without
>>>>tombstoning or compaction? Aren't SSTables immutable files, and expired
>>>>records are removed through compaction?
>>>>
>>>>
>>> This sounds magical to me as well. The only way I am aware of to drop
>>> tombstones without compaction is having an entire "expired SSTable" that
>>> would soon be evicted, without compactions. TWCS relies on this property
>>> and make a great use of it. Here is Jeff talk about TWCS:
>>> https://www.youtube.com/watch?v=PWtekUWCIaw. I believe he mentioned
>>> that.
>>>
>>>
>>>>- If I only use TTL for deletion, do I still need gc_grace_seconds
>>>>to be bigger than 0?
>>>>
>>>>
>>>>- If I only use TTL for deletion, but use updates as well, do I
>>>>need gc_grace_seconds to be bigger than 0?
>>>>
>>>>
>>> Yes, if you care about hints. Anyway, setting gc_grace_seconds to 0
>>> brings more troubles than solutions in many cases. Use the value of
>>> max_hint_window_in_ms as a minimal gc_grace_seconds (watch out for the time
>>> units in use, do the math ;-) )
>>>
>>> Here is a blog I wrote a few months ago about tombstones and deletes
>>> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-t
>>> ombstones.html. I hope it will give you interesting insight about
>>> tombstones, even if you do not care about all the "deletes" part. About
>>> TTLs, see http://thelastpickle.com/blog/2016/07/27/about-deletes-a
>>> nd-tombstones.html#tombstones-drop. There is no need for you to repair
>>> within gc_grace_seconds, but given that "Cassandra stores hints for the
>>> lowest of gc_grace_seconds and max_hint_window_in_ms"  I would never use a
>>> lower value than 3 hours (default  max_hint_window_in_ms) for
>>> gc_grace_seconds, on any table.
>>>
>>> C*heers,
&
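
For reference, a minimal CQL sketch of the workaround Alain describes above:
keeping gc_grace_seconds at the 3-hour default of max_hint_window_in_ms rather
than 0, and using TWCS for purely TTL'd data so that fully expired SSTables can
be dropped whole (the table name and window size are hypothetical):

ALTER TABLE mykeyspace.events_by_day
  WITH gc_grace_seconds = 10800   -- 3 hours, i.e. the default max_hint_window_in_ms
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': '1'};

This is only a sketch of the options being discussed, not a recommendation for
any particular table.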

Openstack and Cassandra

2016-12-21 Thread Shalom Sagges
Hi Everyone,

I am looking into the option of deploying a Cassandra cluster on Openstack
nodes instead of physical nodes due to resource management considerations.

Does anyone have any insights regarding this?
Can this combination work properly?
Since the disks (HDDs) are part of one physical machine that divides its
capacity among various instances (not only Cassandra), will this affect
performance, especially since the commitlog directory will probably reside
on the same disk as the data directory?
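
For reference, the two locations are independent settings in cassandra.yaml,
so if the hypervisor can expose a second volume the commitlog can at least sit
on a different mount than the data files. A sketch with hypothetical paths:

data_file_directories:
    - /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog

Whether that helps much when everything ultimately lands on the same shared
HDDs is exactly the open question here.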

I'm at a loss here and don't have any answers for that matter.

Can anyone assist please?

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections



Re: Openstack and Cassandra

2016-12-22 Thread Shalom Sagges
Thanks Vladimir!

I guess I'll just have to deploy and continue from there.




Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Thu, Dec 22, 2016 at 5:20 PM, Vladimir Yudovin 
wrote:

> Hi Shalom,
>
> I don't see any reason why it wouldn't work,  but obviously, any resource
> sharing affects performance. You can expect less degradation with SSD
> disks, I guess.
>
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
>  On Wed, 21 Dec 2016 13:31:22 -0500 *Shalom Sagges
> >* wrote 
>
> Hi Everyone,
>
> I am looking into the option of deploying a Cassandra cluster on Openstack
> nodes instead of physical nodes due to resource management considerations.
>
> Does anyone has any insights regarding this?
> Can this combination work properly?
> Since the disks (HDDs) are part of one physical machine that divide their
> capacity to various instances (not only Cassandra), will this affect
> performance, especially when the commitlog directory will probably reside
> with the data directory?
>
> I'm at a loss here and don't have any answers for that matter.
>
> Can anyone assist please?
>
> Thanks!
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
> <http://www.linkedin.com/company/164748>
> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc>
> We Create Meaningful Connections
>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>



Re: Openstack and Cassandra

2016-12-22 Thread Shalom Sagges
Thanks for the info Aaron!

I will test it in the hope that there will be no issues. If none occur,
this could actually be a good idea and would save a lot of resources.

Have a great day!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Thu, Dec 22, 2016 at 6:27 PM, Aaron Ploetz  wrote:

> Shalom,
>
> We (Target) have been challenged by our management team to leverage
> OpenStack whenever possible, and that includes Cassandra.  I was against it
> at first, but we have done some stress testing with it and had application
> teams try it out.  So far, there haven't been any issues.
>
> A good use case for Cassandra on OpenStack, is to support an
> internal-facing application that needs to scale for disk footprint, or to
> spin-up a quick dev environment.  When building clusters to support those
> solutions, we haven't had any problems due to simply deploying on
> OpenStack.  Our largest Cassandra cluster on OpenStack is currently around
> 30 nodes.  OpenStack is a good solution for that particular use case as we
> can easily add/remove nodes to accommodate the dynamic disk usage
> requirements.
>
> However, when query latency is a primary concern, I do still recommend
> that we use one of our external cloud providers.
>
> Hope that helps,
>
> Aaron
>
> On Thu, Dec 22, 2016 at 9:51 AM, Shalom Sagges 
> wrote:
>
>> Thanks Vladimir!
>>
>> I guess I'll just have to deploy and continue from there.
>>
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>> <https://liveperson.docsend.com/view/8iiswfp>
>>
>>
>> On Thu, Dec 22, 2016 at 5:20 PM, Vladimir Yudovin 
>> wrote:
>>
>>> Hi Shalom,
>>>
>>> I don't see any reason why it wouldn't work,  but obviously, any
>>> resource sharing affects performance. You can expect less degradation with
>>> SSD disks, I guess.
>>>
>>>
>>> Best regards, Vladimir Yudovin,
>>> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>>>
>>>
>>>  On Wed, 21 Dec 2016 13:31:22 -0500 *Shalom Sagges
>>> >* wrote 
>>>
>>> Hi Everyone,
>>>
>>> I am looking into the option of deploying a Cassandra cluster on
>>> Openstack nodes instead of physical nodes due to resource management
>>> considerations.
>>>
>>> Does anyone has any insights regarding this?
>>> Can this combination work properly?
>>> Since the disks (HDDs) are part of one physical machine that divide
>>> their capacity to various instances (not only Cassandra), will this affect
>>> performance, especially when the commitlog directory will probably reside
>>> with the data directory?
>>>
>>> I'm at a loss here and don't have any answers for that matter.
>>>
>>> Can anyone assist please?
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748>
>>> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc>
>>> We Create Meaningful Connections
>>>
>>>
>>>
>>>
>>> This message may contain confidential and/or privileged information.
>>> If you are not the addressee or authorized to receive this on behalf of
>>> the addressee you must not use, copy, disclose or take action based on this
>>> message or any information herein.
>>> If you have received this message in error, please advise the sender
>>> immediately by reply email and delete this message. Thank you.
>>>
>>>
>>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>
>



Re: Openstack and Cassandra

2016-12-27 Thread Shalom Sagges
Hi Romain,

Thanks for the input!

We currently use the Kilo release of Openstack. Are you aware of any known
bugs/issues with this release?
We definitely defined anti-affinity rules to spread the C* VMs across
different hosts. (I surely don't want to be woken up at night due to a
failed host ;-) )

Regarding Trove, I doubt we'll use it in Production any time soon.

Thanks again!





Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Mon, Dec 26, 2016 at 7:37 PM, Romain Hardouin 
wrote:

> Hi Shalom,
>
> I assume you'll use KVM virtualization so pay attention to your stack at
> every level:
> - Nova e.g. CPU pinning, NUMA awareness if relevant, etc. Have a look to
> extra specs.
> - libvirt
> - KVM
> - QEMU
>
> You can also be interested by resources quota on other OpenStack VMs that
> will be colocated with C* VMs.
> Don't forget to define anti-affinity rules in order to spread out your C*
> VMs on different hosts.
> Finally, watch out versions of libvirt/KVM/QEMU. Some optimizations/bugs
> are good to know.
>
> Out of curiosity, which OpenStack release are you using?
> You can be interested by Trove but C* support is for testing only.
>
> Best,
>
> Romain
>
>
>
>
>



Re: Openstack and Cassandra

2017-01-01 Thread Shalom Sagges
Thanks for the info Romain,

Can you please tell me what the implications of not using CPU pinning are?




Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Wed, Dec 28, 2016 at 7:01 PM, Romain Hardouin 
wrote:

> Kilo is a bit old but the good news is that CPU pinning is available which
> IMHO is a must to run C* on Production.
> Of course your bottleneck will be shared HDDs.
>
> Best,
>
> Romain
>
>
> Le Mardi 27 décembre 2016 10h21, Shalom Sagges  a
> écrit :
>
>
> Hi Romain,
>
> Thanks for the input!
>
> We currently use the Kilo release of Openstack. Are you aware of any known
> bugs/issues with this release?
> We definitely defined anti-affinity rules regarding spreading C* on
> different hosts. (I surely don't want to be woken up at night due to a
> failed host ;-) )
>
> Regarding Trove, I doubt we'll use it in Production any time soon.
>
> Thanks again!
>
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
> <https://liveperson.docsend.com/view/8iiswfp>
>
>
> On Mon, Dec 26, 2016 at 7:37 PM, Romain Hardouin 
> wrote:
>
> Hi Shalom,
>
> I assume you'll use KVM virtualization so pay attention to your stack at
> every level:
> - Nova e.g. CPU pinning, NUMA awareness if relevant, etc. Have a look to
> extra specs.
> - libvirt
> - KVM
> - QEMU
>
> You can also be interested by resources quota on other OpenStack VMs that
> will be colocated with C* VMs.
> Don't forget to define anti-affinity rules in order to spread out your C*
> VMs on different hosts.
> Finally, watch out versions of libvirt/KVM/QEMU. Some optimizations/bugs
> are good to know.
>
> Out of curiosity, which OpenStack release are you using?
> You can be interested by Trove but C* support is for testing only.
>
> Best,
>
> Romain
>
>
>
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
>



WriteTimeoutException When only One Node is Down

2017-01-11 Thread Shalom Sagges
Hi Everyone,

I'm using C* v3.0.9 for a cluster of 3 DCs with RF 3 in each DC. All
read/write queries are set to consistency LOCAL_QUORUM.
The relevant keyspace is built as follows:

*CREATE KEYSPACE mykeyspace WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'}  AND
durable_writes = true;*

I use* Datastax driver 3.0.1*


When I performed a resiliency test for the application, each time I dropped
one node, the client got the following error:


com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
timeout during write query at consistency TWO (2 replica were required but
only 1 acknowledged the write)
at
com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:73)
at
com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:26)
at
com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at
com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
at
humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.updateJprunDomains(CassandraSiteDataDaoSpring.java:121)
at
humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring.createOrUpdate(CassandraSiteDataDaoSpring.java:97)
at
humanclick.ldapAdapter.dataUpdater.impl.SiteDataToLdapUpdater.update(SiteDataToLdapUpdater.java:280)


After a few seconds the error no longer recurs. I have no idea why there's
a timeout, since there are additional replicas that can satisfy the consistency
level, and I'm even more baffled that the error showed *"Cassandra timeout
during write query at consistency TWO (2 replica were required but only 1
acknowledged the write)"*
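
For reference, the "2 replica were required" part of that message matches plain
quorum arithmetic (a general formula, nothing specific to this cluster):

    LOCAL_QUORUM = floor(RF_local / 2) + 1 = floor(3 / 2) + 1 = 2

so with RF 3 per DC the coordinator blocks for 2 local acknowledgements; why
the level is labelled TWO rather than LOCAL_QUORUM is a separate question that
the thread leaves open.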

Any ideas?  I'm quite at a loss here.

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections



Re: WriteTimeoutException When only One Node is Down

2017-01-15 Thread Shalom Sagges
Hi Yuji,

Thanks for your reply.
That's what I don't understand. Since the writes are in LOCAL_QUORUM, even
if a node fails, there should be enough replicas to satisfy the request,
shouldn't there?
Otherwise, the whole idea behind no single point of failure is only
partially true. Or is there something I'm missing?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Fri, Jan 13, 2017 at 4:15 AM, Yuji Ito  wrote:

> Hi Shalom,
>
> I also got WriteTimeoutException in my destructive test like your test.
>
> When did you drop a node?
> A coordinator node sends a write request to all replicas.
> When one of the nodes goes down while the request is being executed, sometimes
> a WriteTimeoutException happens.
>
> cf. http://www.datastax.com/dev/blog/cassandra-error-handling-done-right
>
> Thanks,
> Yuji
>
>
>
> On Thu, Jan 12, 2017 at 4:26 PM, Shalom Sagges 
> wrote:
>
>> Hi Everyone,
>>
>> I'm using C* v3.0.9 for a cluster of 3 DCs with RF 3 in each DC. All
>> read/write queries are set to consistency LOCAL_QUORUM.
>> The relevant keyspace is built as follows:
>>
>> *CREATE KEYSPACE mykeyspace WITH replication = {'class':
>> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'}  AND
>> durable_writes = true;*
>>
>> I use* Datastax driver 3.0.1*
>>
>>
>> When I performed a resiliency test for the application, each time I
>> dropped one node, the client got the following error:
>>
>>
>> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
>> timeout during write query at consistency TWO (2 replica were required but
>> only 1 acknowledged the write)
>> at com.datastax.driver.core.exceptions.WriteTimeoutException.
>> copy(WriteTimeoutException.java:73)
>> at com.datastax.driver.core.exceptions.WriteTimeoutException.
>> copy(WriteTimeoutException.java:26)
>> at com.datastax.driver.core.DriverThrowables.propagateCause(Dri
>> verThrowables.java:37)
>> at com.datastax.driver.core.DefaultResultSetFuture.getUninterru
>> ptibly(DefaultResultSetFuture.java:245)
>> at com.datastax.driver.core.AbstractSession.execute(AbstractSes
>> sion.java:63)
>> at humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring
>> .updateJprunDomains(CassandraSiteDataDaoSpring.java:121)
>> at humanclick.ldap.commImpl.siteData.CassandraSiteDataDaoSpring
>> .createOrUpdate(CassandraSiteDataDaoSpring.java:97)
>> at humanclick.ldapAdapter.dataUpdater.impl.SiteDataToLdapUpdate
>> r.update(SiteDataToLdapUpdater.java:280)
>>
>>
>> After a few seconds the error no longer recurs. I have no idea why
>> there's a timeout since there are additional replicas that satisfy the
>> consistency level, and I'm more baffled when the error showed *"Cassandra
>> timeout during write query at consistency TWO (2 replica were required but
>> only 1 acknowledged the write)"*
>>
>> Any ideas?  I'm quite at a loss here.
>>
>> Thanks!
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>
>



Re: RemoveNode CPU Spike Question

2017-01-15 Thread Shalom Sagges
Hi Anubhav,

This happened to us as well, on all nodes in the DC. We found that after
performing removenode, all other nodes suddenly started to do a lot of
compactions that increased CPU.
To mitigate that, we used nodetool disableautocompaction before removing
the node. Then, after removal, we slowly enabled autocompaction (a few
minutes between each enable) on the nodes one by one.
This helped with the CPU increase you've mentioned.
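
A rough sketch of that sequence with nodetool (the keyspace name and host ID
are placeholders):

nodetool disableautocompaction my_keyspace    # on each remaining node, before the removal
nodetool removenode <host-id-of-the-dead-node>
nodetool enableautocompaction my_keyspace     # afterwards, node by node, a few minutes apart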



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Tue, Jan 10, 2017 at 8:03 PM, Anubhav Kale 
wrote:

> Well, looking through logs I confirmed that my understanding below is
> correct, but would be good to hear from experts for sure 😊
>
>
>
> *From:* Anubhav Kale [mailto:anubhav.k...@microsoft.com]
> *Sent:* Tuesday, January 10, 2017 9:58 AM
> *To:* user@cassandra.apache.org
> *Cc:* Sean Usher 
> *Subject:* RemoveNode CPU Spike Question
>
>
>
> Hello,
>
>
>
> Recently, I started noticing an interesting pattern. When I execute
> “removenode”, a subset of the nodes that now own the tokens experience a
> CPU spike / disk activity, and sometimes SSTables on those nodes shoot up.
>
>
>
> After looking through the code, it appears to me that below function
> forces data to be streamed from some of the new nodes to the node from
> where “removenode” is kicked in. Is my understanding correct ?
>
>
>
> https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e9
> 66e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
>
>
>
> Our nodes don’t run very hot, but it appears this streaming causes them to
> have issues. Have other people seen this ?
>
>
>
> Thanks !
>



A Single Dropped Node Fails Entire Read Queries

2017-03-09 Thread Shalom Sagges
Hi Cassandra Users,

I hope someone could help me understand the following scenario:

Version: 3.0.9
3 nodes per DC
3 DCs in the cluster.
Consistency Local_Quorum.

I did a small resiliency test and dropped a node to check the availability
of the data.
What I assumed would happen is nothing at all. If a node is down in a 3
nodes DC, Local_Quorum should still be satisfied.
However, during the first ~10 seconds after stopping the service, I got
timeout errors (tried it both from the client and from cqlsh).

This is the error I get:
*ServerError:
com.google.common.util.concurrent.UncheckedExecutionException:
com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
received only 4 responses.*


After ~10 seconds, the same query is successful with no timeout errors. The
dropped node is still down.

Any idea what could cause this and how to fix it?

Thanks!
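
For anyone reproducing this, the cqlsh side of such a test is only a couple of
commands (a sketch; the table and query are the ones given later in the
thread):

CONSISTENCY LOCAL_QUORUM;
SELECT * FROM mykeyspace.test WHERE column1 = 'x';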



Re: A Single Dropped Node Fails Entire Read Queries

2017-03-10 Thread Shalom Sagges
@Ryan, my keyspace replication settings are as follows:
CREATE KEYSPACE mykeyspace WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'}  AND
durable_writes = true;

CREATE TABLE mykeyspace.test (
column1 text,
column2 text,
column3 text,
PRIMARY KEY (column1, column2)

The query is *select * from mykeyspace.test where column1='x';*

@Daniel, the replication factor is 3. That's why I don't understand why I
get these timeouts when only one node drops.

Also, when I enabled tracing, I got the following error:
*Unable to fetch query trace: ('Unable to complete the operation against
any hosts', {: Unavailable('Error from server:
code=1000 [Unavailable exception] message="Cannot achieve consistency level
LOCAL_QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1,
\'consistency\': \'LOCAL_QUORUM\'}',)})*

But nodetool status shows that only 1 replica was down:
--  Address    Load       Tokens   Owns   Host ID                               Rack
DN  x.x.x.235  134.32 MB  256      ?      c0920d11-08da-4f18-a7f3-dbfb8c155b19  RAC1
UN  x.x.x.236  134.02 MB  256      ?      2cc0a27b-b1e4-461f-a3d2-186d3d82ff3d  RAC1
UN  x.x.x.237  134.34 MB  256      ?      5b2162aa-8803-4b54-88a9-ff2e70b3d830  RAC1


I tried to run the same scenario on all 3 nodes, and only the 3rd node
didn't fail the query when I dropped it. The nodes were installed and
configured with Puppet so the configuration is the same on all 3 nodes.


Thanks!



On Fri, Mar 10, 2017 at 10:25 AM, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:

> The LOCAL_QUORUM works on the available replicas in the dc. So if your
> replication factor is 2 and you have 10 nodes you can still only lose 1.
> With a replication factor of 3 you can lose one node and still satisfy the
> query.
> Ryan Svihla  schrieb am Do. 9. März 2017 um 18:09:
>
>> whats your keyspace replication settings and what's your query?
>>
>> On Thu, Mar 9, 2017 at 9:32 AM, Shalom Sagges 
>> wrote:
>>
>> Hi Cassandra Users,
>>
>> I hope someone could help me understand the following scenario:
>>
>> Version: 3.0.9
>> 3 nodes per DC
>> 3 DCs in the cluster.
>> Consistency Local_Quorum.
>>
>> I did a small resiliency test and dropped a node to check the
>> availability of the data.
>> What I assumed would happen is nothing at all. If a node is down in a 3
>> nodes DC, Local_Quorum should still be satisfied.
>> However, during the ~10 first seconds after stopping the service, I got
>> timeout errors (tried it both from the client and from cqlsh.
>>
>> This is the error I get:
>> *ServerError:
>> com.google.common.util.concurrent.UncheckedExecutionException:
>> com.google.common.util.concurrent.UncheckedExecutionException:
>> java.lang.RuntimeException:
>> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
>> received only 4 responses.*
>>
>>
>> After ~10 seconds, the same query is successful with no timeout errors.
>> The dropped node is still down.
>>
>> Any idea what could cause this and how to fix it?
>>
>> Thanks!
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>>
>>
>>
>> --
>>
>> Thanks,
>> Ryan Svihla
>>
>>



Re: A Single Dropped Node Fails Entire Read Queries

2017-03-10 Thread Shalom Sagges
Hi daniel,

I don't think that's a network issue, because ~10 seconds after the node
stopped, the queries were successful again without any timeout issues.

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Fri, Mar 10, 2017 at 12:01 PM, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:

> Could there be network issues in connecting between the nodes? If node a
> gets to be the query coordinator but can't reach b, and c is obviously down,
> it won't get a quorum.
>
> Greetings
>
> Shalom Sagges  schrieb am Fr. 10. März 2017 um
> 10:55:
>
>> @Ryan, my keyspace replication settings are as follows:
>> CREATE KEYSPACE mykeyspace WITH replication = {'class':
>> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2: '3', 'DC3': '3'}  AND
>> durable_writes = true;
>>
>> CREATE TABLE mykeyspace.test (
>> column1 text,
>> column2 text,
>> column3 text,
>> PRIMARY KEY (column1, column2)
>>
>> The query is *select * from mykeyspace.test where column1='x';*
>>
>> @Daniel, the replication factor is 3. That's why I don't understand why I
>> get these timeouts when only one node drops.
>>
>> Also, when I enabled tracing, I got the following error:
>> *Unable to fetch query trace: ('Unable to complete the operation against
>> any hosts', {: Unavailable('Error from server:
>> code=1000 [Unavailable exception] message="Cannot achieve consistency level
>> LOCAL_QUORUM" info={\'required_replicas\': 2, \'alive_replicas\': 1,
>> \'consistency\': \'LOCAL_QUORUM\'}',)})*
>>
>> But nodetool status shows that only 1 replica was down:
>> --  Address  Load   Tokens   OwnsHost ID
>>   Rack
>> DN  x.x.x.235  134.32 MB  256  ?   
>> c0920d11-08da-4f18-a7f3-dbfb8c155b19
>>  RAC1
>> UN  x.x.x.236  134.02 MB  256  ?   
>> 2cc0a27b-b1e4-461f-a3d2-186d3d82ff3d
>>  RAC1
>> UN  x.x.x.237  134.34 MB  256  ?   
>> 5b2162aa-8803-4b54-88a9-ff2e70b3d830
>>  RAC1
>>
>>
>> I tried to run the same scenario on all 3 nodes, and only the 3rd node
>> didn't fail the query when I dropped it. The nodes were installed and
>> configured with Puppet so the configuration is the same on all 3 nodes.
>>
>>
>> Thanks!
>>
>>
>>
>> On Fri, Mar 10, 2017 at 10:25 AM, Daniel Hölbling-Inzko <
>> daniel.hoelbling-in...@bitmovin.com> wrote:
>>
>> The LOCAL_QUORUM works on the available replicas in the dc. So if your
>> replication factor is 2 and you have 10 nodes you can still only loose 1.
>> With a replication factor of 3 you can loose one node and still satisfy the
>> query.
>> Ryan Svihla  schrieb am Do. 9. März 2017 um 18:09:
>>
>> whats your keyspace replication settings and what's your query?
>>
>> On Thu, Mar 9, 2017 at 9:32 AM, Shalom Sagges 
>> wrote:
>>
>> Hi Cassandra Users,
>>
>> I hope someone could help me understand the following scenario:
>>
>> Version: 3.0.9
>> 3 nodes per DC
>> 3 DCs in the cluster.
>> Consistency Local_Quorum.
>>
>> I did a small resiliency test and dropped a node to check the
>> availability of the data.
>> What I assumed would happen is nothing at all. If a node is down in a 3
>> nodes DC, Local_Quorum should still be satisfied.
>> However, during the ~10 first seconds after stopping the service, I got
>> timeout errors (tried it both from the client and from cqlsh.
>>
>> This is the error I get:
>> *ServerError:
>> com.google.common.util.concurrent.UncheckedExecutionException:
>> com.google.common.util.concurrent.UncheckedExecutionException:
>> java.lang.RuntimeException:
>> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out -
>> received only 4 responses.*
>>
>>
>> After ~10 seconds, the same query is successful with no timeout errors.
>> The dropped node is still down.
>>
>> Any idea what could cause this and how to fix it?
>>
>> Thanks!
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addre

Re: A Single Dropped Node Fails Entire Read Queries

2017-03-12 Thread Shalom Sagges
Hi Michael,

If a node suddenly fails, and there are other replicas that can still
satisfy the consistency level, shouldn't the request succeed regardless of
the failed node?

Thanks!





Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Fri, Mar 10, 2017 at 6:25 PM, Michael Shuler 
wrote:

> I may be mistaken on the exact configuration option for the timeout
> you're hitting, but I believe this may be the general
> `request_timeout_in_ms: 1` in conf/cassandra.yaml.
>
> A reasonable timeout for a "node down" discovery/processing is needed to
> prevent random flapping of nodes with a super short timeout interval.
> Applications should also retry on a host unavailable exception like
> this, because in the long run, this should be expected from time to time
> for network partitions, node failure, maintenance cycles, etc.
>
> --
> Kind regards,
> Michael
>
> On 03/10/2017 04:07 AM, Shalom Sagges wrote:
> > Hi daniel,
> >
> > I don't think that's a network issue, because ~10 seconds after the node
> > stopped, the queries were successful again without any timeout issues.
> >
> > Thanks!
> >
> >
> > Shalom Sagges
> > DBA
> > T: +972-74-700-4035
> > <http://www.linkedin.com/company/164748>
> > <http://twitter.com/liveperson>   <http://www.facebook.com/
> LivePersonInc>
> >
> >   We Create Meaningful Connections
> >
> > <https://liveperson.docsend.com/view/8iiswfp>
> >
> >
> >
> > On Fri, Mar 10, 2017 at 12:01 PM, Daniel Hölbling-Inzko
> >  > <mailto:daniel.hoelbling-in...@bitmovin.com>> wrote:
> >
> > Could there be network issues in connecting between the nodes? If
> > node a gets To be the query coordinator but can't reach b and c is
> > obviously down it won't get a quorum.
> >
> > Greetings
> >
> > Shalom Sagges  > <mailto:shal...@liveperson.com>> schrieb am Fr. 10. März 2017 um
> 10:55:
> >
> > @Ryan, my keyspace replication settings are as follows:
> > CREATE KEYSPACE mykeyspace WITH replication = {'class':
> > 'NetworkTopologyStrategy', 'DC1': '3', 'DC2: '3', 'DC3': '3'}
> >  AND durable_writes = true;
> >
> > CREATE TABLE mykeyspace.test (
> > column1 text,
> > column2 text,
> > column3 text,
> > PRIMARY KEY (column1, column2)
> >
> > The query is */select * from mykeyspace.test where
> > column1='x';/*
> >
> > @Daniel, the replication factor is 3. That's why I don't
> > understand why I get these timeouts when only one node drops.
> >
> > Also, when I enabled tracing, I got the following error:
> > *Unable to fetch query trace: ('Unable to complete the operation
> > against any hosts', {: Unavailable('Error
> > from server: code=1000 [Unavailable exception] message="Cannot
> > achieve consistency level LOCAL_QUORUM"
> > info={\'required_replicas\': 2, \'alive_replicas\': 1,
> > \'consistency\': \'LOCAL_QUORUM\'}',)})*
> >
> > But nodetool status shows that only 1 replica was down:
> > --  Address  Load   Tokens   OwnsHost ID
> >   Rack
> > DN  x.x.x.235  134.32 MB  256  ?
> > c0920d11-08da-4f18-a7f3-dbfb8c155b19  RAC1
> > UN  x.x.x.236  134.02 MB  256  ?
> > 2cc0a27b-b1e4-461f-a3d2-186d3d82ff3d  RAC1
> > UN  x.x.x.237  134.34 MB  256  ?
> > 5b2162aa-8803-4b54-88a9-ff2e70b3d830  RAC1
> >
> >
> > I tried to run the same scenario on all 3 nodes, and only the
> > 3rd node didn't fail the query when I dropped it. The nodes were
> > installed and configured with Puppet so the configuration is the
> > same on all 3 nodes.
> >
> >
> > Thanks!
> >
> >
> >
> > On Fri, Mar 10, 2017 at 10:25 AM, Daniel Hölbling-Inzko
> >  > <mailto:daniel.hoelbling-in...@bitmovin.com>> wrote:
> >
> > The LOCAL

Re: A Single Dropped Node Fails Entire Read Queries

2017-03-13 Thread Shalom Sagges
Just some more info: I've tried the same scenario on 2.0.14 and 2.1.15 and
didn't encounter such errors.
What I did find is that the timeout errors appear only until the node is
discovered as "DN" in nodetool status. Once the node is in DN status, the
errors stop and the data is retrieved.

Could this be a bug in 3.0.9? Or some sort of misconfiguration I missed?

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Sun, Mar 12, 2017 at 10:21 AM, Shalom Sagges 
wrote:

> Hi Michael,
>
> If a node suddenly fails, and there are other replicas that can still
> satisfy the consistency level, shouldn't the request succeed regardless of
> the failed node?
>
> Thanks!
>
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
> <https://liveperson.docsend.com/view/8iiswfp>
>
>
> On Fri, Mar 10, 2017 at 6:25 PM, Michael Shuler 
> wrote:
>
>> I may be mistaken on the exact configuration option for the timeout
>> you're hitting, but I believe this may be the general
>> `request_timeout_in_ms: 1` in conf/cassandra.yaml.
>>
>> A reasonable timeout for a "node down" discovery/processing is needed to
>> prevent random flapping of nodes with a super short timeout interval.
>> Applications should also retry on a host unavailable exception like
>> this, because in the long run, this should be expected from time to time
>> for network partitions, node failure, maintenance cycles, etc.
>>
>> --
>> Kind regards,
>> Michael
>>
>> On 03/10/2017 04:07 AM, Shalom Sagges wrote:
>> > Hi daniel,
>> >
>> > I don't think that's a network issue, because ~10 seconds after the node
>> > stopped, the queries were successful again without any timeout issues.
>> >
>> > Thanks!
>> >
>> >
>> > Shalom Sagges
>> > DBA
>> > T: +972-74-700-4035
>> > <http://www.linkedin.com/company/164748>
>> > <http://twitter.com/liveperson>   <http://www.facebook.com/Live
>> PersonInc>
>> >
>> >   We Create Meaningful Connections
>> >
>> > <https://liveperson.docsend.com/view/8iiswfp>
>> >
>> >
>> >
>> > On Fri, Mar 10, 2017 at 12:01 PM, Daniel Hölbling-Inzko
>> > > > <mailto:daniel.hoelbling-in...@bitmovin.com>> wrote:
>> >
>> > Could there be network issues in connecting between the nodes? If
>> > node a gets To be the query coordinator but can't reach b and c is
>> > obviously down it won't get a quorum.
>> >
>> > Greetings
>> >
>> > Shalom Sagges > > <mailto:shal...@liveperson.com>> schrieb am Fr. 10. März 2017 um
>> 10:55:
>> >
>> > @Ryan, my keyspace replication settings are as follows:
>> > CREATE KEYSPACE mykeyspace WITH replication = {'class':
>> > 'NetworkTopologyStrategy', 'DC1': '3', 'DC2: '3', 'DC3': '3'}
>> >  AND durable_writes = true;
>> >
>> > CREATE TABLE mykeyspace.test (
>> > column1 text,
>> > column2 text,
>> > column3 text,
>> > PRIMARY KEY (column1, column2)
>> >
>> > The query is */select * from mykeyspace.test where
>> > column1='x';/*
>> >
>> > @Daniel, the replication factor is 3. That's why I don't
>> > understand why I get these timeouts when only one node drops.
>> >
>> > Also, when I enabled tracing, I got the following error:
>> > *Unable to fetch query trace: ('Unable to complete the operation
>> > against any hosts', {: Unavailable('Error
>> > from server: code=1000 [Unavailable exception] message="Cannot
>> > achieve consistency level LOCAL_QUORUM"
>> > info={\'required_replicas\': 2, \'alive_replicas\': 1,
>> > \'consistency\': \'LOCAL_QUORUM\'}',)})*
>> >
>> >  

Re: A Single Dropped Node Fails Entire Read Queries

2017-03-14 Thread Shalom Sagges
Thanks a lot Joel!

I'll go ahead and upgrade.

Thanks again!
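
For reference, the speculative retry behaviour Joel describes (quoted below) is
controlled per table in CQL; a hypothetical example of switching an affected
table from the percentile-based default to a fixed delay:

ALTER TABLE mykeyspace.test WITH speculative_retry = '10ms';

The default is '99PERCENTILE', which is the mode CASSANDRA-13009 fixes; this is
just a sketch of the knob, not something the thread itself recommends over
upgrading.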


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Mon, Mar 13, 2017 at 7:27 PM, Joel Knighton 
wrote:

> It's possible that you're hitting https://issues.apache.
> org/jira/browse/CASSANDRA-13009 .
>
> In (simplified) summary, the read query picks the right number of
> endpoints fairly early in its execution. Because the down node has not been
> detected as down yet, it may be one of the nodes. When this node doesn't
> answer, it is likely that speculative retry will kick in after a certain
> amount of time and query an up node. This feature is present and working in
> the earlier releases you tested. Unfortunately, percentile-based
> speculative retry wasn't working as intended in 2.2+ until fixed in
> CASSANDRA-13009, which went into 2.2.9/3.0.11+.
>
> It may be worth evaluating the latest 3.0.x release.
>
> On Mon, Mar 13, 2017 at 11:48 AM, Shalom Sagges 
> wrote:
>
>> Just some more info, I've tried the same scenario on 2.0.14 and 2.1.15
>> and didn't encounter such errors.
>> What I did find is that the timeout errors appear only until the node is
>> discovered as "DN" in nodetool status. Once the node is in DN status, the
>> errors stop and the data is retrieved.
>>
>> Could this be a bug in 3.0.9? Or some sort of misconfiguration I missed?
>>
>> Thanks!
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035 <+972%2074-700-4035>
>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>> <https://liveperson.docsend.com/view/8iiswfp>
>>
>>
>> On Sun, Mar 12, 2017 at 10:21 AM, Shalom Sagges 
>> wrote:
>>
>>> Hi Michael,
>>>
>>> If a node suddenly fails, and there are other replicas that can still
>>> satisfy the consistency level, shouldn't the request succeed regardless of
>>> the failed node?
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>> <https://liveperson.docsend.com/view/8iiswfp>
>>>
>>>
>>> On Fri, Mar 10, 2017 at 6:25 PM, Michael Shuler 
>>> wrote:
>>>
>>>> I may be mistaken on the exact configuration option for the timeout
>>>> you're hitting, but I believe this may be the general
>>>> `request_timeout_in_ms: 1` in conf/cassandra.yaml.
>>>>
>>>> A reasonable timeout for a "node down" discovery/processing is needed to
>>>> prevent random flapping of nodes with a super short timeout interval.
>>>> Applications should also retry on a host unavailable exception like
>>>> this, because in the long run, this should be expected from time to time
>>>> for network partitions, node failure, maintenance cycles, etc.
>>>>
>>>> --
>>>> Kind regards,
>>>> Michael
>>>>
>>>> On 03/10/2017 04:07 AM, Shalom Sagges wrote:
>>>> > Hi daniel,
>>>> >
>>>> > I don't think that's a network issue, because ~10 seconds after the
>>>> node
>>>> > stopped, the queries were successful again without any timeout issues.
>>>> >
>>>> > Thanks!
>>>> >
>>>> >
>>>> > Shalom Sagges
>>>> > DBA
>>>> > T: +972-74-700-4035
>>>> > <http://www.linkedin.com/company/164748>
>>>> > <http://twitter.com/liveperson>   <http://www.facebook.com/Live
>>>> PersonInc>
>>>> >
>>>> >   We Create Meaningful Connections
>>>> >
>>>> > <https://liveperson.docsend.com/view/8iiswfp>
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Mar 10, 2017 at 12:01 PM, Daniel Hölbling-Inzko
>>>> > >>> > <mailto:daniel.hoelbling-in...@bitmovin.com>> wrote:
>>>> >
>>>&

Re: A Single Dropped Node Fails Entire Read Queries

2017-03-22 Thread Shalom Sagges
Upgrading to 3.0.12 solved the issue.

Thanks a lot for the help Joel!


Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
<https://liveperson.docsend.com/view/8iiswfp>


On Tue, Mar 14, 2017 at 10:44 AM, Shalom Sagges 
wrote:

> Thanks a lot Joel!
>
> I'll go ahead and upgrade.
>
> Thanks again!
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035 <+972%2074-700-4035>
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
> <https://liveperson.docsend.com/view/8iiswfp>
>
>
> On Mon, Mar 13, 2017 at 7:27 PM, Joel Knighton  > wrote:
>
>> It's possible that you're hitting https://issues.apache.
>> org/jira/browse/CASSANDRA-13009 .
>>
>> In (simplified) summary, the read query picks the right number of
>> endpoints fairly early in its execution. Because the down node has not been
>> detected as down yet, it may be one of the nodes. When this node doesn't
>> answer, it is likely that speculative retry will kick in after a certain
>> amount of time and query an up node. This feature is present and working in
>> the earlier releases you tested. Unfortunately, percentile-based
>> speculative retry wasn't working as intended in 2.2+ until fixed in
>> CASSANDRA-13009, which went into 2.2.9/3.0.11+.
>>
>> It may be worth evaluating the latest 3.0.x release.
>>
>> On Mon, Mar 13, 2017 at 11:48 AM, Shalom Sagges 
>> wrote:
>>
>>> Just some more info, I've tried the same scenario on 2.0.14 and 2.1.15
>>> and didn't encounter such errors.
>>> What I did find is that the timeout errors appear only until the node is
>>> discovered as "DN" in nodetool status. Once the node is in DN status, the
>>> errors stop and the data is retrieved.
>>>
>>> Could this be a bug in 3.0.9? Or some sort of misconfiguration I missed?
>>>
>>> Thanks!
>>>
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
>>> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>>> <https://liveperson.docsend.com/view/8iiswfp>
>>>
>>>
>>> On Sun, Mar 12, 2017 at 10:21 AM, Shalom Sagges 
>>> wrote:
>>>
>>>> Hi Michael,
>>>>
>>>> If a node suddenly fails, and there are other replicas that can still
>>>> satisfy the consistency level, shouldn't the request succeed regardless of
>>>> the failed node?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Shalom Sagges
>>>> DBA
>>>> T: +972-74-700-4035 <+972%2074-700-4035>
>>>> <http://www.linkedin.com/company/164748>
>>>> <http://twitter.com/liveperson> <http://www.facebook.com/LivePersonInc> We
>>>> Create Meaningful Connections
>>>> <https://liveperson.docsend.com/view/8iiswfp>
>>>>
>>>>
>>>> On Fri, Mar 10, 2017 at 6:25 PM, Michael Shuler >>> > wrote:
>>>>
>>>>> I may be mistaken on the exact configuration option for the timeout
>>>>> you're hitting, but I believe this may be the general
>>>>> `request_timeout_in_ms: 1` in conf/cassandra.yaml.
>>>>>
>>>>> A reasonable timeout for a "node down" discovery/processing is needed
>>>>> to
>>>>> prevent random flapping of nodes with a super short timeout interval.
>>>>> Applications should also retry on a host unavailable exception like
>>>>> this, because in the long run, this should be expected from time to
>>>>> time
>>>>> for network partitions, node failure, maintenance cycles, etc.
>>>>>
>>>>> --
>>>>> Kind regards,
>>>>> Michael
>>>>>
>>>>> On 03/10/2017 04:07 AM, Shalom Sagges wrote:
>>>>> > Hi daniel,
>>>>> >
>>>>> > I don't think that's a network issue, because ~10 seconds after the
>>>>> node
>>>>> > stopped, the queries were successful again without any tim

Bootstraping a Node With a Newer Version

2017-05-16 Thread Shalom Sagges
Hi All,

Hypothetically speaking, let's say I want to upgrade my Cassandra cluster,
but I also want to perform a major upgrade to the kernel of all nodes.
In order to upgrade the kernel, I need to reinstall the server, hence lose
all data on the node.

My question is this: after reinstalling the server with the new kernel, can
I first install the upgraded Cassandra version and then bootstrap the node
into the cluster?

Since there's already no data on the node, I wish to skip the agonizing
sstable upgrade process.

Does anyone know if this is doable?

Thanks!



Shalom Sagges
DBA
T: +972-74-700-4035
<http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
<http://www.facebook.com/LivePersonInc> We Create Meaningful Connections



Re: Bootstraping a Node With a Newer Version

2017-05-17 Thread Shalom Sagges
Our DevOps team told me that their policy is not to perform major kernel
upgrades but simply to install a clean new version.
I also checked online and found a lot of recommendations *not* to do so, as
there might be a lot of dependency issues that may affect processes such
as yum.
e.g.
https://www.centos.org/forums/viewtopic.php?t=53678
"The upgrade from CentOS 6 to 7 is a process that is fraught with danger
and very very untested. Almost no-one succeeds without extreme effort. The
CentOS wiki page about it has a big fat warning saying "Do not do this". If
at all possible you should do a parallel install, migrate your data, apps
and settings to the new box and decommission the old one.

The problem comes about because there are a large number of packages in el6
that already have a higher version number than those in el7. This means
that the el6 packages take precedence in the update and there are quite a
few orphans left behind and these break little things like yum. For
example, one that I know about is openldap which is
openldap-2.4.40-5.el6.x86_64 and openldap-2.4.39-6.el7.x86_64 so the el6
package is seen as newer than the el7 one. Anything that's linked against
openldap (a *lot*) now will not function until that package is replaced
with its el7 equivalent, The easiest way to do this would be to yum
downgrade openldap but, ooops, one of the things that needs openldap is yum
so it doesn't work."


I've also checked the Centos Wiki page and found the same recommendation:
https://wiki.centos.org/FAQ/General?highlight=%28upgrade%29%7C%28to%29%7C%28centos7%29#head-3ac1bdb51f0fecde1f98142cef90e887b1b12a00
 :

*"Upgrades in place are not supported nor recommended by CentOS or TUV. A
backup followed by a fresh install is the only recommended upgrade path.
See the Migration Guide for more information."*


Since I have around twenty 2TB nodes in each DC (2 DCs in 6 different
farms) and I don't want this to take forever, perhaps the best way would be
either to stay on CentOS 6 and install Python 2.7 (I understand that's not
so user friendly), or to follow the backup-and-reinstall recommendation on
the CentOS page (which sounds extremely agonizing as well).

What do you think?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035



On Tue, May 16, 2017 at 6:48 PM, daemeon reiydelle 
wrote:

> What makes you think you cannot upgrade the kernel?
>
> “All men dream, but not equally. Those who dream by night in the dusty
> recesses of their minds wake up in the day to find it was vanity, but the
> dreamers of the day are dangerous men, for they may act their dreams with
> open eyes, to make it possible.” — T.E. Lawrence
>
> sent from my mobile
> Daemeon Reiydelle
> skype daemeon.c.m.reiydelle
> USA 415.501.0198
>
> On May 16, 2017 5:27 AM, "Shalom Sagges"  wrote:
>
>> Hi All,
>>
>> Hypothetically speaking, let's say I want to upgrade my Cassandra
>> cluster, but I also want to perform a major upgrade to the kernel of all
>> nodes.
>> In order to upgrade the kernel, I need to reinstall the server, hence
>> lose all data on the node.
>>
>> My question is this, after reinstalling the server with the new kernel,
>> can I first install the upgraded Cassandra version and then bootstrap it to
>> the cluster?
>>
>> Since there's already no data on the node, I wish to skip the agonizing
>> sstable upgrade process.
>>
>> Does anyone know if this is doable?
>>
>> Thanks!
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>>
>>
>>
>> This message may contain confidential and/or privileged information.
>> If you are not the addressee or authorized to receive this on behalf of
>> the addressee you must not use, copy, disclose or take action based on this
>> message or any information herein.
>> If you have received this message in error, please advise the sender
>> immediately by reply email and delete this message. Thank you.
>>
>



Re: Bootstrapping a Node With a Newer Version

2017-05-17 Thread Shalom Sagges
Data directories are indeed separated from the root filesystem.
Our System team will look into this and hopefully they will be able to
install the new version seamlessly.

Thanks a lot everyone for your points and guidance. Much appreciated!
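
For anyone in the same spot, a quick sanity check along the lines Dor
describes below, before wiping the root filesystem, could look like this
(device names and paths are only illustrative, not taken from our setup):

    # Confirm the Cassandra data and commitlog directories live on their
    # own logical volume, separate from the root filesystem.
    lsblk -o NAME,MOUNTPOINT,SIZE
    sudo lvs
    df -h /var/lib/cassandra

    # Cross-check against what Cassandra is actually configured to use:
    grep -A1 -E '^(data_file_directories|commitlog_directory)' \
        /etc/cassandra/cassandra.yaml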





Shalom Sagges
DBA
T: +972-74-700-4035


On Wed, May 17, 2017 at 10:59 AM, Dor Laor  wrote:

> We've done such an in-place upgrade in the past, but not for a real
> production system.
>
> However, you're MISSING the point. The root filesystem, along with the
> entire OS, should be completely separated from your data directories. It
> should reside in a different logical volume, and thus you can easily
> change the OS while not changing the data volume. Not to mention that
> there are fancier options like snapshotting the data volume and thus
> having zero risk.
>
> Happy LVMing.
> Dor
>
> On Wed, May 17, 2017 at 12:51 AM, Shalom Sagges 
> wrote:
>
>> Our DevOPS team told me that their policy is not to perform major kernel
>> upgrades but simply install a clean new version.
>> I also checked online and found a lot of recommendations *not *to do so
>> as there might be a lot of dependencies issues that may affect processes
>> such as yum.
>> e.g.
>> https://www.centos.org/forums/viewtopic.php?t=53678
>> "The upgrade from CentOS 6 to 7 is a process that is fraught with danger
>> and very very untested. Almost no-one succeeds without extreme effort. The
>> CentOS wiki page about it has a big fat warning saying "Do not do this". If
>> at all possible you should do a parallel install, migrate your data, apps
>> and settings to the new box and decommission the old one.
>>
>> The problem comes about because there are a large number of packages in
>> el6 that already have a higher version number than those in el7. This means
>> that the el6 packages take precedence in the update and there are quite a
>> few orphans left behind and these break little things like yum. For
>> example, one that I know about is openldap which is
>> openldap-2.4.40-5.el6.x86_64 and openldap-2.4.39-6.el7.x86_64 so the el6
>> package is seen as newer than the el7 one. Anything that's linked against
>> openldap (a *lot*) now will not function until that package is replaced
>> with its el7 equivalent, The easiest way to do this would be to yum
>> downgrade openldap but, ooops, one of the things that needs openldap is
>> yum so it doesn't work."
>>
>>
>> I've also checked the Centos Wiki page and found the same recommendation:
>> https://wiki.centos.org/FAQ/General?highlight=%28upgrade%29%
>> 7C%28to%29%7C%28centos7%29#head-3ac1bdb51f0fecde1f98142cef90e887b1b12a00
>>  :
>>
>> *"Upgrades in place are not supported nor recommended by CentOS or TUV. A
>> backup followed by a fresh install is the only recommended upgrade path.
>> See the Migration Guide for more information."*
>>
>>
>> Since I have around twenty 2TB nodes in each DC (2 DCs in 6 different
>> farms) and I don't want it to take forever, perhaps the best way would be
>> to either leave it with Centos 6 and install Python 2.7 (I understand
>> that's not so user friendly) or perform the backup recommendations shown on
>> the Centos page (which sounds extremely agonizing as well).
>>
>> What do you think?
>>
>> Thanks!
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>>
>>
>>
>> On Tue, May 16, 2017 at 6:48 PM, daemeon reiydelle 
>> wrote:
>>
>>> What makes you think you cannot upgrade the kernel?
>>>
>>> “All men dream, but not equally. Those who dream by night in the dusty
>>> recesses of their minds wake up in the day to find it was vanity, but the
>>> dreamers of the day are dangerous men, for they may act their dreams with
>>> open eyes, to make it possible.” — T.E. Lawrence
>>>
>>> sent from my mobile
>>> Daemeon Reiydelle
>>> skype daemeon.c.m.reiydelle
>>> USA 415.501.0198
>>>
>>> On May 16, 2017 5:27 AM, "Shalom Sagges"  wrote:
>>>
>>>> Hi All,
>>>>
>>>> Hypothetically speaking, let&

Re: sstablesplit - status

2017-05-17 Thread Shalom Sagges
If you make all as 10gb each, they will compact immediately into same size
again.


The idea is actually to trigger the compaction so the tombstones will be
removed. That's the whole purpose of the split, and if the split sstable
has lots of tombstones, it'll be compacted to a much smaller size.
Also, you can always play with the compaction threshold to suit your needs.
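
For reference, a rough sketch of the split-then-compact idea (keyspace,
table and file names here are made up, and this is only an outline, not a
tested procedure; sstablesplit must be run while Cassandra is stopped on
that node):

    # Stop the node first: sstablesplit rewrites the files directly and
    # snapshots them beforehand unless --no-snapshot is passed.
    sudo service cassandra stop

    # Split one oversized SSTable into ~5 GB chunks (--size is in MB).
    sstablesplit --size 5120 \
        /var/lib/cassandra/data/my_ks/my_table-*/mc-1234-big-Data.db

    sudo service cassandra start

    # Optionally lower the SizeTiered min_threshold so the resulting
    # chunks are picked up for compaction (and tombstone purging) sooner.
    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
                'class': 'SizeTieredCompactionStrategy', 'min_threshold': 2 };"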


Shalom Sagges
DBA
T: +972-74-700-4035


On Wed, May 17, 2017 at 8:23 PM, Nitan Kainth  wrote:

> Right, but realistically that is what happens with SizeTiered. Another
> option is to split the table into proportional sizes, NOT the same size,
> e.g. 100 GB into 50, 25, 12 and 13. If you make all as 10gb each, they will
> compact immediately into same size again. The motive is to get rid of the
> duplicates which exist in smaller sstables outside this one big sstable (as
> per my understanding of your email).
>
> On May 17, 2017, at 12:20 PM, Hannu Kröger  wrote:
>
> Basically meaning that if you run major compaction (=nodetool compact),
> you will end up with even bigger file and that is likely to never get
> compacted without running major compaction again. And therefore not
> recommended for production system.
>
> Hannu
>
>
> On 17 May 2017, at 19:46, Nitan Kainth  wrote:
>
> You can try running major compaction to get rid of duplicate data and
> deleted data. But will be the routine for future.
>
> On May 17, 2017, at 10:23 AM, Jan Kesten  wrote:
>
> me patt
>
>
>
>
>



Re: Bootstrapping a Node With a Newer Version

2017-05-17 Thread Shalom Sagges
So you are not upgrading the kernel, you are upgrading the OS.

Sorry Daemeon, my bad. I meant the OS :-)
So what would you recommend: replace the node with a new-OS node
using -Dcassandra.replace_address (never tried it before), or try to format
the root filesystem of the existing node without touching the data
directory, which is on a different VG?
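
For what it's worth, the replace_address route on a wiped host usually
boils down to something like the sketch below (the IP and paths are
placeholders, not a tested procedure; the data and commitlog directories
must be empty, and it's generally safest to keep the node on the same
Cassandra version as the rest of the ring while it streams):

    # On the freshly reinstalled host, before the first start:
    echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.12"' \
        | sudo tee -a /etc/cassandra/cassandra-env.sh

    sudo service cassandra start
    nodetool netstats     # watch the bootstrap/streaming progress
    nodetool status       # the node should end up UN with the old tokens

    # Once it has rejoined, remove that line from cassandra-env.sh so
    # later restarts behave normally.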



Shalom Sagges
DBA
T: +972-74-700-4035


On Wed, May 17, 2017 at 8:24 PM, daemeon reiydelle 
wrote:

>
> So you are not upgrading the kernel, you are upgrading the OS. Not what
> you asked about. Your devops team is right.
>
> However, depending on what is using Python, the new version of Python may
> break older scripts (I do not know, mentioning this, testing required?).
> When I am doing an OS upgrade (and usually ditto with Hadoop), I add nodes
> to the cluster at the new OS/HDFS version, decommission old nodes, and
> repeat. The replication takes a bit, but zero downtime, etc. Since you
> don't have a lot of storage per node, I don't think you will have a lot of
> high network traffic impacting the performance of nodes.
>
>
>
>
>
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872
>
>
>
> On Wed, May 17, 2017 at 12:51 AM, Shalom Sagges 
> wrote:
>
>> Our DevOPS team told me that their policy is not to perform major kernel
>> upgrades but simply install a clean new version.
>> I also checked online and found a lot of recommendations *not *to do so
>> as there might be a lot of dependencies issues that may affect processes
>> such as yum.
>> e.g.
>> https://www.centos.org/forums/viewtopic.php?t=53678
>> "The upgrade from CentOS 6 to 7 is a process that is fraught with danger
>> and very very untested. Almost no-one succeeds without extreme effort. The
>> CentOS wiki page about it has a big fat warning saying "Do not do this". If
>> at all possible you should do a parallel install, migrate your data, apps
>> and settings to the new box and decommission the old one.
>>
>> The problem comes about because there are a large number of packages in
>> el6 that already have a higher version number than those in el7. This means
>> that the el6 packages take precedence in the update and there are quite a
>> few orphans left behind and these break little things like yum. For
>> example, one that I know about is openldap which is
>> openldap-2.4.40-5.el6.x86_64 and openldap-2.4.39-6.el7.x86_64 so the el6
>> package is seen as newer than the el7 one. Anything that's linked against
>> openldap (a *lot*) now will not function until that package is replaced
>> with its el7 equivalent, The easiest way to do this would be to yum
>> downgrade openldap but, ooops, one of the things that needs openldap is
>> yum so it doesn't work."
>>
>>
>> I've also checked the Centos Wiki page and found the same recommendation:
>> https://wiki.centos.org/FAQ/General?highlight=%28upgrade%29%
>> 7C%28to%29%7C%28centos7%29#head-3ac1bdb51f0fecde1f98142cef90e887b1b12a00
>>  :
>>
>> *"Upgrades in place are not supported nor recommended by CentOS or TUV. A
>> backup followed by a fresh install is the only recommended upgrade path.
>> See the Migration Guide for more information."*
>>
>>
>> Since I have around twenty 2TB nodes in each DC (2 DCs in 6 different
>> farms) and I don't want it to take forever, perhaps the best way would be
>> to either leave it with Centos 6 and install Python 2.7 (I understand
>> that's not so user friendly) or perform the backup recommendations shown on
>> the Centos page (which sounds extremely agonizing as well).
>>
>> What do you think?
>>
>> Thanks!
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>>
>>
>>
>> On Tue, May 16, 2017 at 6:48 PM, daemeon reiydelle 
>> wrote:
>>
>>> What makes you think you cannot upgrade the kernel?
>>>
>>> “All men dream, but not equally. Those who dream by night in the dusty
>>> recesses of their minds wake up in the day to find it was vanity, but the
>>> dreamers of the day are dangerous men, for they may a
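
As for daemeon's add-then-decommission suggestion quoted above, one
hypothetical iteration of the cycle (commands only, host-specific details
omitted) might look like:

    # On a brand-new host already installed with the new OS and Cassandra:
    # with auto_bootstrap left at its default (true), simply starting the
    # node makes it join the ring and stream its token ranges in.
    sudo service cassandra start
    nodetool status          # wait for the new node to reach UN

    # On the old node being retired:
    nodetool decommission    # streams its data to the remaining replicas
    nodetool netstats        # monitor the streaming until it completes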
