Re: strange behavior of counter tables after losing a node

2021-01-27 Thread Elliott Sims
To start with, maybe update to beta4.  There's an absolutely massive list of
fixes since alpha4.  I don't think the alphas are expected to be in a
usable/low-bug state necessarily, whereas beta4 is approaching RC status.

On Tue, Jan 26, 2021, 10:44 PM Attila Wind  wrote:

> Hey All,
>
> I'm coming back to my own question (see below), as this happened to us again
> two days later, so we took the time to analyse the issue further. I'd like to
> share our experiences and the workaround we figured out, too.
>
> So to just quickly sum up the most important details again:
>
> - we have a 3-node cluster - Cassandra 4.0-alpha4, RF=2 - in one DC
> - we use consistency level ONE for all queries
> - if we lose one node from the cluster, then
>    - non-counter table writes are fine, the remaining 2 nodes take over
>      everything
>    - but counter table writes start to fail with the exception
>      "com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra
>      timeout during COUNTER write query at consistency ONE (1 replica were
>      required but only 0 acknowledged the write)"
>    - the two remaining nodes are both producing hints files for the fallen one
> - just a note: counter_write_request_timeout_in_ms = 1,
>   write_request_timeout_in_ms = 5000 in our cassandra.yaml
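
For illustration, a minimal sketch of the kind of counter write that fails
above, using the DataStax Java driver 3.7.x mentioned later in the thread
(keyspace, table, and column names here are made up):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    public class CounterWriteExample {
        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("myks")) {
                // Counter columns can only be incremented/decremented, never set directly.
                Statement increment = new SimpleStatement(
                        "UPDATE page_counters SET views = views + 1 WHERE page_id = ?", 42L)
                        .setConsistencyLevel(ConsistencyLevel.ONE);
                // With RF=2 and CL=ONE this succeeds as long as one replica of the
                // partition acknowledges in time; the exception above means none did.
                session.execute(increment);
            }
        }
    }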
>
> To test this a bit further, we did the following:
>
> - we shut down one of the nodes normally
>   In this case we do not see the above behavior - everything happens as it
>   should, with no failures on counter table writes,
>   so this is good
> - we reproduced the issue in our TEST env by hard-killing one of the nodes
>   instead of shutting it down normally (simulating the hardware failure we
>   had in PROD)
>   Bingo, the issue starts immediately!
>
> Based on the above observations, the "normal shutdown - no problem" case
> gave us an idea - so now we have a workaround for getting the cluster back
> into a working state if we lose a node permanently (or at least for a long
> time):
>
>1. (in our case) we stop the App to stop all Cassandra operations
>2. stop all remaining nodes in the cluster normally
>3. restart them normally
>
> This way the remaining nodes realize the failed node is down and jump into
> the expected processing - everything works, including counter table writes.
>
> If anyone has any idea what to check / change / do in our cluster I'm all
> ears! :-)
>
> thanks
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +49 176 43556932
>
>
> On 22.01.2021 07:35, Attila Wind wrote:
>
> Hey guys,
>
> Yesterday we had an outage after we lost a node, and we saw behavior we
> cannot explain.
>
> Our data schema has both counter and normal tables. And we have
> replication factor = 2 and consistency level LOCAL_ONE (explicitly set).
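
For reference, one common way to set LOCAL_ONE explicitly as the default for
all queries with the DataStax Java driver 3.7.x (a minimal sketch; contact
point and keyspace are placeholders):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.QueryOptions;
    import com.datastax.driver.core.Session;

    public class LocalOneDefault {
        public static void main(String[] args) throws Exception {
            // Default consistency applied to every statement unless overridden per query.
            QueryOptions queryOptions = new QueryOptions()
                    .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withQueryOptions(queryOptions)
                    .build();
                 Session session = cluster.connect("myks")) {
                session.execute("SELECT release_version FROM system.local");
            }
        }
    }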
>
> What we saw:
> After a node went down, the updates of the counter tables slowed down. A
> lot! These updates normally take only a few milliseconds, but now started
> to take 30-60 seconds(!)
> At the same time, the write ops against non-counter tables did not show any
> difference. The app log was silent in terms of errors. So the queries -
> including the counter table updates - were not failing at all (otherwise we
> would see exceptions from the DAO layer, originating from the Cassandra
> driver).
> One more thing: only those updates where the lost node was involved (due to
> the partition key) suffered from the huge wait time. Other updates went
> through just fine.
>
> The whole thing looks like Cassandra internally started to wait - a lot -
> for the lost node. Updates finally succeeded without failure - at least
> from the App's (the client's) point of view.
>
> Has anyone ever experienced similar behavior?
> What could be an explanation for the above?
>
> Some more details: the App is implemented in Java 8, we are using the
> DataStax driver 3.7.1, and the server cluster is running Cassandra
> 4.0-alpha4. Cluster size is 3 nodes.
>
> Any feedback is appreciated! :-)
>
> thanks
> --
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +49 176 43556932
>
>
>


Re: strange behavior of counter tables after losing a node

2021-01-27 Thread Attila Wind
Thanks Elliott, yep! This is exactly what we also figured out as a next 
step: upgrade our TEST env to that so we can re-evaluate the test we did.

Makes 100% sense

Attila Wind

http://www.linkedin.com/in/attilaw 
Mobile: +49 176 43556932


On 27.01.2021 10:18, Elliott Sims wrote:
To start with, maybe update to beta4.  There's an absolutely massive 
list of fixes since alpha4.  I don't think the alphas are expected to 
be in a usable/low-bug state necessarily, whereas beta4 is approaching 
RC status.



Re: Setting DC in different geographical location

2021-01-27 Thread Elliott Sims
To start, I'd try to figure out what your slowdown is.  Surely GCP has far,
far more than 17 MB/s available.
You don't want to cut it close on this, because for stuff like repairs,
rebuilds, interruptions, etc. you'll want to be able to catch up and not
just keep up.
Generally speaking, Cassandra defers a lot of work, and if you get behind
when you're already at the limit of performance, it's going to deteriorate
badly.


Whether it's synchronous or asynchronous will depend on the consistency
level used for the write (ALL, LOCAL_QUORUM, etc.).
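
For example, a minimal sketch (DataStax Java driver 3.x; keyspace, table, and
addresses are made up) of how the consistency level decides what the client
waits for:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class CrossDcWrites {
        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
                 Session session = cluster.connect("myks")) {
                // LOCAL_QUORUM: acknowledged once a quorum in the coordinator's own DC
                // has written; the remote-DC replicas get the mutation in the background,
                // so the cross-DC link does not add to client latency.
                session.execute(new SimpleStatement(
                        "INSERT INTO events (id, payload) VALUES (uuid(), 'x')")
                        .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM));
                // EACH_QUORUM (or ALL): the client also waits on the remote DC, so the
                // US-India round trip lands on the write path.
                session.execute(new SimpleStatement(
                        "INSERT INTO events (id, payload) VALUES (uuid(), 'x')")
                        .setConsistencyLevel(ConsistencyLevel.EACH_QUORUM));
            }
        }
    }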

On Tue, Jan 26, 2021 at 6:37 PM MyWorld  wrote:

> Hi,
> We have a cluster with one Data Center of 3 nodes in GCP-US (RF=3). The
> current Apache Cassandra version is 3.11.6. We are planning to add one new
> Data Center of 3 nodes in GCP-India.
>
> At peak hours, commit log generation on the GCP-US side is around 1 GB per
> minute on one node (i.e. 17+ MB/s).
>
> Currently the file transfer speed from GCP US to India is 9 mbps.
>
> So, with this speed, is it possible for Cassandra to perform asynchronous
> writes to the new DC (India)?
> Also, is there any compression technique that Cassandra applies while
> transferring data across DCs?
>
> *My assumption*: All 3 coordinator nodes in the US will be responsible for
> transferring 1/3rd of the data to the new DC. So, at peak time, each node
> only has to sync 1 GB / 3 per minute.
> Please let me know if my assumption is right. If yes, what will happen if
> the commit log data generated per node increases to 3 GB per minute tomorrow?
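
A back-of-the-envelope sketch of that assumption (it treats the 1 GB/minute
commit log on each US node as roughly the whole cluster's write volume, since
RF=3 across 3 nodes, and assumes roughly one copy of each mutation is
forwarded to the new DC):

    public class CrossDcBandwidthEstimate {
        public static void main(String[] args) {
            double totalGbPerMinute = 1.0;                                 // ~cluster-wide writes
            double perCoordinatorMBps = totalGbPerMinute * 1024 / 3 / 60;  // ~5.7 MB/s each
            double perCoordinatorMbps = perCoordinatorMBps * 8;            // ~46 Mbit/s each
            System.out.printf("~%.1f MB/s (~%.0f Mbit/s) of cross-DC traffic per US node%n",
                    perCoordinatorMBps, perCoordinatorMbps);
        }
    }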
>
> Regards,
> Ashish
>
>


Table metrics grid isn't showing in the apache cassandra documentation

2021-01-27 Thread Carl Mueller
https://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics

I checked with Chrome and Firefox


Re: Use NetworkTopologyStrategy for single data center and add data centers later

2021-01-27 Thread Carl Mueller
Yes, perform that as soon as possible.

When you add a new datacenter, keyspaces that use SimpleStrategy (don't
forget about system_traces and system_distributed) won't work.
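
A minimal sketch of both steps (Java driver; the keyspace and DC names come
from the question below, everything else is a placeholder, and the DC names
must match what your snitch reports). Note that ALTER KEYSPACE replaces the
whole replication map, so dc1 has to be restated when dc2 is added, and a
repair (or nodetool rebuild on the new DC) is still needed afterwards:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class KeyspaceReplication {
        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect()) {
                // Single-DC today: use NetworkTopologyStrategy from the start.
                session.execute("ALTER KEYSPACE \"Excalibur\" WITH REPLICATION = "
                        + "{'class': 'NetworkTopologyStrategy', 'dc1': 3}");
                session.execute("ALTER KEYSPACE system_traces WITH REPLICATION = "
                        + "{'class': 'NetworkTopologyStrategy', 'dc1': 3}");
                session.execute("ALTER KEYSPACE system_distributed WITH REPLICATION = "
                        + "{'class': 'NetworkTopologyStrategy', 'dc1': 3}");
                // Later, once dc2 has joined: restate dc1 and add dc2, then run
                // nodetool rebuild on the dc2 nodes / repair as appropriate.
                session.execute("ALTER KEYSPACE \"Excalibur\" WITH REPLICATION = "
                        + "{'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2}");
            }
        }
    }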

On Sat, Dec 19, 2020 at 12:38 PM Aaron Ploetz  wrote:

> Yes, you absolutely can (and should) use NetworkTopologyStrategy with a
> single data center.  In fact, I would argue that SimpleStrategy should
> almost never be used.  But that's just me.
>
> Thanks,
>
> Aaron
>
>
> On Sat, Dec 19, 2020 at 3:21 AM Manu Chadha 
> wrote:
>
>> Is it possible to use NetworkTopologyStrategy when creating a keyspace and
>> add data centers later?
>>
>> I am just starting with an MVP application and I don't expect much
>> traffic or data. Thus I have created only one data center. However, I'd
>> like to add more data centers later if needed.
>>
>> I notice that the replication factor for each data center needs to be
>> specified at the time of keyspace creation
>>
>> CREATE KEYSPACE "Excalibur"
>>
>>   WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};
>>
>> As I only have dc1 at the moment, could I just do
>>
>> CREATE KEYSPACE "Excalibur"
>>
>>   WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};
>>
>> and when I have another datacenter say dc2, could I edit the Excalibur
>>  keyspace?
>>
>> ALTER KEYSPACE "Excalibur"
>>
>>   WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc2' : 2};
>>
>>
>>
>> Or can I start with SimpleStrategy now and change to
>> NetworkTopologyStrategy later? I suspect this might not work, as I think
>> this requires changing the snitch, etc.
>>
>>
>>
>>
>>
>> Sent from Mail  for
>> Windows 10
>>
>>
>>
>


Re: understand bootstrapping

2021-01-27 Thread Yifan Cai
Your thoughts regarding Gossip are correct. There could be a time when
nodes in the cluster hold different views of the ring locally.

In the case of bootstrapping,
1. The joining node updates its status to BOOT before streaming data and
waits for a certain delay in order to propagate the update through the
cluster.
2. The replica nodes calculate the pending ranges, and the joining node can
receive new writes for those token ranges while data streaming is ongoing.
3. When the other nodes learn via gossip (after some delay) that the joining
node has finished bootstrapping, the new node starts to serve reads as well.

(I was wrong earlier about it not getting any client traffic. The joining
node accepts writes, but not reads.)

- Yifan

On Tue, Jan 26, 2021 at 11:36 AM Han  wrote:

>
>>> I'm particularly trying to understand the fault-tolerant part of
>>> updating Token Ring state on every node
>>
>> The new node only joins the ring (updates the ring's state) when the data
>> streaming (bootstrapping) is successful. Otherwise, the existing ring
>> remains as is, the joining node remains in the JOINING state, and it won't
>> get any client traffic. If I understand the question correctly.
>>
>
> Thanks Yifan for your response.
>
> The ring state update is propagated via gossip, so it is "eventually
> consistent". This means there is a time period when some existing node has
> the new token ring but other existing nodes still have the old ring info,
> right? Is it true that other operations (e.g. replication) are still going
> on between nodes, even when their local token ring info is not consistent?
>
> From the new node's point of view, even when it is successful, at the end
> of `StorageService::joinTokenRing` it is possible that some existing nodes
> have not yet updated their token ring - is this correct?
>
> Thanks
> Han
>
>
>
>>
>> Hopefully, the answers help.
>>
>> - Yifan
>>
>> On Sun, Jan 24, 2021 at 1:00 PM Han  wrote:
>>
>>> Hi,
>>>
>>> I wanted to understand how bootstrapping (adding a new node) works. My
>>> understanding is that the first step is Token Allocation, and the new node
>>> will get a number of tokens.
>>>
>>> My question is:
>>>
>>> How / when do the existing nodes update their Token Ring state? And is
>>> that different between seed nodes and non-seed nodes?
>>>
>>> I'm particularly trying to understand the fault-tolerant part of
>>> updating Token Ring state on every node, but couldn't find relevant info by
>>> searching.
>>>
>>> Any info or pointers are appreciated.
>>>
>>> Thanks!
>>> Han
>>>
>>>


Re: Table metrics grid isn't showing in the apache cassandra documentation

2021-01-27 Thread Yifan Cai
Thanks for reporting! The missing table, defined in the source file
`doc/source/operating/metrics.rst`, is invalid and hence not rendered.

I just filed a ticket for it.
https://issues.apache.org/jira/browse/CASSANDRA-16410

- Yifan

On Wed, Jan 27, 2021 at 4:58 PM Carl Mueller
 wrote:

>
> https://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics
>
> I checked with Chrome and Firefox
>