Question About Reaper

2018-05-19 Thread Surbhi Gupta
Hi, We have a cluster with 144 nodes (3 datacenters) with 256 vnodes. When we tried to start repairs from OpsCenter, it showed 1.9 million ranges to repair. Even after setting compaction throughput and stream throughput to 0, OpsCenter is not able to help us much to finish the repair within a 9-day timeframe.
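
For a sense of scale, the figures quoted in the message above can be put side by side. This is only arithmetic on those numbers, not a description of how OpsCenter counts ranges: the assumption that every node owns exactly 256 tokens, and that the 1.9 million figure is made of subranges split from those token ranges, is mine and is not stated in the thread.

```python
# Rough arithmetic on the figures quoted in the message above.
NODES = 144
VNODES_PER_NODE = 256
REPORTED_RANGES = 1_900_000   # "1.9 million", approximate
REPAIR_WINDOW_DAYS = 9


def token_ranges(nodes: int, vnodes: int) -> int:
    """Token ranges in the ring if every node owns `vnodes` tokens (assumption)."""
    return nodes * vnodes


def seconds_per_range(ranges: int, days: int) -> float:
    """Time budget per range if the ranges are repaired one after another."""
    return days * 24 * 60 * 60 / ranges


ring_ranges = token_ranges(NODES, VNODES_PER_NODE)
print("token ranges in the ring:", ring_ranges)
print("reported ranges per token range:", round(REPORTED_RANGES / ring_ranges, 1))
print("seconds available per reported range:",
      round(seconds_per_range(REPORTED_RANGES, REPAIR_WINDOW_DAYS), 2))
```

Under these assumptions, a strictly sequential repair would have well under a second per range, which is consistent with the difficulty of fitting the repair into the window described in the message.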

Re: Question About Reaper

2018-05-20 Thread Surbhi Gupta
ar in 1 dc only, we can actually have separate reaper >> instance handling separate dc but haven't tested it yet. >> >> >> On Sunday, May 20, 2018, Surbhi Gupta wrote: >> >>> Hi, >>> >>> We have a cluster with 144 nodes (3 datacenters) with 256

Re: Question About Reaper

2018-05-21 Thread Surbhi Gupta
but it will have cascading effects in cpu and memory > consumption. > So test well. > > > On Monday, May 21, 2018, Surbhi Gupta wrote: > >> Thanks a lot for your inputs, >> Abdul, how did u tune reaper? >> >> On Sun, May 20, 2018 at 10:10 AM Jonathan Hadda

Re: Question About Reaper

2018-05-21 Thread Surbhi Gupta
odes, and is available with Cassandra 2.2 and onwards (the > improvement is especially beneficial with Cassandra 3.0+ as such token > ranges will be repaired in a single session). > > We have a gitter that you can join if you want to ask questions. > > Cheers, > > L

Re: Question About Reaper

2018-05-24 Thread Surbhi Gupta
beneficial with Cassandra 3.0+ as such token >>> ranges will be repaired in a single session). >>> >>> We have a gitter that you can join if you want to ask questions. >>> >>> Cheers, >>> >>> Le lun. 21 mai 2018 à 15:29, Surbhi Gupta a >

Re: Question About Reaper

2018-05-24 Thread Surbhi Gupta
Another question: we use port 9142 for cqlsh in one of the datacenters and port 9042 in the other. How should we configure this? On 24 May 2018 at 10:22, Surbhi Gupta wrote: > What is the impact of > PARALLEL - all replicas at the same time? > Will it make repair faster,

Re: Question About Reaper

2018-05-24 Thread Surbhi Gupta
(X509TrustManagerImpl.java:281) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1501) ... 20 common frames omitted Any thought? On 24 May 2018 at 10:35, Surbhi Gupta wrote: > Another quest

Re: Question About Reaper

2018-05-24 Thread Surbhi Gupta
efaultManagersHolder$2.run(SSLContextImpl.java:823) On 24 May 2018 at 14:12, Dennis Lovely wrote: > looks like you're connecting to a service listening on SSL but you don't > have the CA used in your truststore > > On Thu, May 24, 2018 at 1:58 PM, Surbhi Gupta > wro

Re: Log application Queries

2018-05-25 Thread Surbhi Gupta
If you are using DSE, you can enable it in dse.yaml. # CQL slow log settings cql_slow_log_options: enabled: true threshold_ms: 0 ttl_seconds: 259200 As far as I understand, setlogginglevel is used for changing the logging level, as below, but not for logging slow queries. - ALL - TRACE - DEBUG

Re: Log application Queries

2018-05-25 Thread Surbhi Gupta
ogging in system_traces. > > How is it different from nodetool setlogginglevel? > > > Regards, > Nitan K. > Cassandra and Oracle Architect/SME > Datastax Certified Cassandra expert > Oracle 10g Certified > > On Fri, May 25, 2018 at 11:41 AM, Surbhi Gupta > wro

Re: nodetool rebuild

2018-09-12 Thread Surbhi Gupta
Increase three throughput settings: compaction throughput, stream throughput, and inter-DC stream throughput (if rebuilding from another DC). Set all of the above to 0 (unthrottled) and see if there is any improvement, and later set specific values if you can't leave them at 0. On Wed, Sep 12, 2018 at 5:42 AM Vitali Dyachuk wrote:

Re: Read timeouts when performing rolling restart

2018-09-12 Thread Surbhi Gupta
Another thing to notice is : system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} system_auth has a replication factor of 1 and even if one node is down it may impact the system because of the replication factor. On Wed, 12 Sep 2018 at 09:46, Steinmaurer, Thoma

Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Surbhi Gupta
nodetool repair will take far more time than nodetool rebuild. How much data do you have in your original data center? Repair should be run to make the data consistent when a node has been down longer than the hinted handoff window or when mutations were dropped. But as a rule of thumb, we generally run repair using OpsCenter

Re: Cassandra | Cross Data Centre Replication Status

2018-10-31 Thread Surbhi Gupta
Repair will take far more time than rebuild. On Wed, Oct 31, 2018 at 6:45 AM Kiran mk wrote: > Run the repair with -pr option on each node which will repair only the > > partition range. > > > > nodetool repair -pr > > On Wed, Oct 31, 2018 at 7:04 PM Surbhi Gupta

Re: can i...

2019-03-07 Thread Surbhi Gupta
Send the details On Thu, Mar 7, 2019 at 8:45 AM Nick Hatfield wrote: > Use this email to get some insight on how to fix database issues in our > cluster? >

Re: Cassandra-stress testing

2019-08-20 Thread Surbhi Gupta
Have you tried YCSB? It is a tool from Yahoo for stress testing NoSQL databases. On Tue, Aug 20, 2019 at 3:34 AM wrote: > Hi Everyone, > > > > Has anyone used Cassandra-stress before? I want to test if it’s > possible to load 600 million records per hour in Cassandra or > > Find a better

Re: Alternative approach to setting up new DC

2016-04-22 Thread Surbhi Gupta
Why don't you use nodetool rebuild? On 21 April 2016 at 09:16, Jan wrote: > Jens; > > I am unsure that you need to enable Replication & also use the sstable > loader. > You could load the data into the new DC and subsequently alter the > keyspace to replicate from the older DC. > > Cheers > Jan

Changing a cluster name

2016-06-28 Thread Surbhi Gupta
system.local uses local strategy . You need to update on all nodes . On 28 June 2016 at 14:51, Tyler Hobbs > wrote: > First, make sure that you call nodetool flush after modifying the system > table. That's probably why it's not surviving the restart. > > Second, I believe you will have to do t

Re: compactions stuck and restart doesn't kill it

2016-08-08 Thread Surbhi Gupta
Once you restart a node, compaction will start automatically. If you don't want that, run nodetool disableautocompaction as soon as the node is started. On 8 August 2016 at 07:22, John Wong wrote: > Hi > > We have a compaction stuck. No progress ever made. > > nodetool compactionstats > pending

Re: compactions stuck and restart doesn't kill it

2016-08-08 Thread Surbhi Gupta
AM, Surbhi Gupta > wrote: > >> Once you restart a node compaction will start automatically, if u dont >> want to do so then do >> nodetool disableautocompaction as soon as node is started . >> >> > Thanks. I certainly can give that a try for the specific co

Re: compactions stuck and restart doesn't kill it

2016-08-08 Thread Surbhi Gupta
or. Is it safe to > run this? > > Thanks. > > On Mon, Aug 8, 2016 at 1:18 PM, Surbhi Gupta > wrote: > >> Can you see if any of the sstable is corrupt ? >> I have seen with my past experience , if any of the sstable which is >> part of the compaction is

Re: New data center to an existing cassandra cluster

2016-08-27 Thread Surbhi Gupta
Yes, there will be issues while the new nodes are being built. So it is always advised to use LOCAL_QUORUM instead of QUORUM and LOCAL_ONE instead of ONE. On 27 August 2016 at 09:45, laxmikanth sadula wrote: > Hi, > > I'm going to add a new data center DC3 to an existing cassandra cluster > w
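
The advice above can be illustrated with the usual quorum arithmetic (a majority of replicas, i.e. floor(RF / 2) + 1). The sketch below is a minimal illustration, not taken from the thread: the data center names other than DC3 and the replication factor of 3 per data center are hypothetical.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas`."""
    return replicas // 2 + 1


def quorum_size(rf_by_dc: dict) -> int:
    """Replicas that must answer at QUORUM: a majority across all data centers."""
    return quorum(sum(rf_by_dc.values()))


def local_quorum_size(rf_by_dc: dict, local_dc: str) -> int:
    """Replicas that must answer at LOCAL_QUORUM: a majority in the local data center only."""
    return quorum(rf_by_dc[local_dc])


# Hypothetical layout: RF 3 per data center (an assumption for illustration).
before = {"DC1": 3, "DC2": 3}
after = {**before, "DC3": 3}   # keyspace altered to replicate to the new DC3

print("QUORUM before/after:", quorum_size(before), quorum_size(after))
print("LOCAL_QUORUM in DC1 before/after:",
      local_quorum_size(before, "DC1"), local_quorum_size(after, "DC1"))
```

In this sketch the QUORUM requirement grows once DC3 is part of the replication settings, and replicas in DC3 (which may still be empty while the new nodes are building) can be counted toward it, whereas the LOCAL_QUORUM requirement in an existing data center does not change.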

Re: Re : Cluster performance after enabling SSL

2016-09-13 Thread Surbhi Gupta
We have seen a little overhead in latencies while enabling the client_encryption. Our cluster gets around 40-50K reads and writes per second. On 13 September 2016 at 12:01, sai krishnam raju potturi < pskraj...@gmail.com> wrote: > hi; > will enabling SSL (node-to-node) cause an overhead in the

Re: Where to change the datacenter name?

2016-10-10 Thread Surbhi Gupta
The data center name lives in one of two files. If you are using GossipingPropertyFileSnitch as the snitch in cassandra.yaml, the data center name is in cassandra-rackdc.properties. If you are using PropertyFileSnitch in cassandra.yaml, the data center name is in the cassandra-topology.properties fi

Re: Thousands of SSTables generated in only one node

2016-10-25 Thread Surbhi Gupta
We have seen this issue while using LCS: around 100K sstables were generated, compactions were not able to catch up, and the node became unresponsive. The reason was that one of the sstables got corrupted and compaction was hanging on that sstable while further sstables were flush

Re: Keyspace/CF creation Timeouts

2016-10-25 Thread Surbhi Gupta
1. Make sure all nodes are up and running while you are trying to create the Keyspaces and Column Family. 2. What is the write consistency level u r using? On 25 October 2016 at 13:18, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > Hello All, > > I have recently started noticing tim

Re: Keyspace/CF creation Timeouts

2016-10-25 Thread Surbhi Gupta
5, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada < > jaibheem...@gmail.com> wrote: > >> 1. Yes, all nodes are up and running, >> 2. We are using the Local_QUORUM. >> >> On Tue, Oct 25, 2016 at 1:28 PM, Surbhi Gupta >> wrote: >> >>> 1. Make sure

Re: Keyspace/CF creation Timeouts

2016-10-25 Thread Surbhi Gupta
S > and 3 CF. > > write_request_timeout_in_ms = 1 -> 10 seconds > > On Tue, Oct 25, 2016 at 3:00 PM, Surbhi Gupta > wrote: > >> As you have many keyspaces and column family to be created that might be >> the reason that within the stipulated time response is not coming back

Re: Priority for cassandra nodes in cluster

2016-11-12 Thread Surbhi Gupta
If u ask conceptually, it is possible but not recommended. If u really want to do it use the initial token setting and provide the broad range to the nodes where u want more data. But u need to understand about the replication factor consideration, if u keep rf as 3 on a 3 node cluster that means a

Re: Cassandra Node Restart Stuck in STARTING?

2016-11-16 Thread Surbhi Gupta
Attaching the system.log can give more details ... On 16 November 2016 at 11:05, Daniel Subak wrote: > Hey everyone, > > Ran into an issue running a node restart where "nodetool netstats" > reported the node as "STARTING" with no streams when run locally. "nodetool > status" run on other nodes r

Re: How to find total data size of a keyspace.

2017-02-28 Thread Surbhi Gupta
nodetool status keyspace_name. On Tue, Feb 28, 2017 at 4:53 AM anuja jain wrote: > Hi, > Using nodetool cfstats gives me data size of each table/column family and > nodetool ring gives me load of all keyspace in cluster but I need total > data size of one keyspace in the cluster. How can I get

Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Surbhi Gupta
If node is down use : nodetool removenode We have to run the below command when the node is down & if the cluster does not use vnodes, before running the nodetool removenode command, adjust the tokens. If the node is up, then the command would be “nodetool decommission” to remove the node. Rem

Re: Can't connect to Cassandra server

2015-07-23 Thread Surbhi Gupta
What is the output you are getting if you are issuing nodetool status command ... On 23 July 2015 at 11:30, Chamila Wijayarathna wrote: > Hi Peer, > > I changed cassandra-env.sh and following are the parameters I used,' > > MAX_HEAP_SIZE="8G" > HEAP_NEWSIZE="1600M" > > But I am still unable to s

Validation of Data after data migration from RDBMS to Cassandra

2015-08-06 Thread Surbhi Gupta
Hi, We have to migrate data from Oracle/MySQL to Cassandra. I wanted to understand whether there is any tool/utility which can help in validating the data after the migration to Cassandra. Thanks Surbhi

Re: Error Connecting to Cassandra

2015-10-28 Thread Surbhi Gupta
Are you running a heavy load? I have seen application teams report this kind of error when they already have too many connections set up and are trying to connect more applications. Try to disconnect the applications which are not required and try again. Hope this helps... On 28

Re: Error Connecting to Cassandra

2015-10-28 Thread Surbhi Gupta
ing a simple python application that isn’t heavy from point of > view of access to cassandra. The application create a new keyspace, tables > and do the load of data. > The application is an example in python-driver folder. > > > On 29 Oct 2015, at 00:33, Surbhi Gupta wrote: >

Re: Re : will Unsafeassaniate a dead node maintain the replication factor

2015-10-31 Thread Surbhi Gupta
You have to do a few things before unsafeassassinate. First run nodetool decommission if the node is up and wait until streaming finishes. You can check whether streaming is completed with nodetool netstats. If streaming is completed, you can do unsafeassassinate. To answer your question

Re: Re : Unable to bootstrap a new node

2015-10-31 Thread Surbhi Gupta
Please send both the yaml file I.e. Seed and new node. Sent from my iPhone > On Oct 31, 2015, at 5:55 AM, sai krishnam raju potturi > wrote: > > hi; >we were trying to add in a new node to the cluster. It fails during the > bootstrap process unable to gossip with seed nodes. We have n

Re: Re : will Unsafeassaniate a dead node maintain the replication factor

2015-10-31 Thread Surbhi Gupta
d make sure the replication of 3 > is maintained? > > >> On Sat, Oct 31, 2015, 11:14 Surbhi Gupta wrote: >> You have to do a few things before unsafeassassinate. First run >> nodetool decommission if the node is up and wait until streaming finishes. >>

Re: Re : will Unsafeassaniate a dead node maintain the replication factor

2015-10-31 Thread Surbhi Gupta
So have you already done unsafe assassination ? On 31 October 2015 at 08:37, sai krishnam raju potturi wrote: > it's dead; and we had to do unsafeassassinate as other 2 methods did not > work > > On Sat, Oct 31, 2015 at 11:30 AM, Surbhi Gupta > wrote: > >> Whether

Re: Re : will Unsafeassaniate a dead node maintain the replication factor

2015-10-31 Thread Surbhi Gupta
Is the cluster using vnodes? Sent from my iPhone > On Oct 31, 2015, at 9:16 AM, sai krishnam raju potturi > wrote: > > yes Surbhi. > >> On Sat, Oct 31, 2015 at 12:10 PM, Surbhi Gupta >> wrote: >> So have you already done unsafe assassination ? >>

Re: Re : will Unsafeassaniate a dead node maintain the replication factor

2015-10-31 Thread Surbhi Gupta
If it is using vnodes then just run nodetool repair . It should fix the issue related to data if any. And then run nodetool cleanup Sent from my iPhone > On Oct 31, 2015, at 3:12 PM, sai krishnam raju potturi > wrote: > > yes Surbhi. > >> On Sat, Oct 31, 2015 at

discrepancy in up nodes from different nodes

2016-03-19 Thread Surbhi Gupta
Hi, I have changed endpoint_snitch from Simple to GossipingPropertyFileSnitch, and changed the cassandra-rackdc.properties file to reflect the correct DC and RACK. However, when I did a rolling restart, one node is showing 15 nodes up, another node is showing 10 nodes up, etc. I have done rolling

antlr-runtime-3.2.jar is turning into 0 bytes and dse is going down

2017-04-05 Thread Surbhi Gupta
Hi, We have single node instance where we have cassandra , mysql and application running at the same node for developers. We are at dse 4.8.9 and dse is going down after sometime . What we have noticed is that few of the jar at /usr/share/dse/common are turning into 0 bytes. Jars are as follows:

Re: Unsuccessful back-up and restore with differing counts

2017-05-13 Thread Surbhi Gupta
Below link has the method u r looking for http://datascale.io/cloning-cassandra-clusters-fast-way/ On Sat, May 13, 2017 at 9:49 AM srinivasarao daruna wrote: > I am using vnodes. Is there a documenation that you can suggest to > understand how to assign same tokens in new cluster.? I will try it

How do you do automatic restacking of AWS instance for cassandra?

2017-05-25 Thread Surbhi Gupta
Hi, Wanted to understand, how do you do automatic restacking of cassandra nodes on AWS? Thanks Surbhi

Re: How do you do automatic restacking of AWS instance for cassandra?

2017-05-27 Thread Surbhi Gupta
9872 <+44%2020%208144%209872>* >> >> >> *“All men dream, but not equally. Those who dream by night in the dusty >> recesses of their minds wake up in the day to find it was vanity, but the >> dreamers of the day are dangerous men, for they may act their dreams with

Re: How do you do automatic restacking of AWS instance for cassandra?

2017-05-27 Thread Surbhi Gupta
We get the new AMI release with the new OS updates and we are not allowed to use the old AMI . On Sat, May 27, 2017 at 7:11 PM Jeff Jirsa wrote: > > > > > On 2017-05-27 18:04 (-0700), Surbhi Gupta > wrote: > > > Thanks a lot for all of your reply. > > > Ou

Re: Tool to manage cassandra

2017-06-16 Thread Surbhi Gupta
If u are using dse then u can use opscenter On Fri, Jun 16, 2017 at 6:01 AM Ram Bhatia wrote: > Hi > > > > > > > > > > May I know, if there a tool similar to Oracle Enterprise Manager for > managing Cassandra ? > > > > > > > > > > Thank you in advance for your help, > > > > > Ram Bhatia > > > >

Re: Cassandra crashes....

2017-08-22 Thread Surbhi Gupta
16GB heap is too small for G1GC . Try at least 32GB of heap size On Tue, Aug 22, 2017 at 7:58 AM Fay Hou [Storage Service] ­ < fay...@coupang.com> wrote: > What errors do you see? > 16gb of 256 GB . Heap is too small. I would give heap at least 160gb. > > > On Aug 22, 2017 7:42 AM, "Thakrar, Jayes

How to use nodetool ring only for one data center

2015-04-28 Thread Surbhi Gupta
Hi, I wanted to know how we can get the token ring information for only one data center when using vnodes and multiple data centers. Thanks Surbhi

Re: How to use nodetool ring only for one data center

2015-04-28 Thread Surbhi Gupta
'll script using grep to remove the unwanted data > > Rahul > > > On Apr 28, 2015, at 7:24 PM, Surbhi Gupta > wrote: > > > > Hi, > > > > I wanted to know, how can we get the information of the token rings only > for one data centers when using vnodes and multiple data center. > > > > Thanks > > Surbhi >

Case Study from Migrating from RDBMS to Cassandra

2014-07-22 Thread Surbhi Gupta
Hi, Does anybody have a case study for migrating from RDBMS to Cassandra? Thanks

Re: Case Study from Migrating from RDBMS to Cassandra

2014-07-22 Thread Surbhi Gupta
com/relational-database-to-nosql > > > > On Tue, Jul 22, 2014 at 7:45 PM, Surbhi Gupta > wrote: > >> Hi, >> >> Does anybody has the case study for Migrating from RDBMS to Cassandra ? >> >> Thanks >> > >

Re: Cassandra is not showing a node up hours after restart

2019-11-24 Thread Surbhi Gupta
It sounds silly, but sometimes restarting the node that the other nodes show as down fixes the issue. This looks like a gossip issue. On Sun, Nov 24, 2019 at 7:19 AM Paul Mena wrote: > I am in the process of doing a rolling restart on a 4-node cluster running > Cassandra 2.1.9. I stopped

Re: Cassandra is not showing a node up hours after restart

2019-11-24 Thread Surbhi Gupta
Before Cassandra shutdown, nodetool drain should be executed first. As soon as you do nodetool drain, other nodes will see this node as down and no new traffic will come to this node. I generally leave a 10-second gap between nodetool drain and stopping Cassandra. On Sun, Nov 24, 2019 at 9:52 AM Paul Mena

How to read content of hints file and apply them manually?

2020-01-27 Thread Surbhi Gupta
Hi, We are on open source 3.11. We have an issue in one of the clusters where lots of hints pile up and they don't get applied within the hinted handoff period (3 hours in our case). The load and CPU of the server go very high. We see a lot of messages in system.log and debug.log. Our read re

Re: How to read content of hints file and apply them manually?

2020-01-27 Thread Surbhi Gupta
to the underlying issue. Run a > full repair. > On Monday, January 27, 2020, 10:17:01 p.m. UTC, Surbhi Gupta < > surbhi.gupt...@gmail.com> wrote: > > > Hi, > > We are on Open source 3.11 . > We have a issue in one of the cluster where lots of hints gets piled up > and th

Re: How to read content of hints file and apply them manually?

2020-01-27 Thread Surbhi Gupta
We tried to tune sethintedhandoffthrottlekb to 100 , 1024 , 10240 but nothing helped . Our hints related parameters are as below, if you don't find any parameter below then it is not set in our environment and should be of the default value. max_hint_window_in_ms: 1080 # 3 hours hinted_handof
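
The value shown next to max_hint_window_in_ms in the snippet above appears to have lost digits in the archive, but the comment beside it says 3 hours. As a small sketch of the unit conversion only (the helper names are hypothetical, and the rule that hints stop being collected once a node has been down longer than the window is my summary, not something stated in the thread):

```python
MS_PER_HOUR = 60 * 60 * 1000


def hours_to_ms(hours: float) -> int:
    """Convert a hint window expressed in hours to milliseconds."""
    return int(hours * MS_PER_HOUR)


def hint_window_exceeded(downtime_hours: float, window_ms: int) -> bool:
    """True if a node was down longer than the hint window (hypothetical helper)."""
    return hours_to_ms(downtime_hours) > window_ms


window_ms = hours_to_ms(3)
print("3 hours in milliseconds:", window_ms)
print("2 h outage exceeds window:", hint_window_exceeded(2, window_ms))
print("5 h outage exceeds window:", hint_window_exceeded(5, window_ms))
```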

Re: How to read content of hints file and apply them manually?

2020-01-28 Thread Surbhi Gupta
put to write and read at > the same time? These are exactly the symptoms I see when running Cassandra > on a SAN or NAS. > > Patrick > > On Mon, Jan 27, 2020 at 8:17 PM Surbhi Gupta > wrote: > > We tried to tune sethintedhandoffthrottlekb to 100 , 1024 , 10240 but > no

Re: How to read content of hints file and apply them manually?

2020-01-28 Thread Surbhi Gupta
high cpu issue ? On Tue, Jan 28, 2020 at 1:12 PM Patrick McFadin wrote: > I would definitely check the IO stats then, If you see latency going over > 20ms, you need to solve that problem. > > Patrick > > On Tue, Jan 28, 2020 at 12:01 PM Surbhi Gupta > wrote: > >>

Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Hi, We have noticed in a Cassandra Cluster , one of the node has 100% cpu utilization, using top we can see that cassandra process is showing futex_wait . We are on CentOS release 6.10 (Final) .As per below document the futex bug was on Centos 6.6 . https://support.datastax.com/hc/en-us/articles

Re: Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Sure Eric... I tried strace as well ...

Re: Nodes becoming unresponsive

2020-02-06 Thread Surbhi Gupta
I have limited options to use JDK based tools because in our environment we are running JRE . I tried to debug more and could see using top that Command is MutationStage in top output , Any clue we get from this ? top - 16:30:47 up 94 days, 5:33, 1 user, load average: 134.83, 142.48, 144.75 Ta

Overload because of hint pressure + MVs

2020-02-07 Thread Surbhi Gupta
Hi, We are getting hit by the below bug. Other than lowering hinted_handoff_throttle_in_kb to 100 any other work around ? https://issues.apache.org/jira/browse/CASSANDRA-13810 Any idea if it got fixed in later version. We are on Open source Cassandra 3.11.1 . Thanks Surbhi

Re: Overload because of hint pressure + MVs

2020-02-09 Thread Surbhi Gupta
That JIRA still says Open, so no, it has not been fixed (unless there's >> a fixed duplicate in JIRA somewhere). >> >> For clarification, you could update that ticket with a comment including >> your environmental details, usage of MV, etc. I'll bump the priority up

Re: Overload because of hint pressure + MVs

2020-02-10 Thread Surbhi Gupta
even when we can see that all nodes are UP ? Recommended value of phi_convict_threshold is 12 in AWS multi datacenter environment. Thanks Surbhi On Sun, 9 Feb 2020 at 21:42, Surbhi Gupta wrote: > Thanks a lot Jon.. > Will try the recommendations and let you know the results > >

Re: Overload because of hint pressure + MVs

2020-02-10 Thread Surbhi Gupta
Just to add, we are using a 24GB heap. On Mon, 10 Feb 2020 at 09:08, Surbhi Gupta wrote: > Hi Jon, > > We are on a multi-datacenter (on-prem) setup. > We also noticed too many messages like below: > > DEBUG [GossipStage:1] 2020-02-10 09:38:52,953 FailureDetector.java:457 -

Re: Overload because of hint pressure + MVs

2020-02-11 Thread Surbhi Gupta
We are using G1 ... On Tue, 11 Feb 2020 at 08:51, Reid Pinchback wrote: > A caveat to the 31GB recommendation for G1GC. If you have tight latency > SLAs instead of throughput SLAs then this doesn’t necessarily pan out to be > beneficial. > > > > Yes the GCs are less frequent, but they can hurt mo

Consequences of dropping Materialized views

2020-02-12 Thread Surbhi Gupta
Hi, So application team created 11 materialized views on a base table in production and we need to drop 7 Materialized views as they are not in use. Wanted to understand the impact of dropping the materialized views. We are on Cassandra 3.11.1 , multi datacenter with replication factor of 3 in eac

Re: Consequences of dropping Materialized views

2020-02-12 Thread Surbhi Gupta
ami...@datastax.com | datastax.com <http://www.datastax.com> > <https://www.linkedin.com/company/datastax> > <https://www.facebook.com/datastax> <https://twitter.com/datastax> > <http://feeds.feedburner.com/datastax> <https://github.com/datastax/> > &g

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
https://support.datastax.com/hc/en-us/articles/36368126-Hints-file-with-unknown-CFID-can-cause-nodes-to-fail On Wed, 12 Feb 2020 at 19:10, Surbhi Gupta wrote: > Thanks Eric ... > This is helpful... > > > On Wed, 12 Feb 2020 at 17:46, Erick Ramirez > wrote: > >> There

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
> We now disallow the use of MVs globally. > > On Tue, Feb 18, 2020, 8:27 PM Surbhi Gupta > wrote: > >> We are on cassandra 3.11 , we are using G1GC and using 16GB of heap. >> >> So we had to drop 7 MVs in production, as soon as we dropped the first >> M

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
nd warn others. > > The version would have been 3.0.15 or 3.11.3 as that is what we were > deploying on our clusters at the time. I think it was more likely 3.0.15. > > So sorry for the "vagueness" :( > > On Tue, Feb 18, 2020, 8:54 PM Surbhi Gupta > wrote: > &g

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
Thanks Eric, Let me go back to the app team On Tue, Feb 18, 2020 at 6:49 PM Erick Ramirez wrote: > We are on cassandra 3.11 , we are using G1GC and using 16GB of heap. >> > > Which exact version of C* is it again? > >> WARN [MessagingService-Incoming-/10.X.X.X] 2020-02-18 14:21:47,115 >> Incomin

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
We are Cassandra 3.11.0 unfortunately :( On Tue, 18 Feb 2020 at 19:41, Erick Ramirez wrote: > Clearly the hint error invoked the fs error handler - probably incorrectly >> - which shut down the db. That’s not ok and deserves a JIRA. >> > > It's supposed to have been fixed by CASSANDRA-13696 in 3

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
3.11.0 in dev/test and prod . Just a thought On Tue, 18 Feb 2020 at 19:49, Surbhi Gupta wrote: > We are Cassandra 3.11.0 unfortunately :( > > On Tue, 18 Feb 2020 at 19:41, Erick Ramirez > wrote: > >> Clearly the hint error invoked the fs error handler - probably >>>

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
Hi Eric, As per https://issues.apache.org/jira/browse/CASSANDRA-13696 , this issue happens even with write traffic "I did more investigation today. Seems it's more serious than I thought. Even there's no down node, "drop table" + write traffic, will trigger the problem." Thanks Surbhi On Tue, 1

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
Just to add to my above point because here we are dropping MV not a regular table. And MV does read before write , Is this the reason we are seeing the below message? Trying to understand WARN [HintsDispatcher:6737] 2020-02-18 14:22:24,932 HintsReader.java:237 - Failed to read a hint for /10.X.X.

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
Thanks Eric... On Tue, 18 Feb 2020 at 22:06, Erick Ramirez wrote: > Just to add to my above point because here we are dropping MV not a >> regular table. >> And MV does read before write , Is this the reason we are seeing the >> below message? Trying to understand >> >> WARN [HintsDispatcher:67

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
So will upgrading to 3.11.1 solve this issue? On Tue, 18 Feb 2020 at 22:18, Surbhi Gupta wrote: > Thanks Eric... > > On Tue, 18 Feb 2020 at 22:06, Erick Ramirez > wrote: > >> Just to add to my above point because here we are dropping MV not a >>> regul

Re: Consequences of dropping Materialized views

2020-02-18 Thread Surbhi Gupta
The application team confirmed that they are *not* referencing the dropped MVs anywhere for reading or writing. On Tue, 18 Feb 2020 at 22:25, Surbhi Gupta wrote: > So will upgrading to 3.11.1 solve this issue? > > On Tue, 18 Feb 2020 at 22:18, Surbhi Gupta > wrote: >

Downgrading from 3.11.5 to 3.11.0

2020-03-04 Thread Surbhi Gupta
Hi, As the SSTable file format has changed to "md" as of 3.11.4 ( https://docs.datastax.com/en/landing_page/doc/landing_page/compatibility.html ), we are going to take snapshots but still wanted to understand: after we run upgradesstables when we upgrade from 3.11.0 to 3.11.5, and later in future i

Upgradesstables - PerSSTableIndexWriter.java:211 - Rejecting value

2020-03-05 Thread Surbhi Gupta
Hi, We are in the process of upgrading from 3.11.0 to 3.11.5. While upgrading SSTables we are noticing messages like below in system.log. What is the significance of these messages? INFO [CompactionExecutor:3] 2020-03-05 16:12:41,393 PerSSTableIndexWriter.java:211 - Rejecting value (size 1.022KiB,

Re: Upgradesstables - PerSSTableIndexWriter.java:211 - Rejecting value

2020-03-09 Thread Surbhi Gupta
We have a SASI index. Any solution? On Thu, 5 Mar 2020 at 15:20, Surbhi Gupta wrote: > Hi, > > We are in the process of upgrading from 3.11.0 to 3.11.5. > While upgrading SSTables we are noticing messages like below in system.log. > What is the significance of these mes

Re: Upgradesstables - PerSSTableIndexWriter.java:211 - Rejecting value

2020-03-09 Thread Surbhi Gupta
would like to understand the impact of this rejection. On Mon, 9 Mar 2020 at 08:47, Surbhi Gupta wrote: > We have SASI index . > Any solution ? > > On Thu, 5 Mar 2020 at 15:20, Surbhi Gupta > wrote: > >> Hi, >> >> We are in process of upgrading from 3.11.0

Re: Upgradesstables - PerSSTableIndexWriter.java:211 - Rejecting value

2020-03-09 Thread Surbhi Gupta
On Mon, 9 Mar 2020 at 09:36, Surbhi Gupta wrote: > > https://javadoc.io/static/org.apache.cassandra/cassandra-all/3.11.4/constant-values.html#org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.MAX_TERM_SIZE > The MAX_TERM_SIZE value is 1024 > > Can we change it ? > > > T
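
The numbers in this thread can be lined up in a small sketch: the log line reports a value of about 1.022KiB, and the quoted MAX_TERM_SIZE is 1024. The code below is not Cassandra's implementation; it only models a size check, and whether the real comparison rejects a term of exactly 1024 bytes is an assumption marked in the code.

```python
MAX_TERM_SIZE = 1024  # bytes, the value quoted in the thread


def is_rejected(term: bytes, max_term_size: int = MAX_TERM_SIZE) -> bool:
    """Model of the size check behind the 'Rejecting value' log line.

    Assumption: terms at or above the limit are rejected; the exact
    boundary behaviour is not confirmed by the thread.
    """
    return len(term) >= max_term_size


short_value = ("x" * 200).encode("utf-8")
long_value = ("x" * 1047).encode("utf-8")   # roughly 1.022 KiB

print(len(short_value), is_rejected(short_value))
print(len(long_value), is_rejected(long_value))
```

Values that fail the check would presumably be left out of the index for that SSTable, which is the impact the thread is asking about.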

OOM only on one datacenter nodes

2020-04-04 Thread Surbhi Gupta
Hi, We have two datacenter with 5 nodes each and have replication factor of 3. We have traffic on DC1 and DC2 is just for disaster recovery and there is no direct traffic. We are using 24cpu with 128GB RAM machines . For DC1 where we have live traffic , we don't see any issue, however for DC2 wher

Re: OOM only on one datacenter nodes

2020-04-05 Thread Surbhi Gupta
I just checked: we have set the heap size to 31GB, not 32GB, in DC2. I checked that the CPU and RAM are the same on all the nodes in DC1 and DC2. What specific parameter should I check on the OS? We are using CentOS release 6.10. Currently disk_access_mode is not set, hence it is auto in our env. Should

Re: OOM only on one datacenter nodes

2020-04-05 Thread Surbhi Gupta
unts objects by total retained size. Take a screenshot. > Send that. > > > > On Apr 5, 2020, at 6:51 PM, Surbhi Gupta wrote: > >  > I just checked, we have setup the Heapsize to be 31GB not 32GB in DC2. > > I checked the CPU and RAM both are same on all the nodes in DC1 a

Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
Hi, We are trying to expand a datacenter by adding nodes, but when a node is bootstrapping, it goes halfway through and then fails with the below error. We increased stream throughput from 200 to 400 when we tried for the 2nd time, but it still failed. We are on 3.11.0, using G1GC with

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
ion? > > What does the following show? > cat /proc/sys/net/ipv4/tcp_keepalive_time > cat /proc/sys/net/ipv4/tcp_keepalive_intvl > cat /proc/sys/net/ipv4/tcp_keepalive_probes > > On Thu, May 7, 2020 at 10:31 AM Surbhi Gupta > wrote: > >> Hi, >> >> We are
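
The three `cat` commands quoted above read the kernel's TCP keepalive settings. A minimal sketch of the same check in Python, assuming a Linux host that exposes these files; the function names are hypothetical, and the formula (idle time plus interval times probes) is the standard way to estimate how long an idle connection survives before the kernel drops an unresponsive peer:

```python
from pathlib import Path

KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive(base: Path = Path("/proc/sys/net/ipv4")) -> dict:
    """Read the three keepalive settings as integers, skipping files that do not exist."""
    values = {}
    for key in KEEPALIVE_KEYS:
        path = base / key
        if path.is_file():
            values[key] = int(path.read_text().strip())
    return values


def seconds_before_drop(values: dict) -> int:
    """Idle time before probing starts, plus the time taken by all failed probes."""
    return (values["tcp_keepalive_time"]
            + values["tcp_keepalive_intvl"] * values["tcp_keepalive_probes"])


if __name__ == "__main__":
    current = read_keepalive()
    if len(current) == len(KEEPALIVE_KEYS):
        print(current, "->", seconds_before_drop(current), "seconds")
    else:
        print("keepalive settings not available on this host")
```

With the common Linux defaults (7200, 75 and 9), the estimate comes to 7875 seconds, which is why long idle streaming connections are sometimes cut by firewalls before the kernel sends its first probe.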

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
ive_intvl=10* > then run sysctl -p to cause the kernel to reload the settings > > 5 minutes (300) seconds is probably too long. > > On Thu, May 7, 2020 at 1:09 PM Surbhi Gupta > wrote: > >> [root@abc cassandra]# cat /proc/sys/net/ipv4/tcp_keepalive_time >> >>

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
> On Thu, May 7, 2020 at 1:30 PM Surbhi Gupta > wrote: > >> streaming_socket_timeout_in_ms is 24 hour. >> So tcp settings should be changed on the new bootstrap node or on all >> nodes ? >> >> >> On Thu, 7 May 2020 at 13:23, Adam Scott wrote: >>

Re: Bootstraping is failing

2020-05-07 Thread Surbhi Gupta
detool/bootstrap.html) > to pick up where it last left off. Sorry for the late reply. > > > On Thu, May 7, 2020 at 2:22 PM Surbhi Gupta > wrote: > >> So after failed bootstrapped , if we start cassandra again on the new >> node , will it resume bootstrap or will it start

Re: Bootstraping is failing

2020-05-09 Thread Surbhi Gupta
mended. What I wanted to understand is how TCP settings can affect the bootstrapping process. Thanks Surbhi On Thu, 7 May 2020 at 17:01, Surbhi Gupta wrote: > When we are starting the node, it is starting bootstrap automatically and > restreaming the whole data again. It is not resuming.

Add a new node of 3.11.5 in a 3.11.0 Cassandra Cluster

2020-05-09 Thread Surbhi Gupta
Hi, We are facing some issue in bootstrapping new node in 3.11.0 and bootstrapping is failing. We have two tasks here : 1. Expand the cluster (Due to disk concern and dropped mutation) 2. Upgrade the cluster from 3.11.0 to 3.11.5 because of various bugs we are hitting in 3.11.0 . So my question h

Truncate Materialized View

2020-05-14 Thread Surbhi Gupta
Hi, We are on 3.11.0 . We have 11 Materialized view on a table. After discussion with application team , we found out that they are using only 4 out of 11 . We tried to drop the materialized view and got hit by the bug https://issues.apache.org/jira/browse/CASSANDRA-13696 which made our whole clus

Re: Truncate Materialized View

2020-05-15 Thread Surbhi Gupta
Anyone has truncated materialized views ? On Thu, 14 May 2020 at 11:59, Surbhi Gupta wrote: > Hi, > > We are on 3.11.0 . > We have 11 Materialized view on a table. > After discussion with application team , we found out that they are using > only 4 out of 11 . >

Re: Truncate Materialized View

2020-05-15 Thread Surbhi Gupta
iew. > What exact error got ? If you think it is same as the bug, then you may > try to avoid the bug triggered condition. It says pending hints. So you may > let all hints applied, then try drop the view. > > Thanks, > > James > > On Fri, May 15, 2020 at 1:35 PM
