Which version of Cassandra did you install, deb or tar? If it's the deb
package, its service script should be used to start and stop Cassandra. If
it's the tarball, kill Cassandra's PID to stop it and run bin/cassandra to
start it.
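Concretely, the two flows look roughly like this (a sketch; the tarball
location /opt/cassandra and the pidfile path are assumptions, so adjust for
your layout):

    # deb package install: use the service script
    sudo service cassandra stop
    sudo service cassandra start

    # tarball install (assuming it was unpacked to /opt/cassandra)
    kill $(cat /opt/cassandra/cassandra.pid)                       # stop: signal the JVM
    /opt/cassandra/bin/cassandra -p /opt/cassandra/cassandra.pid   # start; -p records the new PID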
Stopping doesn't need any other actions (drain, disabling gossip, etc.).
Where do you use Cassandra?

*-------------------------------------------------------*
*VafaTech <http://www.vafatech.com> : A Total Solution for Data Gathering & Analysis*
*-------------------------------------------------------*

On Fri, Dec 6, 2019 at 11:20 PM Paul Mena <pm...@whoi.edu> wrote:

> As we are still without a functional Cassandra cluster in our development
> environment, I thought I’d try restarting the same node (one of 4 in the
> cluster) with the following command:

> ip=$(cat /etc/hostname); nodetool disablethrift && nodetool disablebinary
> && sleep 5 && nodetool disablegossip && nodetool drain && sleep 10 && sudo
> service cassandra restart && until echo "SELECT * FROM system.peers LIMIT
> 1;" | cqlsh $ip > /dev/null 2>&1; do echo "Node $ip is still DOWN"; sleep
> 10; done && echo "Node $ip is now UP"

> The above command returned “Node is now UP” after about 40 seconds,
> confirmed on “node001” via “nodetool status”:

> user@node001=> nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns  Host ID                               Rack
> UN  192.168.187.121  539.43 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
> UN  192.168.187.122  633.92 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
> UN  192.168.187.123  576.31 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
> UN  192.168.187.124  628.5 GB   256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1

> As was the case before, running “nodetool status” on any of the other
> nodes shows that “node001” is still down:

> user@node002=> nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns  Host ID                               Rack
> DN  192.168.187.121  538.94 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
> UN  192.168.187.122  634.04 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
> UN  192.168.187.123  576.42 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
> UN  192.168.187.124  628.56 GB  256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1

> Is it inadvisable to continue with the rolling restart?

> *Paul Mena*
> Senior Application Administrator
> WHOI - Information Services
> 508-289-3539

> *From:* Shalom Sagges <shalomsag...@gmail.com>
> *Sent:* Tuesday, November 26, 2019 12:59 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra is not showing a node up hours after restart

> Hi Paul,

> From the gossipinfo output, it looks like the node's IP address and
> rpc_address are different:
> /192.168.*187*.121 vs RPC_ADDRESS:192.168.*185*.121

> It's also worth checking for a schema disagreement between nodes by
> comparing their schema IDs (on node001 it is
> fd2dcb4b-ca62-30df-b8f2-d3fd774f2801); you can run nodetool
> describecluster to see this as well.

> So I suggest changing the rpc_address to the node's IP address, or
> setting it to 0.0.0.0, and it should resolve the issue.

> Hope this helps!
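A quick way to check for that mismatch across the cluster (a sketch; the
ansible group name pre-prod-cassandra is borrowed from later in this
thread, and the yaml path assumes a package install):

    # rpc_address should match the node's IP, or be 0.0.0.0
    ansible pre-prod-cassandra -a "grep -E '^(listen_address|rpc_address):' /etc/cassandra/cassandra.yaml"

    # all nodes should appear under a single schema version
    nodetool describecluster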
> On Tue, Nov 26, 2019 at 4:05 AM Inquistive allen <inquial...@gmail.com> wrote:

> Hello,

> Check and compare all parameters:

> 1. The Java version should ideally match across all nodes in the cluster.
> 2. Check that port 7000 is open between the nodes; use telnet or nc.
> 3. Look for clues in the system logs as to why gossip is failing.

> Do confirm the above things.

> Thanks

> On Tue, 26 Nov, 2019, 2:50 AM Paul Mena, <pm...@whoi.edu> wrote:

> NTP was restarted on the Cassandra nodes, but unfortunately I’m still
> getting the same result: the restarted node does not appear to be
> rejoining the cluster.

> Here’s another data point: “nodetool gossipinfo”, when run from the
> restarted node (“node001”), shows a status of “normal”:

> user@node001=> nodetool gossipinfo
> /192.168.187.121
>   generation:1574364410
>   heartbeat:209150
>   NET_VERSION:8
>   RACK:rack1
>   STATUS:NORMAL,-104847506331695918
>   RELEASE_VERSION:2.1.9
>   SEVERITY:0.0
>   LOAD:5.78684155614E11
>   HOST_ID:c99cf581-f4ae-4aa9-ab37-1a114ab2429b
>   SCHEMA:fd2dcb4b-ca62-30df-b8f2-d3fd774f2801
>   DC:datacenter1
>   RPC_ADDRESS:192.168.185.121

> When run from one of the other nodes, however, node001’s status is shown
> as “shutdown”:

> user@node002=> nodetool gossipinfo
> /192.168.187.121
>   generation:1491825076
>   heartbeat:2147483647
>   STATUS:shutdown,true
>   RACK:rack1
>   NET_VERSION:8
>   LOAD:5.78679987693E11
>   RELEASE_VERSION:2.1.9
>   DC:datacenter1
>   SCHEMA:fd2dcb4b-ca62-30df-b8f2-d3fd774f2801
>   HOST_ID:c99cf581-f4ae-4aa9-ab37-1a114ab2429b
>   RPC_ADDRESS:192.168.185.121
>   SEVERITY:0.0

> *Paul Mena*
> Senior Application Administrator
> WHOI - Information Services
> 508-289-3539

> *From:* Paul Mena
> *Sent:* Monday, November 25, 2019 9:29 AM
> *To:* user@cassandra.apache.org
> *Subject:* RE: Cassandra is not showing a node up hours after restart

> I’ve just discovered that NTP is not running on any of these Cassandra
> nodes, and that the timestamps are all over the map. Could this be
> causing my issue?

> user@remote=> ansible pre-prod-cassandra -a date
> node001.intra.myorg.org | CHANGED | rc=0 >>
> Mon Nov 25 13:58:17 UTC 2019

> node004.intra.myorg.org | CHANGED | rc=0 >>
> Mon Nov 25 14:07:20 UTC 2019

> node003.intra.myorg.org | CHANGED | rc=0 >>
> Mon Nov 25 13:57:06 UTC 2019

> node002.intra.myorg.org | CHANGED | rc=0 >>
> Mon Nov 25 14:07:22 UTC 2019

> *Paul Mena*
> Senior Application Administrator
> WHOI - Information Services
> 508-289-3539

> *From:* Inquistive allen <inquial...@gmail.com>
> *Sent:* Monday, November 25, 2019 2:46 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra is not showing a node up hours after restart

> Hello team,

> Just to add on to the discussion: one may run nodetool disablebinary,
> followed by nodetool disablethrift, followed by nodetool drain. nodetool
> drain also does the work of nodetool flush, plus it declares to the
> cluster that the node is down and not accepting traffic.

> Thanks

> On Mon, 25 Nov, 2019, 12:55 AM Surbhi Gupta, <surbhi.gupt...@gmail.com> wrote:

> Before a Cassandra shutdown, nodetool drain should be executed first. As
> soon as you do nodetool drain, the other nodes will see this node as down
> and no new traffic will come to it. I generally give a 10-second gap
> between nodetool drain and the Cassandra stop.
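Putting allen's and Surbhi's suggestions together, a graceful stop of a
single node would look roughly like this (a sketch; the 10-second pause is
Surbhi's suggestion, and "service cassandra" assumes a package install):

    nodetool disablebinary   # stop accepting native-protocol (CQL) clients
    nodetool disablethrift   # stop accepting Thrift clients
    nodetool drain           # flush memtables and announce shutdown to peers
    sleep 10                 # give the other nodes time to see the drain
    sudo service cassandra stop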
> On Sun, Nov 24, 2019 at 9:52 AM Paul Mena <pm...@whoi.edu> wrote:

> Thank you for the replies. I had made no changes to the config before the
> rolling restart.

> I can try another restart, but was wondering if I should do it
> differently. I had simply done "service cassandra stop" followed by
> "service cassandra start". Since then I've seen some suggestions to
> precede the shutdown with "nodetool disablegossip" and/or "nodetool
> drain". Are these commands advisable? Are any other commands recommended,
> either before the shutdown or after the startup?

> Thanks again!

> Paul
> ------------------------------

> *From:* Naman Gupta <naman.gu...@girnarsoft.com>
> *Sent:* Sunday, November 24, 2019 11:18:14 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra is not showing a node up hours after restart

> Did you change the datacenter name, or make any other config changes,
> before the rolling restart?

> On Sun, Nov 24, 2019 at 8:49 PM Paul Mena <pm...@whoi.edu> wrote:

> I am in the process of doing a rolling restart on a 4-node cluster
> running Cassandra 2.1.9. I stopped and started Cassandra on node 1 via
> "service cassandra stop/start", and noted nothing unusual in either
> system.log or cassandra.log. Doing a "nodetool status" from node 1 shows
> all four nodes up:

> user@node001=> nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns  Host ID                               Rack
> UN  192.168.187.121  538.95 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
> UN  192.168.187.122  630.72 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
> UN  192.168.187.123  572.73 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
> UN  192.168.187.124  625.05 GB  256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1

> But running the same command from any of the other 3 nodes shows node 1
> still down:

> user@node002=> nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load       Tokens  Owns  Host ID                               Rack
> DN  192.168.187.121  538.94 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
> UN  192.168.187.122  630.72 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
> UN  192.168.187.123  572.73 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
> UN  192.168.187.124  625.04 GB  256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1

> Is there something I can do to remedy this current situation, so that I
> can continue with the rolling restart?
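For reference, the per-node flow the thread converges on (drain, restart,
then trust a peer's view of the node rather than the restarted node's own)
can be sketched as follows; the hostnames, the ssh-based peer check, and
the service name are assumptions, not something prescribed in the thread:

    #!/usr/bin/env bash
    # Sketch: restart one node, then wait until another node reports it UN.
    set -euo pipefail

    NODE_IP=$(hostname -I | awk '{print $1}')   # this node's address
    PEER=node002.intra.myorg.org                # any *other* cluster node

    nodetool disablebinary
    nodetool disablethrift
    nodetool disablegossip
    nodetool drain
    sleep 10
    sudo service cassandra restart

    # Poll the peer: the restarted node's own status can read UN while the
    # rest of the cluster still sees it as DN, which is exactly the symptom
    # reported above.
    until ssh "$PEER" nodetool status | grep -E "^UN +$NODE_IP " >/dev/null; do
        echo "Peer still sees $NODE_IP as down; waiting..."
        sleep 10
    done
    echo "Peer reports $NODE_IP as UN; safe to move on to the next node."

The key difference from the one-liner earlier in the thread is the health
check: it asks a peer for its view of the restarted node instead of running
cqlsh against the restarted node itself.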