Thanks Ben,

Before stopping the EC2 instance I did run nodetool drain, so I ruled that
out, and system.log also doesn't show commit logs being applied.
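One way to double-check that from the log itself is to scan system.log for replay activity. A rough sketch below; the "CommitLogReplayer" / "Replaying ..." phrasing is an assumption based on typical Cassandra 3.x startup output, so adjust the pattern to your version:

```python
import re

# Heuristic pattern for commit-log replay messages in a Cassandra system.log.
# The exact wording varies by version; this is an assumed 3.x-style phrasing.
REPLAY_PATTERN = re.compile(r"CommitLogReplayer|Replaying .*\.log")

def count_replay_lines(log_lines):
    """Return how many lines look like commit-log replay activity."""
    return sum(1 for line in log_lines if REPLAY_PATTERN.search(line))

# Hypothetical sample lines for illustration only:
sample = [
    "INFO  [main] 2019-11-05 15:04:58,001 CommitLog.java:157 - No commitlog files found; skipping replay",
    "INFO  [main] 2019-11-05 15:04:59,120 StorageService.java:600 - Cassandra version: 3.11.4",
]
print(count_replay_lines(sample))  # 0 -> no replay activity found
```

If this stays at 0 across the startup window, replay time can reasonably be ruled out as the cause of the delay.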





On Tue, Nov 5, 2019, 7:51 PM Ben Slater <ben.sla...@instaclustr.com> wrote:

> The logs between first start and handshaking should give you a clue but my
> first guess would be replaying commit logs.
>
> Cheers
> Ben
>
> ---
>
>
> Ben Slater, Chief Product Officer
>
> On Wed, 6 Nov 2019 at 04:36, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
>
>> I can reproduce the issue.
>>
>> I did drain the Cassandra node, then stopped and started the Cassandra
>> instance. The instance comes up, but other nodes stay in DN state for
>> around 10 minutes.
>>
>> I don't see errors in system.log.
>>
>> DN  xx.xx.xx.59   420.85 MiB  256          48.2%             id  2
>> UN  xx.xx.xx.30   432.14 MiB  256          50.0%             id  0
>> UN  xx.xx.xx.79   447.33 MiB  256          51.1%             id  4
>> DN  xx.xx.xx.144  452.59 MiB  256          51.6%             id  1
>> DN  xx.xx.xx.19   431.7 MiB  256          50.1%             id  5
>> UN  xx.xx.xx.6    421.79 MiB  256          48.9%
>>
>> When I do nodetool status, 3 nodes are still showing down, and I don't
>> see errors in system.log.
>>
>> And after 10 minutes it shows the other nodes as up as well.
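A small script can make this easier to watch than eyeballing the output: parse `nodetool status` and tally states, alerting whenever DN count is non-zero. A minimal sketch, assuming the column layout shown in the paste above (status code first on each data row):

```python
def count_states(status_lines):
    """Tally node states (UN, DN, ...) from `nodetool status` output lines."""
    counts = {}
    for line in status_lines:
        parts = line.split()
        # Data rows start with a two-letter status/state code such as UN or DN;
        # header and datacenter lines are skipped by this check.
        if parts and len(parts[0]) == 2 and parts[0][0] in "UD":
            counts[parts[0]] = counts.get(parts[0], 0) + 1
    return counts

# Hypothetical sample rows for illustration (IPs and ids made up):
status = """\
DN  10.0.0.59   420.85 MiB  256  48.2%  id  2
UN  10.0.0.30   432.14 MiB  256  50.0%  id  0
UN  10.0.0.79   447.33 MiB  256  51.1%  id  4
""".splitlines()
print(count_states(status))  # {'DN': 1, 'UN': 2}
```

Run in a loop with a timestamp, this would also tell you exactly how long the DN window lasts.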
>>
>>
>> INFO  [HANDSHAKE-/10.72.100.156] 2019-11-05 15:05:09,133
>> OutboundTcpConnection.java:561 - Handshaking version with
>> /<stop-and-started node>
>> INFO  [RequestResponseStage-7] 2019-11-05 15:16:27,166 Gossiper.java:1019
>> - InetAddress /<node which it was showing as down> is now UP
>>
>> What is causing the 10-minute delay before it can say that node is
>> reachable?
>>
>> On Wed, Oct 30, 2019, 8:37 AM Rahul Reddy <rahulreddy1...@gmail.com>
>> wrote:
>>
>>> Also, an AWS EC2 stop and start brings up a new instance with the same
>>> IP, and all our file systems are on EBS and mounted fine. Does the new
>>> instance coming up with the same IP cause any gossip issues?
>>>
>>> On Tue, Oct 29, 2019, 6:16 PM Rahul Reddy <rahulreddy1...@gmail.com>
>>> wrote:
>>>
>>>> Thanks Alex. We have 6 nodes in each DC with RF=3 and CL LOCAL_QUORUM,
>>>> and we stopped and started only one instance at a time. Though nodetool
>>>> status says all nodes are UN and system.log says Cassandra started and
>>>> began listening, the JMX exporter shows the instance stayed down longer.
>>>> How do we determine what caused Cassandra to be unavailable when the
>>>> log says it started and is listening?
>>>>
>>>> On Tue, Oct 29, 2019, 4:44 PM Oleksandr Shulgin <
>>>> oleksandr.shul...@zalando.de> wrote:
>>>>
>>>>> On Tue, Oct 29, 2019 at 9:34 PM Rahul Reddy <rahulreddy1...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> We have our infrastructure on AWS and we use EBS storage, and AWS
>>>>>> was retiring one of the nodes. Since our storage is persistent, we
>>>>>> ran nodetool drain, then stopped and started the instance. This
>>>>>> caused 500 errors in the service. We have LOCAL_QUORUM and RF=3; why
>>>>>> does stopping one instance cause the application to have issues?
>>>>>>
>>>>>
>>>>> Can you still look up what the underlying error from the Cassandra
>>>>> driver was in the application logs?  Was it a request timeout or not
>>>>> enough replicas?
>>>>>
>>>>> For example, if you only had 3 Cassandra nodes, restarting one of them
>>>>> reduces your cluster capacity by 33% temporarily.
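The arithmetic behind that can be sketched generically (standard quorum math, not Cassandra code): LOCAL_QUORUM needs floor(RF/2) + 1 live replicas of each partition in the local DC.

```python
def local_quorum(rf):
    """Replicas required for a quorum read/write: floor(rf/2) + 1."""
    return rf // 2 + 1

def can_serve(rf, replicas_up):
    """A quorum request succeeds only if enough replicas are alive."""
    return replicas_up >= local_quorum(rf)

print(local_quorum(3))   # 2
print(can_serve(3, 2))   # True: one replica down still serves quorum
print(can_serve(3, 1))   # False: two replicas down -> unavailable
```

So with RF=3, losing one node should still leave a quorum for every partition, which is why the driver-side error (timeout vs. unavailable) is the key clue here.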
>>>>>
>>>>> Cheers,
>>>>> --
>>>>> Alex
>>>>>
>>>>>
