The link relates to Cassandra 1.2, and it is 9 years old. Cassandra
was full of bugs at that time, and it has improved a lot since then. For
that reason, I would rather not compare the issue you have with a
9-year-old issue someone else had.
On 18/01/2022 16:11, manish khandelwal wrote:
I am not sure what is happening, but it has happened three times now.
What happens is that merkle trees are not received from the nodes of the
other data center. I am getting an issue along similar lines to the one
mentioned here
https://user.cassandra.apache.narkive.com/GTbqO6za/repair-hangs-when-merkle-tree-request-is-not-acknowledged
Regards
Manish
On Tue, Jan 18, 2022, 18:18 Bowen Song <bo...@bso.ng> wrote:
Keep reading the logs on the initiator and on the node sending the
merkle tree. Does anything follow that? FYI, not every log entry has
the repair ID in it, therefore please read the relevant logs in
chronological order without filtering (e.g. with "grep") on the repair ID.
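For example, one way to do that (assuming the default package log
location; adjust the path if yours differs) is to open the whole
system.log in a pager, jump to the time the repair started, and read
forward from there:

less /var/log/cassandra/system.log
# then, inside less, search for the repair start time with
#     /2022-01-14 03:32:18
# and read forward, including the entries without the repair ID in them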
I'm sceptical that a network issue is causing all this. The merkle tree
is sent over TCP connections, therefore a few dropped packets during an
occasional network blip lasting a few seconds should not cause any
issue for the repair. You should only start to see network-related
issues if the network problem persists for a period of time close to or
longer than the timeout values set in the cassandra.yaml file; in the
case of repair that is request_timeout_in_ms, which defaults to 10 seconds.
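For reference, you can check the effective value on each node with
something like the following (the path is a common package default,
adjust it to wherever your cassandra.yaml lives):

# default is 10000 ms, i.e. 10 seconds; this is the timeout that
# applies to repair messages as described above
grep '^request_timeout_in_ms' /etc/cassandra/cassandra.yaml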
Carry on examining the logs; you may find something useful.
BTW, talking about stuck repairs, in my experience this can happen
if two or more repairs are run concurrently on the same node
(regardless of which node was the initiator) and involve the same
table. This could happen if you accidentally ran "nodetool repair"
on two nodes and both involved the same table, or if you cancelled
and then restarted a "nodetool repair" on a node without waiting for,
or killing, the remnants of the first repair session on the other nodes.
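A quick way to check whether a node still has repair work going on,
before starting a new one, is to look at the repair-related thread
pools (just a suggested check on my part):

# non-zero Active or Pending counts on these pools suggest a previous
# repair, or its validation work, is still running on the node
nodetool tpstats | grep -E 'AntiEntropyStage|ValidationExecutor'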
On 18/01/2022 11:55, manish khandelwal wrote:
In the system logs on the node where the repair was initiated, I see
that the node has requested merkle trees from all nodes, including
itself:
INFO [Repair#3:1] 2022-01-14 03:32:18,805 RepairJob.java:172 -
*[repair #6e3385e0-74d1-11ec-8e66-9f084ace9968*] Requesting
merkle trees for *tablename* (to [*/xyz.abc.def.14,
/xyz.abc.def.13, /xyz.abc.def.12, /xyz.mkn.pq.18, /xyz.mkn.pq.16,
/xyz.mkn.pq.17*])
INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,841
RepairSession.java:180 - [repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for
*tablename* from */xyz.mkn.pq.17*
INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,847
RepairSession.java:180 - [repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for
*tablename* from */xyz.mkn.pq.16*
INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,851
RepairSession.java:180 - [repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for
*tablename* from */xyz.mkn.pq.18*
INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,856
RepairSession.java:180 - [repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for
*tablename* from */xyz.abc.def.14*
INFO [AntiEntropyStage:1] *2022-01-14 03:32:18*,876
RepairSession.java:180 - [*repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968*] Received merkle tree for
*tablename* from */xyz.abc.def.12*
As per the logs, the merkle tree is not received from the node with
IP *xyz.abc.def.13*.
In the system logs of the node with IP *xyz.abc.def.13*, I can see
the following:
INFO [AntiEntropyStage:1] *2022-01-14 03:32:18*,850
Validator.java:281 - [*repair
#6e3385e0-74d1-11ec-8e66-9f084ace9968*] Sending completed merkle
tree to */xyz.mkn.pq.17* for *keyspace.tablename*
From the above, I inferred that the repair task has become orphaned:
it is waiting for a merkle tree from a node, and it is never going to
receive it because the tree was lost somewhere in the network in
between.
Regards
Manish
On Tue, Jan 18, 2022 at 4:39 PM Bowen Song <bo...@bso.ng> wrote:
The entry in the debug.log is not specific to a repair
session, and it could also be caused by reasons other than
a network connectivity issue, such as long STW GC pauses. I
usually don't start troubleshooting an issue from the debug
log, as it can be rather noisy. The system.log is a better
starting point.
If I were to troubleshoot the issue, I would start from the
system logs on the node that initiated the repair, i.e. the
node you ran the "nodetool repair" command on. Follow the
repair ID (a UUID) in the logs on all nodes involved in the
repair and read all related logs in chronological order to
find out what exactly happened.
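If you need to find the repair session IDs in the first place,
something like this on the initiating node will list them (assuming
the default log location):

# list the distinct repair session IDs seen in the log
grep -oE 'repair #[0-9a-f-]+' /var/log/cassandra/system.log | sort -u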
BTW, if the issue is easily reproducible, I would re-run the
repair with a reduced scope (such as a single table and token range)
to generate fewer logs related to the repair session. Fewer logs
mean less time spent reading and analysing them.
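For example, something along these lines runs a full repair of one
table over one token range; the keyspace, table and token values
below are placeholders to fill in:

# repair only <keyspace>.<table> for the token range (<start_token>, <end_token>]
nodetool repair -full -st <start_token> -et <end_token> <keyspace> <table>

Restricting the token range keeps the validation work, and therefore
the amount of logs, small.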
Hope this helps.
On 18/01/2022 10:03, manish khandelwal wrote:
I have a Cassandra 3.11.2 cluster with two DCs. While
running repair, I am observing the following behavior:
the node is not able to receive merkle trees from one or two
nodes, even though I can see that the missing nodes did send
the merkle trees. This makes repair hang on a consistent
basis. In netstats I can see output as follows:
Mode: NORMAL
Not sending any streams. Attempted: 7858888
Mismatch (Blocking): 2560
Mismatch (Background): 17173
Pool Name          Active   Pending   Completed   Dropped
Large messages     n/a      0         6313        3
Small messages     n/a      0         55978004    3
Gossip messages    n/a      0         93756       125

Does it represent network issues? In debug logs I saw something:

DEBUG [MessagingService-Outgoing-hostname/xxx.yy.zz.kk-Large]
2022-01-14 05:00:19,031 OutboundTcpConnection.java:349 -
Error writing to hostname/xxx.yy.zz.kk
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_221]
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_221]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_221]
        at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_221]
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_221]
        at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[na:1.8.0_221]
        at java.nio.channels.Channels.writeFully(Channels.java:98) ~[na:1.8.0_221]
        at java.nio.channels.Channels.access$000(Channels.java:61) ~[na:1.8.0_221]
        at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_221]
        at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:205) ~[lz4-1.3.0.jar:na]
        at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:158) ~[lz4-1.3.0.jar:na]
Does this show any network fluctuations?
Regards
Manish