Keep reading the logs on the initiator and on the node sending the merkle tree; does anything follow that? FYI, not all log entries contain the repair ID, so please read the relevant logs in chronological order without filtering (e.g. with "grep") on the repair ID.
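
For example, rather than grepping for the repair ID, I would simply print everything logged around the relevant timestamps on each node, roughly like this (the log path is an assumption, adjust it to your setup):

    sed -n '/2022-01-14 03:32:/,/2022-01-14 03:35:/p' /var/log/cassandra/system.log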

I'm sceptical that a network issue is causing all this. The merkle tree is sent over TCP connections, so a few packets dropped during an occasional few seconds of network connectivity trouble should not cause any problem for the repair. You should only start to see network-related issues if the network problem persists for a period of time close to or longer than the timeout values set in the cassandra.yaml file; in the case of repair that's request_timeout_in_ms, which defaults to 10 seconds.
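
A quick way to double-check the timeouts in effect on a node (the config file path is an assumption, adjust it to your setup):

    grep '_timeout_in_ms' /etc/cassandra/cassandra.yaml
    # the relevant default would look like:
    # request_timeout_in_ms: 10000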

Carry on examining the logs; you may find something useful.

BTW, talking about stuck repairs, in my experience this can happen if two or more repairs were run concurrently on the same node (regardless of which node was the initiator) and involved the same table. This could happen if you accidentally ran "nodetool repair" on two nodes and both runs involved the same table, or if you cancelled and then restarted a "nodetool repair" on a node without waiting for, or killing, the remnants of the first repair session on the other nodes.
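
Before re-running a cancelled repair, it's worth checking every node for leftovers of the previous session. A rough sketch of what I would look at, using standard nodetool commands:

    nodetool compactionstats   # any "Validation" compactions still running?
    nodetool tpstats           # AntiEntropyStage / ValidationExecutor still busy?
    nodetool netstats          # any repair streams still in progress?

If something is still hanging around, terminating the repair sessions via JMX (the StorageService MBean has a forceTerminateAllRepairSessions operation, if I remember correctly) or a rolling restart of the affected nodes will clear it out.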

On 18/01/2022 11:55, manish khandelwal wrote:
In the system logs on the node where the repair was initiated, I see that the node has requested merkle trees from all nodes, including itself:

INFO  [Repair#3:1] 2022-01-14 03:32:18,805 RepairJob.java:172 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Requesting merkle trees for tablename (to [/xyz.abc.def.14, /xyz.abc.def.13, /xyz.abc.def.12, /xyz.mkn.pq.18, /xyz.mkn.pq.16, /xyz.mkn.pq.17])
INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,841 RepairSession.java:180 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for tablename from /xyz.mkn.pq.17
INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,847 RepairSession.java:180 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for tablename from /xyz.mkn.pq.16
INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,851 RepairSession.java:180 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for tablename from /xyz.mkn.pq.18
INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,856 RepairSession.java:180 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for tablename from /xyz.abc.def.14
INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,876 RepairSession.java:180 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree for tablename from /xyz.abc.def.12

As per the logs, the merkle tree was not received from the node with IP xyz.abc.def.13.

In the system logs of the node with IP xyz.abc.def.13, I can see the following:

INFO  [AntiEntropyStage:1] 2022-01-14 03:32:18,850 Validator.java:281 - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Sending completed merkle tree to /xyz.mkn.pq.17 for keyspace.tablename

From the above I inferred that the repair task has become orphaned: it is waiting for a merkle tree from a node, but it is never going to receive it because the tree was lost somewhere in the network in between.

Regards
Manish

On Tue, Jan 18, 2022 at 4:39 PM Bowen Song <bo...@bso.ng> wrote:

    The entry in the debug.log is not specific to a repair session,
    and it could also be caused by things other than network
    connectivity issues, such as long STW GC pauses. I usually don't
    start troubleshooting an issue from the debug log, as it can be
    rather noisy. The system.log is a better starting point.

    If I were to troubleshoot the issue, I would start from the system
    logs on the node that initiated the repair, i.e. the node you ran
    the "nodetool repair" command on. Follow the repair ID (a UUID)
    in the logs on all nodes involved in the repair and read all
    related logs in chronological order to find out what exactly had
    happened.
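
    For example, once you have the repair ID, something like this on each
    node involved will pull out the related entries (the log path is an
    assumption, adjust it to your setup):

        grep '<repair UUID>' /var/log/cassandra/system.log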

    BTW, if the issue is easily reproducible, I would re-run the
    repair with a reduced scope (such as a single table and token range)
    to get fewer logs related to the repair session. Fewer logs mean
    less time spent reading and analysing them.
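
    A reduced-scope run could look something like this (keyspace, table
    and token values are placeholders):

        nodetool repair -st <start_token> -et <end_token> <keyspace> <table>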

    Hope this helps.

    On 18/01/2022 10:03, manish khandelwal wrote:
    I have a Cassandra 3.11.2 cluster with two DCs. While running
    repair, I am observing the following behavior.

    I am seeing that a node is not able to receive merkle trees from one
    or two nodes. I am also able to see that the missing nodes did send
    the merkle trees, but they were not received. This makes the repair
    hang on a consistent basis. In netstats I can see output as follows:

    Mode: NORMAL
    Not sending any streams.
    Attempted: 7858888
    Mismatch (Blocking): 2560
    Mismatch (Background): 17173
    Pool Name          Active  Pending  Completed  Dropped
    Large messages     n/a     0        6313       3
    Small messages     n/a     0        55978004   3
    Gossip messages    n/a     0        93756      125

    Does it represent network issues? In the debug logs I saw something:

    DEBUG [MessagingService-Outgoing-hostname/xxx.yy.zz.kk-Large] 2022-01-14 05:00:19,031 OutboundTcpConnection.java:349 - Error writing to hostname/xxx.yy.zz.kk
    java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_221]
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_221]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_221]
        at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_221]
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_221]
        at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[na:1.8.0_221]
        at java.nio.channels.Channels.writeFully(Channels.java:98) ~[na:1.8.0_221]
        at java.nio.channels.Channels.access$000(Channels.java:61) ~[na:1.8.0_221]
        at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_221]
        at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:205) ~[lz4-1.3.0.jar:na]
        at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:158) ~[lz4-1.3.0.jar:na]

    Does this show any network fluctuations?

    Regards
    Manish
