Hello, I am observing the following behavior during a cluster-wide state change (kern.log):

[27050.375909] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Preparing cluster-wide state change 2579620917 (0->-1 3/1)
[27080.486801] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Aborting cluster-wide state change 2579620917 (30112ms) rv = -23
[27081.671615] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Preparing cluster-wide state change 3642399026 (0->-1 3/1)
[27113.253606] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Aborting cluster-wide state change 3642399026 (31584ms) rv = -23

Here rv = -23 indicates SS_TIMEOUT. This continues for a while. Then the error changes to rv = -19, which indicates SS_CONCURRENT_ST_CH:

[27914.141426] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Preparing cluster-wide state change 3371883425 (0->-1 3/1)
[27926.280093] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: sock_recvmsg returned -113
[27926.280099] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: conn( Connected -> BrokenPipe ) peer( Secondary -> Unknown )
[27926.280101] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002 xosan3: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
[27926.280167] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: ack_receiver terminated
[27926.280168] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: Terminating ack_recv thread
[27926.280189] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Aborting cluster-wide state change 3371883425 (12140ms) rv = -19
[27926.280252] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: State change failed: In transient state, retry after next state change

The state change keeps failing, but again with rv = -23:

[27926.328265] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Preparing cluster-wide state change 3899801398 (0->-1 3/1)
[27926.329319] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: Connection closed
[27926.329326] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: helper command: /sbin/drbdadm disconnected
[27926.332926] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: helper command: /sbin/drbdadm disconnected exit code 0
[27926.332955] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: conn( BrokenPipe -> Unconnected )
[27926.332980] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: Restarting receiver thread
[27956.998808] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Aborting cluster-wide state change 3899801398 (30672ms) rv = -23
[27956.998973] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e xosan3: conn( Unconnected -> Connecting )

Finally, the state change goes through:

[27957.686777] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Preparing cluster-wide state change 2682176118 (0->2 499/146)
[27957.710748] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002: discard_zeroes_data=0 and discard_zeroes_if_aligned=no: disabling discards
[27957.710774] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002 xosan3: drbd_sync_handshake:
[27957.710775] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002 xosan3: self 259E390F9A174E68:0000000000000000:78D7B860591E8E4E:0000000000000000 bits:0 flags:120
[27957.710777] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002 xosan3: peer 259E390F9A174E68:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:120
[27957.710778] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e/0 drbd1002 xosan3: uuid_compare()=no-sync by rule=reconnected
[27957.711207] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: State change 2682176118: primary_nodes=0, weak_nodes=0
[27957.711208] drbd xcp-volume-ce0edf61-e363-4167-b904-da57d7615f3e: Committing cluster-wide state change 2682176118 (24ms)

The state change succeeds only after nearly 20 minutes. Do you know what is going on, or how I could debug this issue? I can provide more information if required.

Thanks,
Matias.