We've tried to remove "sg" from the cluster so we can re-install the GlusterFS node on it, but the following command run on "br" also gives a timeout error:
gluster volume remove-brick gvol0 replica 1 sg:/nodirectwritedata/gluster/gvol0 force

How can we tell "br" to just remove "sg" without trying to contact it?


On Fri, 24 Feb 2023 at 10:31, David Cunningham <[email protected]> wrote:

> Hello,
>
> We have a cluster with two nodes, "sg" and "br", which were running
> GlusterFS 9.1, installed via the Ubuntu package manager. We updated the
> Ubuntu packages on "sg" to version 9.6, and now have big problems. The "br"
> node is still on version 9.1.
>
> Running "gluster volume status" on either host gives "Error : Request
> timed out". On "sg" not all processes are running, compared to "br", as
> below. Restarting the services on "sg" doesn't help. Can anyone advise how
> we should proceed? This is a production system.
>
> root@sg:~# ps -ef | grep gluster
> root 15196 1 0 22:37 ? 00:00:00 /usr/sbin/glusterd -p
> /var/run/glusterd.pid --log-level INFO
> root 15426 1 0 22:39 ? 00:00:00 /usr/bin/python3
> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
> root 15457 15426 0 22:39 ? 00:00:00 /usr/bin/python3
> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
> root 19341 13695 0 23:24 pts/1 00:00:00 grep --color=auto gluster
>
> root@br:~# ps -ef | grep gluster
> root 2052 1 0 2022 ? 00:00:00 /usr/bin/python3
> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
> root 2062 1 3 2022 ? 10-11:57:16 /usr/sbin/glusterfs
> --fuse-mountopts=noatime --process-name fuse --volfile-server=br
> --volfile-server=sg --volfile-id=/gvol0 --fuse-mountopts=noatime
> /mnt/glusterfs
> root 2379 2052 0 2022 ? 00:00:00 /usr/bin/python3
> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
> root 5884 1 5 2022 ? 18-16:08:53 /usr/sbin/glusterfsd -s
> br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p
> /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S
> /var/run/gluster/61df1d4e1c65300e.socket --brick-name
> /nodirectwritedata/gluster/gvol0 -l
> /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log
> --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6
> --process-name brick --brick-port 49152 --xlator-option
> gvol0-server.listen-port=49152
> root 10463 18747 0 23:24 pts/1 00:00:00 grep --color=auto gluster
> root 27744 1 0 2022 ? 03:55:10 /usr/sbin/glusterfsd -s br
> --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p
> /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S
> /var/run/gluster/61df1d4e1c65300e.socket --brick-name
> /nodirectwritedata/gluster/gvol0 -l
> /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log
> --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6
> --process-name brick --brick-port 49153 --xlator-option
> gvol0-server.listen-port=49153
> root 48227 1 0 Feb17 ? 00:00:26 /usr/sbin/glusterd -p
> /var/run/glusterd.pid --log-level INFO
>
> On "sg" in glusterd.log we're seeing:
>
> [2023-02-23 20:26:57.619318 +0000] E [rpc-clnt.c:181:call_bail]
> 0-management: bailing out frame type(glusterd mgmt v3), op(--(6)), xid =
> 0x11, unique = 27, sent = 2023-02-23 20:16:50.596447 +0000, timeout = 600
> for 10.20.20.11:24007
> [2023-02-23 20:26:57.619425 +0000] E [MSGID: 106115]
> [glusterd-mgmt.c:122:gd_mgmt_v3_collate_errors] 0-management: Unlocking
> failed on br. Please check log file for details.
> [2023-02-23 20:26:57.619545 +0000] E [MSGID: 106151]
> [glusterd-syncop.c:1655:gd_unlock_op_phase] 0-management: Failed to unlock
> on some peer(s)
> [2023-02-23 20:26:57.619693 +0000] W
> [glusterd-locks.c:817:glusterd_mgmt_v3_unlock]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe19b9)
> [0x7fadf47fa9b9]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe0e20)
> [0x7fadf47f9e20]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe7904)
> [0x7fadf4800904] ) 0-management: Lock owner mismatch. Lock for vol gvol0
> held by 11e528b0-8c69-4b5d-82ed-c41dd25536d6
> [2023-02-23 20:26:57.619780 +0000] E [MSGID: 106117]
> [glusterd-syncop.c:1679:gd_unlock_op_phase] 0-management: Unable to release
> lock for gvol0
> [2023-02-23 20:26:57.619939 +0000] I
> [socket.c:3811:socket_submit_outgoing_msg] 0-socket.management: not
> connected (priv->connected = -1)
> [2023-02-23 20:26:57.619969 +0000] E [rpcsvc.c:1567:rpcsvc_submit_generic]
> 0-rpc-service: failed to submit message (XID: 0x3, Program: GlusterD svc
> cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
> [2023-02-23 20:26:57.619995 +0000] E [MSGID: 106430]
> [glusterd-utils.c:678:glusterd_submit_reply] 0-glusterd: Reply submission
> failed
>
> And in the brick log:
>
> [2023-02-23 20:22:56.717721 +0000] I [addr.c:54:compare_addr_and_update]
> 0-/nodirectwritedata/gluster/gvol0: allowed = "*", received addr =
> "10.20.20.11"
> [2023-02-23 20:22:56.717817 +0000] I [login.c:110:gf_auth] 0-auth/login:
> allowed user names: a26c7de4-1236-4e0a-944a-cb82de7f7f0e
> [2023-02-23 20:22:56.717840 +0000] I [MSGID: 115029]
> [server-handshake.c:561:server_setvolume] 0-gvol0-server: accepted client
> from
> CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
> (version: 9.1) with subvol /nodirectwritedata/gluster/gvol0
> [2023-02-23 20:22:56.741545 +0000] W [socket.c:766:__socket_rwv]
> 0-tcp.gvol0-server: readv on 10.20.20.11:49144 failed (No data available)
> [2023-02-23 20:22:56.741599 +0000] I [MSGID: 115036]
> [server.c:500:server_rpc_notify] 0-gvol0-server: disconnecting connection
> [{client-uid=CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0}]
>
> [2023-02-23 20:22:56.741866 +0000] I [MSGID: 101055]
> [client_t.c:397:gf_client_unref] 0-gvol0-server: Shutting down connection
> CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
>
>
> Thanks for your help,
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
>

--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
