Re: [Gluster-users] Self-Heal Daemon not starting after upgrade 6.10 to 7.8

Ravishankar N Tue, 03 Nov 2020 03:18:49 -0800


On 02/11/20 8:35 pm, Olaf Buitelaar wrote:

Dear Gluster users,
I'm trying to upgrade from gluster 6.10 to 7.8. I've currently tried this on 2 hosts, but on both the Self-Heal Daemon refuses to start. It could be because not all nodes are updated yet, but I'm a bit hesitant to continue without the Self-Heal Daemon running. I'm not using quotas, and I'm not seeing the peer reject messages that other users reported on the mailing list. In fact, gluster peer status and gluster pool list display all nodes as connected. Also, gluster v heal <vol> info shows all nodes as Status: connected; however, some report pending heals, which don't really seem to progress.
Only in gluster v status <vol> do the 2 upgraded nodes report the daemon as not running:

Self-heal Daemon on localhost               N/A       N/A      N       N/A
Self-heal Daemon on 10.32.9.5               N/A       N/A    Y       24022
Self-heal Daemon on 10.201.0.4              N/A       N/A    Y       26704
Self-heal Daemon on 10.201.0.3              N/A       N/A    N       N/A
Self-heal Daemon on 10.32.9.4               N/A       N/A    Y       46294
Self-heal Daemon on 10.32.9.3               N/A       N/A    Y       22194
Self-heal Daemon on 10.201.0.9              N/A       N/A    Y       14902
Self-heal Daemon on 10.201.0.6              N/A       N/A    Y       5358
Self-heal Daemon on 10.201.0.5              N/A       N/A    Y       28073
Self-heal Daemon on 10.201.0.7              N/A       N/A    Y       15385
Self-heal Daemon on 10.201.0.1              N/A       N/A    Y       8917
Self-heal Daemon on 10.201.0.12             N/A       N/A    Y       56796
Self-heal Daemon on 10.201.0.8              N/A       N/A    Y       7990
Self-heal Daemon on 10.201.0.11             N/A       N/A    Y       68223
Self-heal Daemon on 10.201.0.10             N/A       N/A    Y       20828
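
With this many nodes, the rows that show the daemon as offline are easy to miss. A minimal sketch for picking them out of the status output, assuming the column order shown above (TCP port, RDMA port, Online, Pid); the function name offline_shd_hosts is hypothetical and not part of gluster:

```python
import re

# Matches one "Self-heal Daemon on <host>" row of `gluster v status <vol>`.
# Assumed column order after the host: TCP port, RDMA port, Online (Y/N), Pid.
SHD_ROW = re.compile(
    r"^Self-heal Daemon on (?P<host>\S+)\s+\S+\s+\S+\s+(?P<online>[YN])\s+(?P<pid>\S+)$"
)


def offline_shd_hosts(status_output):
    """Return the hosts whose Self-heal Daemon row shows Online = N."""
    hosts = []
    for line in status_output.splitlines():
        match = SHD_ROW.match(line.strip())
        if match and match.group("online") == "N":
            hosts.append(match.group("host"))
    return hosts


# Three rows copied from the output above.
sample = """\
Self-heal Daemon on localhost               N/A       N/A      N       N/A
Self-heal Daemon on 10.32.9.5               N/A       N/A    Y       24022
Self-heal Daemon on 10.201.0.3              N/A       N/A    N       N/A
"""

print(offline_shd_hosts(sample))  # ['localhost', '10.201.0.3']
```
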
After the upgrade I see the file /var/lib/glusterd/vols/<vol>/<vol>-shd.vol being created, which doesn't exist on the 6.10 nodes.
In the logs I see these relevant messages:
log: glusterd.log
0-management: Regenerating volfiles due to a max op-version mismatch or glusterd.upgrade file not being present, op_version retrieved:60000, max op_version: 70200
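
The message above reports two numbers: the op_version glusterd retrieved and the maximum op_version it supports. A small sketch for pulling both out of such a log line so they can be compared across nodes; the helper name op_versions is hypothetical, and the pattern assumes only the wording of the message quoted here:

```python
import re

# Tolerates missing or extra whitespace around the two numeric fields.
OP_VERSIONS = re.compile(
    r"op_version\s*retrieved:\s*(\d+),\s*max op_version:\s*(\d+)"
)


def op_versions(log_line):
    """Return (retrieved, maximum) as integers, or None if the line does not match."""
    match = OP_VERSIONS.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = (
    "0-management: Regenerating volfiles due to a max op-version mismatch or "
    "glusterd.upgrade file not being present, op_version retrieved:60000, "
    "max op_version: 70200"
)

retrieved, maximum = op_versions(line)
print(retrieved, maximum)  # 60000 70200
```
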

I think this is because of the shd multiplex (https://bugzilla.redhat.com/show_bug.cgi?id=1659708) added by Rafi.

Rafi, is there any workaround which can work for rolling upgrades? Or should we just do an offline upgrade of all server nodes for the shd to come online?


-Ravi

[2020-10-31 21:48:42.256193] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: tier-enabled
[2020-10-31 21:48:42.256232] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-0
[2020-10-31 21:48:42.256240] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-1
[2020-10-31 21:48:42.256246] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-2
[2020-10-31 21:48:42.256251] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-3
[2020-10-31 21:48:42.256256] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-4
[2020-10-31 21:48:42.256261] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-5
[2020-10-31 21:48:42.256266] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-6
[2020-10-31 21:48:42.256271] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-7
[2020-10-31 21:48:42.256276] W [MSGID: 106204] [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management: Unknown key: brick-8
[2020-10-31 21:51:36.049009] W [MSGID: 106617] [glusterd-svc-helper.c:948:glusterd_attach_svc] 0-glusterd: attach failed for glustershd(volume=backups)
[2020-10-31 21:51:36.049055] E [MSGID: 106048] [glusterd-shd-svc.c:482:glusterd_shdsvc_start] 0-glusterd: Failed to attach shd svc(volume=backups) to pid=9262
[2020-10-31 21:51:36.049138] E [MSGID: 106615] [glusterd-shd-svc.c:638:glusterd_shdsvc_restart] 0-management: Couldn't start shd for vol: backups on restart
[2020-10-31 21:51:36.183133] I [MSGID: 106618] [glusterd-svc-helper.c:901:glusterd_attach_svc] 0-glusterd: adding svc glustershd (volume=backups) to existing process with pid 9262
log: glustershd.log
[2020-10-31 21:49:55.976120] I [MSGID: 100041] [glusterfsd-mgmt.c:1111:glusterfs_handle_svc_attach] 0-glusterfs: received attach request for volfile-id=shd/backups
[2020-10-31 21:49:55.976136] W [MSGID: 100042] [glusterfsd-mgmt.c:1137:glusterfs_handle_svc_attach] 0-glusterfs: got attach for shd/backups but no active graph [Invalid argument]
So I suspect something in the logic for the self-heal daemon has changed, since it has the new *.vol configuration for the shd. The question is: is this just a transitional state until all nodes are upgraded, and thus safe to continue the update? Or is this something that should be fixed, and if so, any clues how?
Thanks Olaf

________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
