Just tried it: stopped all MDS nodes and created one using orch. Result: 0/1 daemons up (1 failed), 1 standby. Same as before, and the logs don’t show any errors either.
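
For reference, that was roughly the following (the fs name "cephfs" and the placement are placeholders for our actual ones):

    # on each MDS host: stop the non-dockerized daemon
    systemctl stop ceph-mds.target

    # deploy a single MDS through the orchestrator
    ceph orch apply mds cephfs --placement="1 host1"

    # it comes up, but stays standby
    ceph fs status cephfs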
I’ll probably try upgrading the orch-based setup to 16.2.6 over the weekend to match the exact non-dockerized MDS version; maybe that will work. (Rough plan sketched at the bottom of this mail, below the quoted thread.)

> On 4 Oct 2021, at 13:41, 胡 玮文 <huw...@outlook.com> wrote:
>
> By saying upgrade, I mean upgrading from the non-dockerized 16.2.5 to the cephadm version 16.2.6. So I think you need to disable standby-replay and reduce the number of ranks to 1, then stop all the non-dockerized MDS daemons and deploy new MDS daemons with cephadm, only scaling back up after you finish the migration. Have you also tried that?
>
> In fact, a similar issue has been reported several times on this list when upgrading MDS to 16.2.6, e.g. [1]. I have faced it too, so I’m pretty confident that you are facing the same issue.
>
> [1]: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KQ5A5OWRIUEOJBC7VILBGDIKPQGJQIWN/
>
>> On 4 Oct 2021, at 19:00, Petr Belyaev <p.bely...@alohi.com> wrote:
>>
>> Hi Weiwen,
>>
>> Yes, we did that during the upgrade. In fact, we did it multiple times even after the upgrade to see whether it would resolve the issue (disabling hot standby, scaling everything down to a single MDS, swapping it with the new one, scaling back up).
>>
>> The upgrade itself went fine; the problems started during the migration to cephadm (which was done after migrating everything to Pacific). The issue only occurs when using a dockerized MDS. Non-dockerized MDS nodes, also on Pacific, run fine.
>>
>> Petr
>>
>>> On 4 Oct 2021, at 12:43, 胡 玮文 <huw...@outlook.com> wrote:
>>>
>>> Hi Petr,
>>>
>>> Please read https://docs.ceph.com/en/latest/cephfs/upgrading/ for the MDS upgrade procedure.
>>>
>>> In short, when upgrading to 16.2.6, you need to disable standby-replay and reduce the number of ranks to 1.
>>>
>>> Weiwen Hu
>>>
>>> Sent from Mail for Windows
>>>
>>> From: Petr Belyaev <p.bely...@alohi.com>
>>> Sent: 4 Oct 2021, 18:00
>>> To: ceph-users@ceph.io
>>> Subject: [ceph-users] MDS not becoming active after migrating to cephadm
>>>
>>> Hi,
>>>
>>> We’ve recently upgraded from Nautilus to Pacific and tried moving our services to cephadm/ceph orch.
>>> For some reason, MDS nodes deployed through orch never become active (or at least standby-replay). Non-dockerized MDS nodes can still be deployed and work fine. The non-dockerized MDS version is 16.2.6; the docker image version is 16.2.5-387-g7282d81d (came as the default).
>>>
>>> In the MDS log, the only related message is the monitors assigning the MDS as standby. Increasing the log level does not help much; it only adds beacon messages.
>>> The monitor log also contains no differences compared to a non-dockerized MDS startup.
>>> The "mds metadata" command output is identical to that of a non-dockerized MDS.
>>>
>>> The only difference I can see in the log is the value in curly braces after the node name, e.g. mds.storage{0:1234ff}. For a dockerized MDS the first value is ffffffff; for a non-dockerized one it is zero. Compat flags are identical.
>>>
>>> Could someone please advise why the dockerized MDS is stuck as a standby? Maybe some config values are missing, or something else?
>>>
>>> Best regards,
>>> Petr
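
P.S. For the record, the rough plan for the weekend, following Weiwen’s procedure plus the cephadm upgrade (the fs name "cephfs", the placement, and the hosts are placeholders for ours):

    # scale down to a single rank and disable standby-replay first
    ceph fs set cephfs allow_standby_replay false
    ceph fs set cephfs max_mds 1

    # once only rank 0 is left: stop the old daemons and redeploy via cephadm
    systemctl stop ceph-mds.target        # on each non-dockerized MDS host
    ceph orch apply mds cephfs --placement="1 host1"

    # upgrade the containerized cluster to match the non-dockerized 16.2.6
    ceph orch upgrade start --ceph-version 16.2.6
    ceph orch upgrade status

    # only scale back up after the MDS is active again
    ceph fs set cephfs max_mds 2
    ceph fs set cephfs allow_standby_replay true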