Just tried it: stopped all MDS nodes and created one using orch. Result: 0/1 
daemons up (1 failed), 1 standby. Same as before, and the logs don’t show any 
errors either.
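
For reference, the sequence followed Weiwen’s suggested steps; a rough sketch 
(the filesystem name “cephfs” is a placeholder here, check `ceph fs ls` for 
the real one):

```shell
# Sketch of the migration steps, not a verified procedure.
# "cephfs" is a placeholder filesystem name; use `ceph fs ls` to find yours.

# Disable standby-replay and reduce to a single active rank
ceph fs set cephfs allow_standby_replay false
ceph fs set cephfs max_mds 1

# Wait for ranks > 0 to stop, then stop the non-dockerized MDS daemons
ceph fs status cephfs
# systemctl stop ceph-mds@<id>   # on each MDS host

# Deploy a replacement MDS via the orchestrator
ceph orch apply mds cephfs --placement="1"

# Scale back up only after the new MDS becomes active
ceph fs set cephfs max_mds 2
```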

I’ll probably try upgrading the orch-based setup to 16.2.6 over the weekend to 
match the exact non-dockerized MDS version; maybe that will work.
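
If it helps anyone, the plan is the standard orchestrator upgrade (a sketch, 
assuming the cluster is already fully cephadm-managed):

```shell
# Upgrade all cephadm-managed daemons to 16.2.6
ceph orch upgrade start --ceph-version 16.2.6

# Monitor progress, then confirm daemon versions afterwards
ceph orch upgrade status
ceph versions
```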


> On 4 Oct 2021, at 13:41, 胡 玮文 <huw...@outlook.com> wrote:
> 
> By saying upgrade, I mean upgrading from the non-dockerized 16.2.5 to cephadm 
> version 16.2.6. So I think you need to disable standby-replay and reduce the 
> number of ranks to 1, then stop all the non-dockerized MDS and deploy new MDS 
> with cephadm, only scaling back up after you finish the migration. Did you 
> also try that?
> 
> In fact, a similar issue has been reported several times on this list when 
> upgrading MDS to 16.2.6, e.g. [1]. I have faced it too, so I’m pretty 
> confident that you are facing the same issue.
> 
> [1]: 
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KQ5A5OWRIUEOJBC7VILBGDIKPQGJQIWN/
> 
>> On 4 Oct 2021, at 19:00, Petr Belyaev <p.bely...@alohi.com> wrote:
>> 
>>  Hi Weiwen,
>> 
>> Yes, we did that during the upgrade. In fact, we did it multiple times even 
>> after the upgrade to see if it would resolve the issue (disabling hot 
>> standby, scaling everything down to a single MDS, swapping it with the new 
>> one, scaling back up).
>> 
>> The upgrade itself went fine; problems started during the migration to 
>> cephadm (which was done after migrating everything to Pacific). The issue 
>> only occurs when using a dockerized MDS. Non-dockerized MDS nodes, also on 
>> Pacific, run fine.
>> 
>> Petr
>> 
>>> On 4 Oct 2021, at 12:43, 胡 玮文 <huw...@outlook.com> wrote:
>>> 
>>> Hi Petr,
>>>  
>>> Please read https://docs.ceph.com/en/latest/cephfs/upgrading/ for the MDS 
>>> upgrade procedure.
>>>  
>>> In short, when upgrading to 16.2.6, you need to disable standby-replay and 
>>> reduce the number of ranks to 1.
>>>  
>>> Weiwen Hu
>>>  
>>> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for Windows
>>>  
>>> From: Petr Belyaev <p.bely...@alohi.com>
>>> Sent: 4 October 2021, 18:00
>>> To: ceph-users@ceph.io
>>> Subject: [ceph-users] MDS not becoming active after migrating to cephadm
>>>  
>>> Hi,
>>> 
>>> We’ve recently upgraded from Nautilus to Pacific and tried moving our 
>>> services to cephadm/ceph orch.
>>> For some reason, MDS nodes deployed through orch never become active (or at 
>>> least standby-replay), while non-dockerized MDS nodes can still be deployed 
>>> and work fine. The non-dockerized MDS version is 16.2.6; the docker image 
>>> version is 16.2.5-387-g7282d81d (the default).
>>> 
>>> In the MDS log, the only related message is the monitors assigning the MDS 
>>> as standby. Increasing the log level does not help much; it only adds 
>>> beacon messages.
>>> The monitor log also contains no differences compared to a non-dockerized 
>>> MDS startup, and the `mds metadata` command output is identical to that of 
>>> a non-dockerized MDS.
>>> 
>>> The only difference I can see in the log is the value in curly braces after 
>>> the node name, e.g. mds.storage{0:1234ff}: for the dockerized MDS, the 
>>> first value is ffffffff, while for the non-dockerized one it’s zero. Compat 
>>> flags are identical.
>>> 
>>> Could someone please advise why the dockerized MDS is stuck as a standby? 
>>> Maybe some config values are missing?
>>> 
>>> Best regards,
>>> Petr
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
