What does
# ceph tell osd.* version

reveal? Are any pre-v0.94.4 hammer OSDs still running, as the error states?
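
If it helps, a rough way to spot any stragglers (assuming each OSD's reply
includes its version string inline):

# ceph tell osd.* version | grep '0\.94'

Any hits would be OSDs still reporting a hammer version.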


On Tue, Mar 28, 2017 at 1:21 AM, Jaime Ibar <ja...@tchpc.tcd.ie> wrote:

> Hi,
>
> I did change the ownership to the ceph user. In fact, the OSD processes are running:
>
> ps aux | grep ceph
> ceph        2199  0.0  2.7 1729044 918792 ?      Ssl  Mar27   0:21
> /usr/bin/ceph-osd --cluster=ceph -i 42 -f --setuser ceph --setgroup ceph
> ceph        2200  0.0  2.7 1721212 911084 ?      Ssl  Mar27   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 18 -f --setuser ceph --setgroup ceph
> ceph        2212  0.0  2.8 1732532 926580 ?      Ssl  Mar27   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 3 -f --setuser ceph --setgroup ceph
> ceph        2215  0.0  2.8 1743552 935296 ?      Ssl  Mar27   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 47 -f --setuser ceph --setgroup ceph
> ceph        2341  0.0  2.7 1715548 908312 ?      Ssl  Mar27   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 51 -f --setuser ceph --setgroup ceph
> ceph        2383  0.0  2.7 1694944 893768 ?      Ssl  Mar27   0:20
> /usr/bin/ceph-osd --cluster=ceph -i 56 -f --setuser ceph --setgroup ceph
> [...]
>
> If I run one of the OSDs with increased debug logging
>
> ceph-osd --debug_osd 5 -i 31
>
> this is what I get in the logs:
>
> [...]
>
> 0 osd.31 14016 done with init, starting boot process
> 2017-03-28 09:19:15.280182 7f083df0c800  1 osd.31 14016 We are healthy,
> booting
> 2017-03-28 09:19:15.280685 7f081cad3700  1 osd.31 14016 osdmap indicates
> one or more pre-v0.94.4 hammer OSDs is running
> [...]
>
> It seems the OSD is running, but Ceph is not aware of it.
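>
> (Presumably the same debug level could also be injected into the
> already-running daemon, e.g.
> ceph tell osd.31 injectargs '--debug-osd 5'
> rather than starting it by hand as above.)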
>
> Thanks
> Jaime
>
>
>
>
> On 27/03/17 21:56, George Mihaiescu wrote:
>
>> Make sure the OSD processes on the Jewel node are running. If you didn't
>> change the ownership to the ceph user, they won't start.
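>>
>> For example (paths assumed, adjust for your layout), something like
>>
>> chown -R ceph:ceph /var/lib/ceph
>>
>> and then restart the OSD services.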
>>
>>
>> On Mar 27, 2017, at 11:53, Jaime Ibar <ja...@tchpc.tcd.ie> wrote:
>>>
>>> Hi all,
>>>
>>> I'm upgrading a Ceph cluster from Hammer 0.94.9 to Jewel 10.2.6.
>>>
>>> The cluster has 3 servers (one mon and one mds each) and another 6
>>> servers with 12 OSDs each.
>>> The mons and MDSs have been successfully upgraded to the latest Jewel
>>> release; however, after upgrading the first OSD server (12 OSDs), Ceph
>>> is not aware of its OSDs and they are marked as down:
>>>
>>> ceph -s
>>>
>>> cluster 4a158d27-f750-41d5-9e7f-26ce4c9d2d45
>>>      health HEALTH_WARN
>>> [...]
>>>             12/72 in osds are down
>>>             noout flag(s) set
>>>      osdmap e14010: 72 osds: 60 up, 72 in; 14641 remapped pgs
>>>             flags noout
>>> [...]
>>>
>>> ceph osd tree
>>>
>>>  3   3.64000         osd.3          down  1.00000          1.00000
>>>  8   3.64000         osd.8          down  1.00000          1.00000
>>> 14   3.64000         osd.14         down  1.00000          1.00000
>>> 18   3.64000         osd.18         down  1.00000          1.00000
>>> 21   3.64000         osd.21         down  1.00000          1.00000
>>> 28   3.64000         osd.28         down  1.00000          1.00000
>>> 31   3.64000         osd.31         down  1.00000          1.00000
>>> 37   3.64000         osd.37         down  1.00000          1.00000
>>> 42   3.64000         osd.42         down  1.00000          1.00000
>>> 47   3.64000         osd.47         down  1.00000          1.00000
>>> 51   3.64000         osd.51         down  1.00000          1.00000
>>> 56   3.64000         osd.56         down  1.00000          1.00000
>>>
>>> If I run this command for one of the down OSDs
>>> ceph osd in 14
>>> osd.14 is already in.
>>> However, Ceph doesn't mark it as up and the cluster health remains
>>> in a degraded state.
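>>>
>>> (To double-check the recorded state, something like
>>> ceph osd dump | grep 'osd\.14 '
>>> should show it as down but in.)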
>>>
>>> Do I have to upgrade all the OSDs to Jewel first?
>>> Any help would be appreciated as I'm running out of ideas.
>>>
>>> Thanks
>>> Jaime
>>>
>>> --
>>>
>>> Jaime Ibar
>>> High Performance & Research Computing, IS Services
>>> Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
>>> http://www.tchpc.tcd.ie/ | ja...@tchpc.tcd.ie
>>> Tel: +353-1-896-3725
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
> --
>
> Jaime Ibar
> High Performance & Research Computing, IS Services
> Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
> http://www.tchpc.tcd.ie/ | ja...@tchpc.tcd.ie
> Tel: +353-1-896-3725
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
