Hi Mart,

I too have followed the upgrade path from Hammer to Jewel. I think it is pretty widely accepted to upgrade between the LTS releases (H -> J) and skip the 'stable' release (I) in the middle.
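One thing that may be worth doing before and after such a jump is checking what every daemon is actually running, so nothing is left behind on the old release. A minimal sketch of how that could be done from the CLI (this is just my own suggestion; it assumes an admin keyring on the node and that the mon id equals the short hostname, as it seems to in Wido's setup):

# Installed package on this node:
ceph --version

# Ask every OSD which version it is actually running:
for id in $(ceph osd ls); do
    ceph tell "osd.$id" version || echo "osd.$id did not answer"
done

# On each monitor node, the admin socket reports the same for the mon
# (mon id == short hostname is an assumption here):
ceph daemon "mon.$(hostname -s)" version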
Thanks

On Fri, Jul 15, 2016 at 9:48 AM, Mart van Santen <m...@greenhost.nl> wrote:
>
> Hi Wido,
>
> Thank you, we are currently in the same process, so this information is very useful. Can you share why you upgraded from Hammer directly to Jewel, is there a reason to skip Infernalis? So, I wonder why you didn't do a hammer->infernalis->jewel upgrade, as that seems the logical path to me.
>
> (We did indeed see the same errors, "Failed to encode map eXXX with expected crc", when upgrading to the latest Hammer.)
>
> Regards,
>
> Mart
>
> On 07/15/2016 03:08 AM, 席智勇 wrote:
>
> good job, thank you for sharing, Wido~
> it's very useful~
>
> 2016-07-14 14:33 GMT+08:00 Wido den Hollander <w...@42on.com>:
>
>> To add, the RGWs upgraded just fine as well.
>>
>> No regions in use here (yet!), so that upgraded as it should.
>>
>> Wido
>>
>> > On 13 July 2016 at 16:56, Wido den Hollander <w...@42on.com> wrote:
>> >
>> > Hello,
>> >
>> > The last 3 days I worked at a customer with an 1800-OSD cluster which had to be upgraded from Hammer 0.94.5 to Jewel 10.2.2.
>> >
>> > The cluster in this case is 99% RGW, but also some RBD.
>> >
>> > I wanted to share some of the things we encountered during this upgrade.
>> >
>> > All 180 nodes are running CentOS 7.1 on an IPv6-only network.
>> >
>> > ** Hammer Upgrade **
>> > At first we upgraded from 0.94.5 to 0.94.7. This went well, except for the fact that the monitors got spammed with these kinds of messages:
>> >
>> > "Failed to encode map eXXX with expected crc"
>> >
>> > Some searching on the list brought me to:
>> >
>> > ceph tell osd.* injectargs -- --clog_to_monitors=false
>> >
>> > This reduced the load on the 5 monitors and made recovery succeed smoothly.
>> >
>> > ** Monitors to Jewel **
>> > The next step was to upgrade the monitors from Hammer to Jewel.
>> >
>> > Using Salt we upgraded the packages, and afterwards it was simple:
>> >
>> > killall ceph-mon
>> > chown -R ceph:ceph /var/lib/ceph
>> > chown -R ceph:ceph /var/log/ceph
>> >
>> > Now, a systemd quirk: 'systemctl start ceph.target' does not work. I had to manually enable the monitor and start it:
>> >
>> > systemctl enable ceph-mon@srv-zmb04-05.service
>> > systemctl start ceph-mon@srv-zmb04-05.service
>> >
>> > Afterwards the monitors were running just fine.
>> >
>> > ** OSDs to Jewel **
>> > To upgrade the OSDs to Jewel we initially used Salt to update the packages on all systems to 10.2.2; we then used a shell script which we ran on one node at a time.
>> >
>> > The failure domain here is 'rack', so we executed this in one rack, then the next one, etc.
>> >
>> > The script can be found on GitHub:
>> > https://gist.github.com/wido/06eac901bd42f01ca2f4f1a1d76c49a6
>> >
>> > Be aware that the chown can take a long, long, very long time!
>> >
>> > We ran into the issue that some OSDs crashed after start, but after trying again they would start:
>> >
>> > "void FileStore::init_temp_collections()"
>> >
>> > I reported this in the tracker as I'm not sure what is happening here:
>> > http://tracker.ceph.com/issues/16672
>> >
>> > ** New OSDs with Jewel **
>> > We also had some new nodes which we wanted to add to the Jewel cluster.
>> >
>> > Using Salt and ceph-disk we ran into a partprobe issue in combination with ceph-disk. There was already a Pull Request for the fix, but it was not included in Jewel 10.2.2.
>> >
>> > We manually applied the PR and it fixed our issues:
>> > https://github.com/ceph/ceph/pull/9330
>> >
>> > Hope this helps other people with their upgrades to Jewel!
>> >
>> > Wido
>
> --
> Mart van Santen
> Greenhost
> E: m...@greenhost.nl
> T: +31 20 4890444
> W: https://greenhost.nl
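For anyone else following the same path, here is a minimal sketch of the kind of per-node sequence Wido describes for the OSDs (stop the daemons, fix ownership, enable and start the Jewel units). To be clear, this is my own reconstruction, not his actual script (see his Gist above for the real thing), and the noout handling, paths and unit names are assumptions to adapt to your own setup:

#!/bin/bash
# Rough per-node OSD upgrade sketch (my own reconstruction, not Wido's Gist).
# Run on one node at a time, one failure domain (rack) at a time.
set -e

# Keep CRUSH from rebalancing while this node's OSDs are down
# (my addition, not mentioned in the original mail).
ceph osd set noout

# Stop the running Hammer OSDs; they were started the pre-systemd way,
# hence the blunt killall, mirroring the 'killall ceph-mon' step above.
killall ceph-osd
sleep 10

# Jewel runs as the 'ceph' user; this chown can take a very long time.
chown -R ceph:ceph /var/lib/ceph
chown -R ceph:ceph /var/log/ceph

# Enable and start each OSD unit explicitly, since 'ceph.target' alone
# did not cut it for the monitors. Assumes the default cluster name and
# data dirs under /var/lib/ceph/osd/ceph-<id>.
for id in $(ls /var/lib/ceph/osd/ | sed 's/^ceph-//'); do
    systemctl enable "ceph-osd@${id}.service"
    systemctl start "ceph-osd@${id}.service"
done

# Once this node's OSDs are back up and the cluster has settled:
ceph osd unset noout

The idea is to run something like this rack by rack and wait for the cluster to report healthy again before moving on, as Wido describes.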
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com