Hi all,

I'm mid-upgrade on a large cluster now. The upgrade is not going smoothly:
it looks like the ceph-mons are getting bombarded by so many of these
crc error warnings that they go into elections.
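
For anyone who wants to gauge how noisy this is on their own cluster, here is
a minimal sketch that counts the warnings per OSD. It assumes the cluster log
lines look like the one quoted further down in this thread; the function name
and every sample line except the first are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the warning quoted later in this thread:
#   "... osd.76 [WRN] failed to encode map e103116 with expected crc"
WARNING_RE = re.compile(
    r"(?P<osd>osd\.\d+) \[WRN\] failed to encode map e(?P<epoch>\d+) "
    r"with expected crc"
)


def count_crc_warnings(lines):
    """Return a Counter mapping OSD name to its number of crc warnings."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("osd")] += 1
    return counts


# The first line is taken from this thread; the rest are hypothetical.
sample = [
    "2016-05-17 10:01:29.314785 osd.76 [WRN] failed to encode map e103116 with expected crc",
    "2016-05-17 10:01:29.400000 osd.12 [WRN] failed to encode map e103116 with expected crc",
    "2016-05-17 10:01:30.100000 osd.76 [WRN] failed to encode map e103117 with expected crc",
    "2016-05-17 10:01:31.000000 osd.3 [INF] some unrelated message",
]

counts = count_crc_warnings(sample)
for osd, n in counts.most_common():
    print(osd, n)
```

To run it against a real log, pass an open file object instead of the sample
list. The log path depends on your setup, so it is left out here.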

Did anyone upgrade a large cluster from 0.94.6 to 0.94.7 ? If not I'd
advise waiting until this is better understood.

Cheers, Dan


On Tue, May 17, 2016 at 2:14 PM, Christian Balzer <ch...@gol.com> wrote:

>
> Hello,
>
> for the record, I did the exact same sequence (no MDS) on my test cluster
> with exactly the same results.
>
> Didn't report it as I assumed it to be a noisier (but harmless)
> upgrade artifact.
>
> Christian
>
> On Tue, 17 May 2016 14:07:21 +0200 Dan van der Ster wrote:
>
> > On Tue, May 17, 2016 at 1:56 PM, Sage Weil <sw...@redhat.com> wrote:
> > > On Tue, 17 May 2016, Dan van der Ster wrote:
> > >> Hi Sage et al,
> > >>
> > >> I'm updating our pre-prod cluster from 0.94.6 to 0.94.7 and after
> > >> upgrading the ceph-mon's I'm getting loads of warnings like:
> > >>
> > >> 2016-05-17 10:01:29.314785 osd.76 [WRN] failed to encode map e103116
> > >> with expected crc
> > >>
> > >> I've seen that error is whitelisted in the qa-suite:
> > >> https://github.com/ceph/ceph-qa-suite/pull/602/files
> > >>
> > >> Is it really harmless? (This is the first time I've seen such a
> > >> warning).
> > >
> > > Are you sure you were upgrading from v0.94.6?
> >
> > Absolutely. I first updated the mons, which I restarted into quorum
> > with 0.94.7. Then any changes to the osdmap triggered the failed to
> > encode warning.
> > The upgrade sequence went like this:
> >
> > Update mons 0.94.6 to 0.94.7, restart, quorum. No warnings.
> > Update mds's 0.94.6 to 0.94.7, restart. Warnings from ~all osds.
> > Update osds 0.94.6 to 0.94.7, restart host by host. The 0.94.6 osds
> > printed warnings, the new OSDs did not.
> >
> > > I don't see anything that
> > > would trigger these warnings going from .6 to .7, which is strange.
> >
> > Could the osdmap GMT hitset changes have caused it? Commits Mar 24 here:
> >
> >    https://github.com/ceph/ceph/compare/v0.94.6...v0.94.7?expand=1
> >
> > > That said, the errors are generally harmless--it just means the
> > > monitors are running a different version of the code and the OSDs are
> > > pulling maps directly from a mon to ensure they are all in sync.  It's
> > > normal during many upgrades, but not expected for this particular
> > > jump...
> >
> > Then I'm curious if others are getting this from 0.94.6 to 0.94.7.
> > For now I'm waiting to update our prod cluster.
> >
> > Thanks!
> >
> > Dan
> >
> >
> > >
> > > sage
> > >
> > >
> > >
> > >
> > >> Thanks in advance!
> > >>
> > >> Dan
> > >>
> > >>
> > >>
> > >>
> > >> On Fri, May 13, 2016 at 4:21 PM, Sage Weil <s...@redhat.com> wrote:
> > >> > This Hammer point release fixes several minor bugs. It also
> > >> > includes a backport of an improved ‘ceph osd
> > >> > reweight-by-utilization’ command for handling OSDs with
> > >> > higher-than-average utilizations.
> > >> >
> > >> > We recommend that all hammer v0.94.x users upgrade.
> > >> >
> > >> > For more detailed information, see the release announcement at
> > >> >
> > >> >         http://ceph.com/releases/v0-94-7-hammer-released/
> > >> >
> > >> > or the complete changelog at
> > >> >
> > >> >         http://docs.ceph.com/docs/master/_downloads/v0.94.6.txt
> > >> >
> > >> > Getting Ceph
> > >> > ------------
> > >> >
> > >> > * Git at git://github.com/ceph/ceph.git
> > >> > * Tarball at http://download.ceph.com/tarballs/ceph-0.94.7.tar.gz
> > > >> > * For packages, see
> > > >> > http://ceph.com/docs/master/install/get-packages
> > >> > * For ceph-deploy, see
> > >> > http://ceph.com/docs/master/install/install-ceph-deploy
> > >> > _______________________________________________ ceph-users mailing
> > >> > list ceph-users@lists.ceph.com
> > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >> >
> > >>
> > >>
>
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
>