Sorry, I forgot to hit reply all.

That did it, I'm getting a "HEALTH_OK"!! Now I can move on with the
process! Thanks guys, hopefully you won't see me back here too much ;)


On Wed, May 1, 2013 at 5:43 PM, Gregory Farnum <g...@inktank.com> wrote:

> [ Please keep all discussions on the list. :) ]
>
> Okay, so you've now got just 128 that are sad. Those are all in pool
> 2, which I believe is "rbd" — you'll need to set your replication
> level to 1 on all pools and that should fix it. :)
> Keep in mind that with 1x replication you've only got 1 copy of
> everything though, so if you lose one disk you're going to lose data.
> You really want to get enough disks to set 2x replication.
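> For example, something along these lines should bring all three default
> pools down to 1x (pool names assumed here to be the defaults data,
> metadata, and rbd):
>
>   ceph osd pool set data size 1
>   ceph osd pool set metadata size 1
>   ceph osd pool set rbd size 1
>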
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Wed, May 1, 2013 at 2:34 PM, Wyatt Gorman
> <wyattgor...@wyattgorman.com> wrote:
> > ceph -s
> >    health HEALTH_WARN 128 pgs degraded; 128 pgs stuck unclean
> >    monmap e1: 1 mons at {a=10.81.2.100:6789/0}, election epoch 1, quorum 0 a
> >    osdmap e40: 1 osds: 1 up, 1 in
> >     pgmap v759: 384 pgs: 256 active+clean, 128 active+degraded; 8699 bytes data, 3430 MB used, 47828 MB / 54002 MB avail
> >    mdsmap e41: 1/1/1 up {0=a=up:active}
> >
> >
> > http://pastebin.com/0d7UM5s4
> >
> > Thanks for your help, Greg.
> >
> >
> > On Wed, May 1, 2013 at 4:41 PM, Gregory Farnum <g...@inktank.com> wrote:
> >>
> >> On Wed, May 1, 2013 at 1:32 PM, Dino Yancey <dino2...@gmail.com> wrote:
> >> > Hi Wyatt,
> >> >
> >> > This is almost certainly a configuration issue.  If I recall, there is
> >> > a min_size setting in the CRUSH rules for each pool that defaults to
> >> > two, which you may also need to reduce to one.  I don't have the
> >> > documentation in front of me, so that's just off the top of my head...
> >>
> >> Hmm, no. The min_size should be set automatically to 1/2 of the
> >> specified size (rounded up), which would be 1 in this case.
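> >> (If you want to double-check, the pool lines in "ceph osd dump" should
> >> show the replication size, and on recent versions the min_size, for
> >> each pool.)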
> >> What's the full output of ceph -s? Can you pastebin the output of
> >> "ceph pg dump" please?
> >> -Greg
> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
> >>
> >> >
> >> > Dino
> >> >
> >> >
> >> > On Wed, May 1, 2013 at 3:19 PM, Wyatt Gorman
> >> > <wyattgor...@wyattgorman.com>
> >> > wrote:
> >> >>
> >> >> Okay! Dino, thanks for your response. I reduced my metadata pool size
> >> >> and data pool size to 1, which eliminated the "recovery 21/42 degraded
> >> >> (50.000%)" at the end of my HEALTH_WARN error. So now, when I run
> >> >> "ceph health" I get the following:
> >> >>
> >> >> HEALTH_WARN 384 pgs degraded; 384 pgs stale; 384 pgs stuck unclean
> >> >>
> >> >> So this seems to be from one single root cause. Any ideas? Again, is
> >> >> this a corrupted drive issue that I can clean up, or is this still a
> >> >> ceph configuration error?
> >> >>
> >> >>
> >> >> On Wed, May 1, 2013 at 12:52 PM, Dino Yancey <dino2...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Hi Wyatt,
> >> >>>
> >> >>> You need to reduce the replication level on your existing pools to 1,
> >> >>> or bring up another OSD.  The default configuration specifies a
> >> >>> replication level of 2, and the default CRUSH rules want to place a
> >> >>> replica on two distinct OSDs.  With one OSD, CRUSH can't determine
> >> >>> placement for the replica, and so Ceph is reporting a degraded state.
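> >> >>> For example, something like "ceph osd pool set <poolname> size 1",
> >> >>> run once for each pool in the cluster, should drop the replication
> >> >>> level to 1.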
> >> >>>
> >> >>> Dino
> >> >>>
> >> >>>
> >> >>> On Wed, May 1, 2013 at 11:45 AM, Wyatt Gorman
> >> >>> <wyattgor...@wyattgorman.com> wrote:
> >> >>>>
> >> >>>> Well, those points solved the issue of the redefined host and the
> >> >>>> unidentified protocol. The
> >> >>>>
> >> >>>> "HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42
> >> >>>> degraded (50.000%)"
> >> >>>>
> >> >>>> error is still an issue, though. Is this something simple like some
> >> >>>> hard drive corruption that I can clean up with a fsck, or is this a
> >> >>>> ceph issue?
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> On Wed, May 1, 2013 at 12:31 PM, Mike Dawson
> >> >>>> <mike.daw...@scholarstack.com> wrote:
> >> >>>>>
> >> >>>>> Wyatt,
> >> >>>>>
> >> >>>>> A few notes:
> >> >>>>>
> >> >>>>> - Yes, the second "host = ceph" under mon.a is redundant and
> >> >>>>> should be deleted.
> >> >>>>>
> >> >>>>> - "auth client required = cephx [osd]" should be simply
> >> >>>>> "auth client required = cephx" (see the cleaned-up snippet below).
> >> >>>>>
> >> >>>>> - Looks like you only have one OSD. You need at least as many OSDs
> >> >>>>> as the highest replication level of your pools (and hopefully more).
> >> >>>>>
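> >> >>>>> A minimal sketch of how those sections might look after the first
> >> >>>>> two fixes (using only values already present in your ceph.conf):
> >> >>>>>
> >> >>>>>     auth cluster required = cephx
> >> >>>>>     auth service required = cephx
> >> >>>>>     auth client required = cephx
> >> >>>>>
> >> >>>>> [mon.a]
> >> >>>>>     host = ceph
> >> >>>>>     mon addr = 10.81.2.100:6789
> >> >>>>>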
> >> >>>>> Mike
> >> >>>>>
> >> >>>>>
> >> >>>>> On 5/1/2013 12:23 PM, Wyatt Gorman wrote:
> >> >>>>>>
> >> >>>>>> Here is my ceph.conf. I just figured out that the second "host ="
> >> >>>>>> isn't necessary, though it is like that on the 5-minute quick start
> >> >>>>>> guide... (Perhaps I'll submit the couple of fixes that I've had to
> >> >>>>>> implement so far.) That fixes the "redefined host" issue, but none
> >> >>>>>> of the others.
> >> >>>>>>
> >> >>>>>> [global]
> >> >>>>>>      # For version 0.55 and beyond, you must explicitly enable or
> >> >>>>>>      # disable authentication with "auth" entries in [global].
> >> >>>>>>
> >> >>>>>>      auth cluster required = cephx
> >> >>>>>>      auth service required = cephx
> >> >>>>>>      auth client required = cephx [osd]
> >> >>>>>>      osd journal size = 1000
> >> >>>>>>
> >> >>>>>>      #The following assumes ext4 filesystem.
> >> >>>>>>      filestore xattr use omap = true
> >> >>>>>>      # For Bobtail (v 0.56) and subsequent versions, you may add
> >> >>>>>>      # settings for mkcephfs so that it will create and mount the
> >> >>>>>>      # file system on a particular OSD for you. Remove the comment
> >> >>>>>>      # `#` character for the following settings and replace the
> >> >>>>>>      # values in braces with appropriate values, or leave the
> >> >>>>>>      # following settings commented out to accept the default
> >> >>>>>>      # values. You must specify the --mkfs option with mkcephfs in
> >> >>>>>>      # order for the deployment script to utilize the following
> >> >>>>>>      # settings, and you must define the 'devs' option for each
> >> >>>>>>      # osd instance; see below.
> >> >>>>>>      #osd mkfs type = {fs-type}
> >> >>>>>>      #osd mkfs options {fs-type} = {mkfs options}  # default for xfs is "-f"
> >> >>>>>>      #osd mount options {fs-type} = {mount options}  # default mount option is "rw,noatime"
> >> >>>>>>      # For example, for ext4, the mount option might look like this:
> >> >>>>>>
> >> >>>>>>      #osd mkfs options ext4 = user_xattr,rw,noatime
> >> >>>>>>      # Execute $ hostname to retrieve the name of your host, and
> >> >>>>>>      # replace {hostname} with the name of your host. For the
> >> >>>>>>      # monitor, replace {ip-address} with the IP address of your
> >> >>>>>>      # host.
> >> >>>>>> [mon.a]
> >> >>>>>>      host = ceph
> >> >>>>>>      mon addr = 10.81.2.100:6789
> >> >>>>>>
> >> >>>>>> [osd.0]
> >> >>>>>>      host = ceph
> >> >>>>>>
> >> >>>>>>      # For Bobtail (v 0.56) and subsequent versions, you may add
> >> >>>>>>      # settings for mkcephfs so that it will create and mount the
> >> >>>>>>      # file system on a particular OSD for you. Remove the comment
> >> >>>>>>      # `#` character for the following setting for each OSD and
> >> >>>>>>      # specify a path to the device if you use mkcephfs with the
> >> >>>>>>      # --mkfs option.
> >> >>>>>>
> >> >>>>>>      #devs = {path-to-device}
> >> >>>>>> [osd.1]
> >> >>>>>>      host = ceph
> >> >>>>>>      #devs = {path-to-device}
> >> >>>>>> [mds.a]
> >> >>>>>>      host = ceph
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Wed, May 1, 2013 at 12:14 PM, Mike Dawson
> >> >>>>>> <mike.daw...@scholarstack.com
> >> >>>>>> <mailto:mike.daw...@scholarstack.com>>
> >> >>>>>> wrote:
> >> >>>>>>
> >> >>>>>>     Wyatt,
> >> >>>>>>
> >> >>>>>>     Please post your ceph.conf.
> >> >>>>>>
> >> >>>>>>     - mike
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>     On 5/1/2013 12:06 PM, Wyatt Gorman wrote:
> >> >>>>>>
> >> >>>>>>         Hi everyone,
> >> >>>>>>
> >> >>>>>>         I'm setting up a test ceph cluster and am having trouble
> >> >>>>>>         getting it running (great for testing, huh?). I went
> >> >>>>>>         through the installation on Debian squeeze and had to
> >> >>>>>>         modify the mkcephfs script a bit, because it calls
> >> >>>>>>         monmaptool with too many parameters in the $args variable
> >> >>>>>>         (mine had "--add a [ip address]:[port] [osd1]" and I had
> >> >>>>>>         to get rid of the [osd1] part for the monmaptool command
> >> >>>>>>         to take it). Anyway, I got it installed, started the
> >> >>>>>>         service, waited a little while for it to build the fs, and
> >> >>>>>>         ran "ceph health" and got (and am still getting after a
> >> >>>>>>         day and a reboot) the following error: (note: I have also
> >> >>>>>>         been getting the first line in various calls; I'm unsure
> >> >>>>>>         why it is complaining, since I followed the
> >> >>>>>>         instructions...)
> >> >>>>>>
> >> >>>>>>         warning: line 34: 'host' in section 'mon.a' redefined
> >> >>>>>>         2013-05-01 12:04:39.801102 b733b710 -1 WARNING: unknown
> >> >>>>>>         auth protocol defined: [osd]
> >> >>>>>>         HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean;
> >> >>>>>>         recovery 21/42 degraded (50.000%)
> >> >>>>>>
> >> >>>>>>         Can anybody tell me the root of this issue, and how I can
> >> >>>>>>         fix it? Thank you!
> >> >>>>>>
> >> >>>>>>         - Wyatt Gorman
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> ______________________________
> >> >>> Dino Yancey
> >> >>> 2GNT.com Admin
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > ______________________________
> >> > Dino Yancey
> >> > 2GNT.com Admin
> >> >
> >> >
> >
> >
>
