Sorry, I forgot to hit reply all. That did it, I'm getting a "HEALTH_OK"!! Now I can move on with the process! Thanks guys, hopefully you won't see me back here too much ;)
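
(What did it, per the thread below, was dropping the replication level to 1 on every pool of this single-OSD test cluster; a minimal sketch, assuming the default data/metadata/rbd pool names:)

    ceph osd pool set data size 1
    ceph osd pool set metadata size 1
    ceph osd pool set rbd size 1

    # re-check once the placement groups settle
    ceph health
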
On Wed, May 1, 2013 at 5:43 PM, Gregory Farnum <g...@inktank.com> wrote:
> [ Please keep all discussions on the list. :) ]
>
> Okay, so you've now got just 128 that are sad. Those are all in pool
> 2, which I believe is "rbd" — you'll need to set your replication
> level to 1 on all pools and that should fix it. :)
> Keep in mind that with 1x replication you've only got 1 copy of
> everything though, so if you lose one disk you're going to lose data.
> You really want to get enough disks to set 2x replication.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Wed, May 1, 2013 at 2:34 PM, Wyatt Gorman
> <wyattgor...@wyattgorman.com> wrote:
> > ceph -s
> >    health HEALTH_WARN 128 pgs degraded; 128 pgs stuck unclean
> >    monmap e1: 1 mons at {a=10.81.2.100:6789/0}, election epoch 1, quorum 0 a
> >    osdmap e40: 1 osds: 1 up, 1 in
> >    pgmap v759: 384 pgs: 256 active+clean, 128 active+degraded; 8699 bytes
> >      data, 3430 MB used, 47828 MB / 54002 MB avail
> >    mdsmap e41: 1/1/1 up {0=a=up:active}
> >
> > http://pastebin.com/0d7UM5s4
> >
> > Thanks for your help, Greg.
> >
> >
> > On Wed, May 1, 2013 at 4:41 PM, Gregory Farnum <g...@inktank.com> wrote:
> >> On Wed, May 1, 2013 at 1:32 PM, Dino Yancey <dino2...@gmail.com> wrote:
> >> > Hi Wyatt,
> >> >
> >> > This is almost certainly a configuration issue. If I recall, there is a
> >> > min_size setting in the CRUSH rules for each pool that defaults to two,
> >> > which you may also need to reduce to one. I don't have the
> >> > documentation in front of me, so that's just off the top of my head...
> >>
> >> Hmm, no. The min_size should be set automatically to 1/2 of the
> >> specified size (rounded up), which would be 1 in this case.
> >> What's the full output of ceph -s? Can you pastebin the output of
> >> "ceph pg dump" please?
> >> -Greg
> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
> >>
> >> > Dino
> >> >
> >> >
> >> > On Wed, May 1, 2013 at 3:19 PM, Wyatt Gorman
> >> > <wyattgor...@wyattgorman.com> wrote:
> >> >> Okay! Dino, thanks for your response. I reduced my metadata pool size
> >> >> and data pool size to 1, which eliminated the "recovery 21/42 degraded
> >> >> (50.000%)" at the end of my HEALTH_WARN error. So now, when I run
> >> >> "ceph health" I get the following:
> >> >>
> >> >>     HEALTH_WARN 384 pgs degraded; 384 pgs stale; 384 pgs stuck unclean
> >> >>
> >> >> So this seems to be from one single root cause. Any ideas? Again, is
> >> >> this a corrupted drive issue that I can clean up, or is this still a
> >> >> ceph configuration error?
> >> >>
> >> >>
> >> >> On Wed, May 1, 2013 at 12:52 PM, Dino Yancey <dino2...@gmail.com> wrote:
> >> >>> Hi Wyatt,
> >> >>>
> >> >>> You need to reduce the replication level on your existing pools to 1,
> >> >>> or bring up another OSD. The default configuration specifies a
> >> >>> replication level of 2, and the default CRUSH rules want to place a
> >> >>> replica on two distinct OSDs. With one OSD, CRUSH can't determine
> >> >>> placement for the replica and so Ceph is reporting a degraded state.
> >> >>>
> >> >>> Dino
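
(A quick way to see the mismatch Dino describes, i.e. one OSD in service but pools configured for two replicas; a sketch only, and the exact output format varies by Ceph version:)

    ceph osd tree
    ceph osd dump | grep '^pool'
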
> >> >>> On Wed, May 1, 2013 at 11:45 AM, Wyatt Gorman
> >> >>> <wyattgor...@wyattgorman.com> wrote:
> >> >>>> Well, those points solved the issue of the redefined host and the
> >> >>>> unidentified protocol. The
> >> >>>>
> >> >>>>     "HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery
> >> >>>>     21/42 degraded (50.000%)"
> >> >>>>
> >> >>>> error is still an issue, though. Is this something simple like some
> >> >>>> hard drive corruption that I can clean up with a fsck, or is this a
> >> >>>> ceph issue?
> >> >>>>
> >> >>>>
> >> >>>> On Wed, May 1, 2013 at 12:31 PM, Mike Dawson
> >> >>>> <mike.daw...@scholarstack.com> wrote:
> >> >>>>> Wyatt,
> >> >>>>>
> >> >>>>> A few notes:
> >> >>>>>
> >> >>>>> - Yes, the second "host = ceph" under mon.a is redundant and should
> >> >>>>>   be deleted.
> >> >>>>>
> >> >>>>> - "auth client required = cephx [osd]" should be simply
> >> >>>>>   "auth client required = cephx".
> >> >>>>>
> >> >>>>> - Looks like you only have one OSD. You need at least as many (and
> >> >>>>>   hopefully more) OSDs as the highest replication level among your
> >> >>>>>   pools.
> >> >>>>>
> >> >>>>> Mike
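
(Folding Mike's notes back into the file, the affected sections would look roughly like the sketch below; it is not a complete ceph.conf:)

    [global]
            auth cluster required = cephx
            auth service required = cephx
            # no stray "[osd]" token at the end of the next line
            auth client required = cephx

    [mon.a]
            # a single host line; a duplicate here is what triggers the
            # "'host' in section 'mon.a' redefined" warning
            host = ceph
            mon addr = 10.81.2.100:6789
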
> >> >>>>> On 5/1/2013 12:23 PM, Wyatt Gorman wrote:
> >> >>>>>> Here is my ceph.conf. I just figured out that the second "host ="
> >> >>>>>> isn't necessary, though it is like that on the 5-minute quick start
> >> >>>>>> guide... (Perhaps I'll submit my couple of fixes that I've had to
> >> >>>>>> implement so far.) That fixes the "redefined host" issue, but none
> >> >>>>>> of the others.
> >> >>>>>>
> >> >>>>>> [global]
> >> >>>>>>         # For version 0.55 and beyond, you must explicitly enable or
> >> >>>>>>         # disable authentication with "auth" entries in [global].
> >> >>>>>>
> >> >>>>>>         auth cluster required = cephx
> >> >>>>>>         auth service required = cephx
> >> >>>>>>         auth client required = cephx [osd]
> >> >>>>>>         osd journal size = 1000
> >> >>>>>>
> >> >>>>>>         # The following assumes ext4 filesystem.
> >> >>>>>>         filestore xattr use omap = true
> >> >>>>>>
> >> >>>>>>         # For Bobtail (v 0.56) and subsequent versions, you may add
> >> >>>>>>         # settings for mkcephfs so that it will create and mount the
> >> >>>>>>         # file system on a particular OSD for you. Remove the comment
> >> >>>>>>         # `#` character for the following settings and replace the
> >> >>>>>>         # values in braces with appropriate values, or leave the
> >> >>>>>>         # following settings commented out to accept the default
> >> >>>>>>         # values. You must specify the --mkfs option with mkcephfs
> >> >>>>>>         # in order for the deployment script to utilize the following
> >> >>>>>>         # settings, and you must define the 'devs' option for each
> >> >>>>>>         # osd instance; see below.
> >> >>>>>>         # osd mkfs type = {fs-type}
> >> >>>>>>         # osd mkfs options {fs-type} = {mkfs options}
> >> >>>>>>         #     # default for xfs is "-f"
> >> >>>>>>         # osd mount options {fs-type} = {mount options}
> >> >>>>>>         #     # default mount option is "rw,noatime"
> >> >>>>>>         # For example, for ext4, the mount option might look like this:
> >> >>>>>>         #osd mkfs options ext4 = user_xattr,rw,noatime
> >> >>>>>>
> >> >>>>>>         # Execute $ hostname to retrieve the name of your host, and
> >> >>>>>>         # replace {hostname} with the name of your host. For the
> >> >>>>>>         # monitor, replace {ip-address} with the IP address of your
> >> >>>>>>         # host.
> >> >>>>>>
> >> >>>>>> [mon.a]
> >> >>>>>>         host = ceph
> >> >>>>>>         mon addr = 10.81.2.100:6789
> >> >>>>>>
> >> >>>>>> [osd.0]
> >> >>>>>>         host = ceph
> >> >>>>>>
> >> >>>>>>         # For Bobtail (v 0.56) and subsequent versions, you may add
> >> >>>>>>         # settings for mkcephfs so that it will create and mount the
> >> >>>>>>         # file system on a particular OSD for you. Remove the comment
> >> >>>>>>         # `#` character for the following setting for each OSD and
> >> >>>>>>         # specify a path to the device if you use mkcephfs with the
> >> >>>>>>         # --mkfs option.
> >> >>>>>>         #devs = {path-to-device}
> >> >>>>>>
> >> >>>>>> [osd.1]
> >> >>>>>>         host = ceph
> >> >>>>>>         #devs = {path-to-device}
> >> >>>>>>
> >> >>>>>> [mds.a]
> >> >>>>>>         host = ceph
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Wed, May 1, 2013 at 12:14 PM, Mike Dawson
> >> >>>>>> <mike.daw...@scholarstack.com> wrote:
> >> >>>>>>
> >> >>>>>>     Wyatt,
> >> >>>>>>
> >> >>>>>>     Please post your ceph.conf.
> >> >>>>>>
> >> >>>>>>     - mike
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>     On 5/1/2013 12:06 PM, Wyatt Gorman wrote:
> >> >>>>>>
> >> >>>>>>         Hi everyone,
> >> >>>>>>
> >> >>>>>>         I'm setting up a test ceph cluster and am having trouble
> >> >>>>>>         getting it running (great for testing, huh?). I went through
> >> >>>>>>         the installation on Debian squeeze, and had to modify the
> >> >>>>>>         mkcephfs script a bit because it calls monmaptool with too
> >> >>>>>>         many parameters in the $args variable (mine had
> >> >>>>>>         "--add a [ip address]:[port] [osd1]" and I had to get rid of
> >> >>>>>>         the [osd1] part for the monmaptool command to take it).
> >> >>>>>>         Anyway, I got it installed, started the service, waited a
> >> >>>>>>         little while for it to build the fs, and ran "ceph health",
> >> >>>>>>         and got (and am still getting, after a day and a reboot) the
> >> >>>>>>         following error. (Note: I have also been getting the first
> >> >>>>>>         line in various calls; I'm unsure why it is complaining, as
> >> >>>>>>         I followed the instructions...)
> >> >>>>>>
> >> >>>>>>         warning: line 34: 'host' in section 'mon.a' redefined
> >> >>>>>>         2013-05-01 12:04:39.801102 b733b710 -1 WARNING: unknown auth
> >> >>>>>>         protocol defined: [osd]
> >> >>>>>>         HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery
> >> >>>>>>         21/42 degraded (50.000%)
> >> >>>>>>
> >> >>>>>>         Can anybody tell me the root of this issue, and how I can
> >> >>>>>>         fix it? Thank you!
> >> >>>>>>
> >> >>>>>>         - Wyatt Gorman
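
(On the mkcephfs detour in the original message: monmaptool expects a monitor name, an address, and an output file, so a well-formed call looks roughly like the sketch below; /tmp/monmap is just a placeholder path, and the stray "[osd1]" token is the extra argument it rejected.)

    monmaptool --create --add a 10.81.2.100:6789 /tmp/monmap
    # sanity-check the result
    monmaptool --print /tmp/monmap
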
> >> >>>
> >> >>> --
> >> >>> ______________________________
> >> >>> Dino Yancey
> >> >>> 2GNT.com Admin
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com