Hi Wyatt,

This is almost certainly a configuration issue.  If I recall correctly, there
is also a min_size setting for each pool that defaults to two, which you may
need to reduce to one as well.  I don't have the documentation in front of
me, so that's just off the top of my head...
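
Something along these lines should do it, if your version exposes min_size
as a pool setting (a sketch from memory, assuming the default "data" and
"metadata" pools):

    ceph osd pool set data min_size 1
    ceph osd pool set metadata min_size 1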

Dino


On Wed, May 1, 2013 at 3:19 PM, Wyatt Gorman <wyattgor...@wyattgorman.com> wrote:

> Okay! Dino, thanks for your response. I reduced my metadata pool size and
> data pool size to 1, which eliminated the "recovery 21/42 degraded
> (50.000%)" at the end of my HEALTH_WARN error. So now, when I run "ceph
> health" I get the following:
>
> HEALTH_WARN 384 pgs degraded; 384 pgs stale; 384 pgs stuck unclean
>
> So this all seems to stem from a single root cause. Any ideas? Again, is
> this a corrupted-drive issue that I can clean up, or is this still a Ceph
> configuration error?
>
>
> On Wed, May 1, 2013 at 12:52 PM, Dino Yancey <dino2...@gmail.com> wrote:
>
>> Hi Wyatt,
>>
>> You need to reduce the replication level on your existing pools to 1, or
>> bring up another OSD.  The default configuration specifies a replication
>> level of 2, and the default CRUSH rules want to place each replica on a
>> distinct OSD.  With only one OSD, CRUSH can't find a placement for the
>> second replica, so Ceph reports a degraded state.
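>>
>> Something like this should do it (a sketch, assuming the default "data"
>> and "metadata" pools; adjust the names if yours differ):
>>
>>     ceph osd pool set data size 1
>>     ceph osd pool set metadata size 1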
>>
>> Dino
>>
>>
>> On Wed, May 1, 2013 at 11:45 AM, Wyatt Gorman <
>> wyattgor...@wyattgorman.com> wrote:
>>
>>> Well, those points solved the redefined-host and unknown-auth-protocol
>>> issues. The
>>>
>>> "HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42
>>> degraded (50.000%)"
>>>
>>> error is still there, though. Is this something simple like hard drive
>>> corruption that I can clean up with fsck, or is this a Ceph issue?
>>>
>>>
>>>
>>> On Wed, May 1, 2013 at 12:31 PM, Mike Dawson <
>>> mike.daw...@scholarstack.com> wrote:
>>>
>>>> Wyatt,
>>>>
>>>> A few notes:
>>>>
>>>> - Yes, the second "host = ceph" under mon.a is redundant and should be
>>>> deleted.
>>>>
>>>> - "auth client required = cephx [osd]" should be simply
>>>> auth client required = cephx".
>>>>
>>>> - It looks like you only have one OSD. You need at least as many (and
>>>> hopefully more) OSDs as the highest replication level among your pools.
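>>>>
>>>> For reference, the auth lines in [global] should read (the rest of the
>>>> file can stay as it is):
>>>>
>>>>     auth cluster required = cephx
>>>>     auth service required = cephx
>>>>     auth client required = cephx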
>>>>
>>>> Mike
>>>>
>>>>
>>>> On 5/1/2013 12:23 PM, Wyatt Gorman wrote:
>>>>
>>>>> Here is my ceph.conf. I just figured out that the second "host =" line
>>>>> isn't necessary, though it appears that way in the 5-minute quick start
>>>>> guide... (Perhaps I'll submit the couple of fixes I've had to make so
>>>>> far.) That fixes the "redefined host" warning, but none of the others.
>>>>>
>>>>> [global]
>>>>>      # For version 0.55 and beyond, you must explicitly enable or
>>>>>      # disable authentication with "auth" entries in [global].
>>>>>
>>>>>      auth cluster required = cephx
>>>>>      auth service required = cephx
>>>>>      auth client required = cephx [osd]
>>>>>      osd journal size = 1000
>>>>>
>>>>>      #The following assumes ext4 filesystem.
>>>>>      filestore xattr use omap = true
>>>>>      # For Bobtail (v 0.56) and subsequent versions, you may add
>>>>>      #settings for mkcephfs so that it will create and mount the file
>>>>>      #system on a particular OSD for you. Remove the comment `#`
>>>>>      #character for the following settings and replace the values in
>>>>>      #braces with appropriate values, or leave the following settings
>>>>>      #commented out to accept the default values. You must specify
>>>>>      #the --mkfs option with mkcephfs in order for the deployment
>>>>>      #script to utilize the following settings, and you must define
>>>>>      #the 'devs' option for each osd instance; see below.
>>>>>      #osd mkfs type = {fs-type}
>>>>>      #osd mkfs options {fs-type} = {mkfs options}  # default for xfs is "-f"
>>>>>      #osd mount options {fs-type} = {mount options}  # default mount option is "rw,noatime"
>>>>>      # For example, for ext4, the mount option might look like this:
>>>>>
>>>>>      #osd mkfs options ext4 = user_xattr,rw,noatime
>>>>>      # Execute $ hostname to retrieve the name of your host, and
>>>>>      # replace {hostname} with the name of your host. For the
>>>>>      # monitor, replace {ip-address} with the IP address of your
>>>>>      # host.
>>>>> [mon.a]
>>>>>      host = ceph
>>>>>      mon addr = 10.81.2.100:6789
>>>>>
>>>>> [osd.0]
>>>>>      host = ceph
>>>>>
>>>>>      # For Bobtail (v 0.56) and subsequent versions, you may add
>>>>>      # settings for mkcephfs so that it will create and mount the
>>>>>      # file system on a particular OSD for you. Remove the comment
>>>>>      # `#` character for the following setting for each OSD and
>>>>>      # specify a path to the device if you use mkcephfs with the
>>>>>      # --mkfs option.
>>>>>
>>>>>      #devs = {path-to-device}
>>>>> [osd.1]
>>>>>      host = ceph
>>>>>      #devs = {path-to-device}
>>>>> [mds.a]
>>>>>      host = ceph
>>>>>
>>>>>
>>>>> On Wed, May 1, 2013 at 12:14 PM, Mike Dawson
>>>>> <mike.daw...@scholarstack.com>
>>>>> wrote:
>>>>>
>>>>>     Wyatt,
>>>>>
>>>>>     Please post your ceph.conf.
>>>>>
>>>>>     - mike
>>>>>
>>>>>
>>>>>     On 5/1/2013 12:06 PM, Wyatt Gorman wrote:
>>>>>
>>>>>         Hi everyone,
>>>>>
>>>>>         I'm setting up a test Ceph cluster and am having trouble
>>>>>         getting it running (great for testing, huh?). I went through
>>>>>         the installation on Debian squeeze and had to modify the
>>>>>         mkcephfs script a bit, because it calls monmaptool with too
>>>>>         many parameters in the $args variable (mine had
>>>>>         "--add a [ip address]:[port] [osd1]", and I had to get rid of
>>>>>         the [osd1] part for the monmaptool command to accept it).
>>>>>         Anyway, I got it installed, started the service, waited a
>>>>>         little while for it to build the fs, and ran "ceph health".
>>>>>         I got (and am still getting after a day and a reboot) the
>>>>>         following error (note: I have also been getting the first
>>>>>         line in various calls; I'm unsure why it is complaining,
>>>>>         since I followed the instructions...):
>>>>>
>>>>>         warning: line 34: 'host' in section 'mon.a' redefined
>>>>>         2013-05-01 12:04:39.801102 b733b710 -1 WARNING: unknown auth protocol defined: [osd]
>>>>>         HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%)
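>>>>>
>>>>>         (For reference, the trimmed monmaptool call looked roughly
>>>>>         like this, with /tmp/monmap standing in for whatever path the
>>>>>         mkcephfs script actually used:
>>>>>
>>>>>             monmaptool --create --add a 10.81.2.100:6789 /tmp/monmap )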
>>>>>
>>>>>         Can anybody tell me the root of this issue, and how I can fix
>>>>>         it? Thank you!
>>>>>
>>>>>         - Wyatt Gorman
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>
>>
>> --
>> ______________________________
>> Dino Yancey
>> 2GNT.com Admin
>>
>
>


-- 
______________________________
Dino Yancey
2GNT.com Admin
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
