Hi,
thank you for your answer:
On 11/13/2014 06:17 PM, Gregory Farnum wrote:
What does "ceph -s" output when things are working?
BEFORE the problem (from ceph -w, because I don't have ceph -s output):
[rzgceph@admin-node my-cluster]$ ceph -w
cluster 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
health HEALTH_OK
monmap e3: 3 mons at
{ceph-node1=192.168.122.21:6789/0,ceph-node2=192.168.122.22:6789/0,ceph-node3=192.168.122.23:6789/0},
election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
mdsmap e4: 1/1/1 up {0=ceph-node1=up:active}
osdmap e13: 3 osds: 3 up, 3 in
pgmap v26: 192 pgs, 3 pools, 1889 bytes data, 21 objects
103 MB used, 76655 MB / 76759 MB avail
192 active+clean
2014-11-13 17:08:43.240961 mon.0 [INF] pgmap v26: 192 pgs: 192
active+clean; 1889 bytes data, 103 MB used, 76655 MB / 76759 MB avail; 8
B/s wr, 0 op/s
Does the ceph.conf on your admin node contain the address of each monitor? (Paste in the relevant lines.) It will need to, or the ceph tool won't be able to find the monitors even though the system is working.
No, only the initial one... The documentation doesn't say to add them, but it
is reasonable. I added the other two. This is my ceph.conf:
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.122.21 192.68.122.22 192.168.122.23
mon_initial_members = ceph-node1
fsid = 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
osd pool default size = 2
public network = 192.168.0.0/16
and then:
ceph-deploy --overwrite-conf admin admin-node ceph-node1 ceph-node2
ceph-node3
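To double-check that the push actually landed on every node, something like this could be run from the admin node (a sketch; it assumes plain ssh access, which ceph-deploy itself uses, and the default /etc/ceph/ceph.conf path that ceph-deploy writes):
# Verify that the overwritten ceph.conf on each node now lists all the monitors
ssh ceph-node1 grep mon_host /etc/ceph/ceph.conf
ssh ceph-node2 grep mon_host /etc/ceph/ceph.conf
ssh ceph-node3 grep mon_host /etc/ceph/ceph.conf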
and now:
2014-11-13 18:24:57.522590 7fa4282d1700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa418001d40 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418001fb0).fault
2014-11-13 18:25:06.524145 7fa4283d2700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa418002fa0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418003210).fault
2014-11-13 18:25:12.525096 7fa4283d2700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa418003bf0 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418003e60).fault
2014-11-13 18:25:21.526622 7fa4282d1700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418008810).fault
2014-11-13 18:25:33.528831 7fa4284d3700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418008810).fault
2014-11-13 18:25:42.530185 7fa4284d3700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa418009740 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa4180099b0).fault
2014-11-13 18:25:51.531688 7fa4283d2700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa41800a330 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa41800a5a0).fault
2014-11-13 18:26:09.534223 7fa4284d3700 0 -- 192.168.122.11:0/1003667
>> 192.168.122.23:6789/0 pipe(0x7fa41800d550 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa41800e6b0).fault
Better: someone (ceph-node3) answers, but not in the right way, as far as I can see.
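To narrow down which monitor actually answers, a per-monitor check from the admin node might look like this (a minimal sketch; it assumes -m overrides the mon_host lookup in ceph.conf and that the monitors listen on the default port 6789, as the monmap above shows):
# Ask each monitor directly instead of letting the client hunt through mon_host
ceph -m 192.168.122.21:6789 -s
ceph -m 192.168.122.22:6789 -s
ceph -m 192.168.122.23:6789 -s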
Luca
-Greg
On Thu, Nov 13, 2014 at 9:11 AM Luca Mazzaferro
<luca.mazzafe...@rzg.mpg.de> wrote:
Hi,
On 11/13/2014 06:05 PM, Artem Silenkov wrote:
Hello!
Only one monitor instance? That won't work in most cases.
Add more and make sure you keep quorum for survivability.
No, three monitor instances, one on each ceph-node, as designed in the
quick-ceph-deploy guide.
I tried to kill one of them (the initial monitor) to see what
happens, and this is what happened.
:-(
Ciao
Luca
Regards, Silenkov Artem
---
artem.silen...@gmail.com
2014-11-13 20:02 GMT+03:00 Luca Mazzaferro
<luca.mazzafe...@rzg.mpg.de>:
Dear Users,
I followed the instruction of the storage cluster quick start
here:
http://ceph.com/docs/master/start/quick-ceph-deploy/
I simulated a small storage cluster with 4 VMs: ceph-node[1,2,3] and
an admin-node.
Everything worked fine until I shut down the initial monitor
node (ceph-node1), even with the other monitors still running.
I restarted ceph-node1, but the ceph command (run from
the admin-node) fails after hanging for 5 minutes,
with this error:
2014-11-13 17:33:31.711410 7f6a5b1af700 0
monclient(hunting): authenticate timed out after 300
2014-11-13 17:33:31.711522 7f6a5b1af700 0 librados:
client.admin authentication error (110) Connection timed out
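I could also re-run the same command with client-side debugging turned up to see where the client gets stuck (a sketch; I am assuming the ceph tool accepts these debug settings as command-line overrides):
# Show messenger and monitor-client activity while the client looks for a monitor
ceph --debug-ms 1 --debug-monc 10 -s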
If I go to ceph-node1 and restart the services:
[root@ceph-node1 ~]# service ceph status
=== mon.ceph-node1 ===
mon.ceph-node1: running {"version":"0.80.7"}
=== osd.2 ===
osd.2: not running.
=== mds.ceph-node1 ===
mds.ceph-node1: running {"version":"0.80.7"}
I don't understand how to properly restart a node.
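A minimal sketch of what I have understood the per-daemon syntax to be, using the sysvinit script behind the "service ceph status" output above (I am not sure this is the proper way, hence my question):
# On ceph-node1 itself: start the OSD that shows as not running, then restart the monitor
service ceph start osd.2
service ceph restart mon.ceph-node1
service ceph status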
Can anyone help me?
Thank you.
Cheers.
Luca
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com