On Fri, Jun 6, 2014 at 8:38 AM, David Jericho <david.jeri...@aarnet.edu.au> wrote:
> Hi all,
>
> I did a bit of an experiment with multi-mds on Firefly, and it worked fine
> until one of the MDSes crashed while rebalancing. It's not the end of the
> world, and I could just start fresh with the cluster, but I'm keen to see
> if this can be fixed, as running multi-mds is something I would like to do
> in production: while it was working, it reduced load and improved response
> time significantly.
>
> The output of ceph mds dump is:
>
> dumped mdsmap epoch 1232
> epoch 1232
> flags 0
> created 2014-03-24 23:24:35.584469
> modified 2014-06-06 00:17:54.336201
> tableserver 0
> root 0
> session_timeout 60
> session_autoclose 300
> max_file_size 1099511627776
> last_failure 1227
> last_failure_osd_epoch 24869
> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap}
> max_mds 2
> in 0,1
> up {1=578616}
> failed
> stopped
> data_pools 0,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,101,105
> metadata_pool 1
> inline_data disabled
> 578616: 10.60.8.18:6808/252227 'c' mds.1.36 up:resolve seq 2
> 577576: 10.60.8.19:6800/58928 'd' mds.-1.0 up:standby seq 1
> 577603: 10.60.8.2:6801/245281 'a' mds.-1.0 up:standby seq 1
> 578623: 10.60.8.3:6800/75325 'b' mds.-1.0 up:standby seq 1
>
> Modifying max_mds has no effect, and restarting/rebooting the cluster has
> no effect. No matter what combination of commands I try with the ceph-mds
> binary or via the ceph tool, I cannot make a second MDS start up, which
> would let mds.1 leave resolve and move on to the next step. Running with
> --debug_mds 10 provides no really enlightening information, nor does
> watching the mon logs. At a guess, mds.1 is looking for mds.0 to
> communicate with.
>
> Anyone have some pointers?
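The commands David refers to, on a Firefly-era cluster, would look roughly
like the following (a sketch only; the daemon name c is taken from the dump
above, and flag spelling may vary slightly between releases):

    # grow the number of active MDS ranks ("modifying max_mds")
    ceph mds set_max_mds 2

    # raise MDS debug verbosity on a running daemon, no restart needed
    ceph tell mds.c injectargs '--debug-mds 10'

    # or start a daemon with debugging enabled from the shell
    ceph-mds -i c --debug_mds 10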
Please run the MDSes with --debug_mds 10 and send both MDS logs to me.

Regards,
Yan, Zheng
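A minimal way to capture what is being asked for here, assuming the default
log locations, is to set the MDS debug level in ceph.conf on each MDS host
and restart the daemons:

    [mds]
        debug mds = 10

    # the logs then appear in the usual place on each host, e.g.
    # /var/log/ceph/ceph-mds.c.log and /var/log/ceph/ceph-mds.d.log
    # (daemon names c and d are the ones shown in the dump above)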