Hey ceph-users,

One of our Ceph environments changed its cluster fsid, and I would like advice on how to get it corrected.
We added a new OSD node in the hope of retiring one of the older OSD + MON nodes. Using ceph-deploy, we unfortunately ran "ceph-deploy mon create ..." instead of "ceph-deploy mon add ...". The ceph.log file reported:

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy mon create ceph-osd7
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd7
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd7 ...
[ceph-osd7][DEBUG ] connection detected need for sudo
[ceph-osd7][DEBUG ] connected to host: ceph-osd7
[ceph-osd7][DEBUG ] detect platform information from remote host
[ceph-osd7][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[ceph-osd7][DEBUG ] determining if provided host has same hostname in remote
[ceph-osd7][DEBUG ] get remote short hostname
[ceph-osd7][DEBUG ] deploying mon to ceph-osd7
[ceph-osd7][DEBUG ] get remote short hostname
[ceph-osd7][DEBUG ] remote hostname: ceph-osd7
[ceph-osd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-osd7][DEBUG ] create the mon path if it does not exist
[ceph-osd7][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-osd7/done
[ceph-osd7][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-osd7/done
[ceph-osd7][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-osd7.mon.keyring
[ceph-osd7][DEBUG ] create the monitor keyring file
[ceph-osd7][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-osd7 --keyring /var/lib/ceph/tmp/ceph-ceph-osd7.mon.keyring
[ceph-osd7][DEBUG ] ceph-mon: set fsid to 6870cff2-6cbc-4e99-8615-c159ba3a0546

So it looks like the fsid was changed from e238f5b3-7d67-4b55-8563-52008828db51 to 6870cff2-6cbc-4e99-8615-c159ba3a0546. "ceph -s" still shows the previous fsid:

ceph -s
    cluster e238f5b3-7d67-4b55-8563-52008828db51
     health HEALTH_WARN
            too few PGs per OSD (29 < min 30)
            1 mons down, quorum 1,2 ceph-mon,ceph-osd3
     monmap e9: 3 mons at {ceph-mon=10.5.68.69:6789/0,ceph-osd3=10.5.68.92:6789/0,ceph-osd6=10.5.68.35:6789/0}
            election epoch 416, quorum 1,2 ceph-mon,ceph-osd3
     mdsmap e640: 1/1/1 up {0=ceph-mon=up:active}, 1 up:standby
     osdmap e6275: 58 osds: 58 up, 58 in
      pgmap v18874768: 848 pgs, 16 pools, 4197 GB data, 911 kobjects
            8691 GB used, 22190 GB / 30881 GB avail
                 848 active+clean
  client io 0 B/s rd, 345 kB/s wr, 58 op/s

What seems odd is that my ceph.conf never had e238f5b3-7d67-4b55-8563-52008828db51 as the fsid.
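For clarity, here is what we ran versus what we intended (this is my understanding of the two ceph-deploy subcommands; the hostname is from our environment):

# what we ran: bootstraps a brand-new monitor via "ceph-mon --mkfs"
ceph-deploy mon create ceph-osd7

# what we meant to run: joins a new monitor to the existing quorum
ceph-deploy mon add ceph-osd7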
I even pulled from backups, and it has always been:

root@ceph-mon:~/RESTORE/2016-01-02/etc/ceph# cat ceph.conf
[global]
fsid = 6870cff2-6cbc-4e99-8615-c159ba3a0546
mon_initial_members = ceph-mon
mon_host = 10.5.68.69,10.5.68.65,10.5.68.92
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2
public_network = 10.5.68.0/24
cluster_network = 10.7.1.0/24

The cluster seems to be "up", but I'm concerned that I only have two monitors, and I cannot add a third since authentication to the cluster fails:

2016-01-20 16:41:09.544870 7f1f238ed8c0 0 ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3), process ceph-mon, pid 32010
2016-01-20 16:41:09.765043 7f1f238ed8c0 0 mon.ceph-osd6 does not exist in monmap, will attempt to join an existing cluster
2016-01-20 16:41:09.773435 7f1f238ed8c0 0 using public_addr 10.5.68.35:0/0 -> 10.5.68.35:6789/0
2016-01-20 16:41:09.773517 7f1f238ed8c0 0 starting mon.ceph-osd6 rank -1 at 10.5.68.35:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph-osd6 fsid 6870cff2-6cbc-4e99-8615-c159ba3a0546
2016-01-20 16:41:09.774549 7f1f238ed8c0 1 mon.ceph-osd6@-1(probing) e0 preinit fsid 6870cff2-6cbc-4e99-8615-c159ba3a0546
2016-01-20 16:41:10.746413 7f1f1ec65700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch

And yes, we need to increase our PG count (848 PGs * 2 replicas across 58 OSDs works out to the ~29 PGs per OSD in the warning). This cluster has grown from a few 2TB drives to multiple 600GB SAS drives, but I don't want to touch anything else until I can get this figured out. This cluster serves as our OpenStack VM storage, so it is not something we can simply rebuild.

Thanks,
Mike C
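In case it helps with diagnosis, these are the checks I have been running to compare fsids (standard ceph/monmaptool invocations as far as I know; the /tmp paths are just scratch files I picked):

# fsid as reported by the running quorum
ceph fsid

# fetch the current monmap from the quorum and print it (includes the fsid)
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap

# with the new mon daemon stopped, extract the monmap from its local store
ceph-mon -i ceph-osd6 --extract-monmap /tmp/monmap.osd6
monmaptool --print /tmp/monmap.osd6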