Hey ceph-users,

One of our Ceph environments changed its cluster fsid, and I would like advice on how to get it corrected.
We added a new OSD node in the hope of retiring one of the older OSD + MON nodes. Using ceph-deploy, we unfortunately ran "ceph-deploy mon create ..." instead of "ceph-deploy mon add ...". The ceph.log file reported:

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.23): /usr/bin/ceph-deploy mon create ceph-osd7
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-osd7
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-osd7 ...
[ceph-osd7][DEBUG ] connection detected need for sudo
[ceph-osd7][DEBUG ] connected to host: ceph-osd7
[ceph-osd7][DEBUG ] detect platform information from remote host
[ceph-osd7][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[ceph-osd7][DEBUG ] determining if provided host has same hostname in remote
[ceph-osd7][DEBUG ] get remote short hostname
[ceph-osd7][DEBUG ] deploying mon to ceph-osd7
[ceph-osd7][DEBUG ] get remote short hostname
[ceph-osd7][DEBUG ] remote hostname: ceph-osd7
[ceph-osd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-osd7][DEBUG ] create the mon path if it does not exist
[ceph-osd7][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-osd7/done
[ceph-osd7][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-osd7/done
[ceph-osd7][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-osd7.mon.keyring
[ceph-osd7][DEBUG ] create the monitor keyring file
[ceph-osd7][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-osd7 --keyring /var/lib/ceph/tmp/ceph-ceph-osd7.mon.keyring
[ceph-osd7][DEBUG ] ceph-mon: set fsid to 6870cff2-6cbc-4e99-8615-c159ba3a0546

So it looks like the fsid was changed from e238f5b3-7d67-4b55-8563-52008828db51 to 6870cff2-6cbc-4e99-8615-c159ba3a0546. "ceph -s" still shows the previous fsid:

ceph -s
    cluster e238f5b3-7d67-4b55-8563-52008828db51
     health HEALTH_WARN
            too few PGs per OSD (29 < min 30)
            1 mons down, quorum 1,2 ceph-mon,ceph-osd3
     monmap e9: 3 mons at {ceph-mon=10.5.68.69:6789/0,ceph-osd3=10.5.68.92:6789/0,ceph-osd6=10.5.68.35:6789/0}
            election epoch 416, quorum 1,2 ceph-mon,ceph-osd3
     mdsmap e640: 1/1/1 up {0=ceph-mon=up:active}, 1 up:standby
     osdmap e6275: 58 osds: 58 up, 58 in
      pgmap v18874768: 848 pgs, 16 pools, 4197 GB data, 911 kobjects
            8691 GB used, 22190 GB / 30881 GB avail
                 848 active+clean
  client io 0 B/s rd, 345 kB/s wr, 58 op/s

What seems odd is that my ceph.conf never had e238f5b3-7d67-4b55-8563-52008828db51 as the fsid.
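For clarity, here is what we ran versus what we intended (this is my understanding of the two ceph-deploy subcommands; the hostname is from our environment):

# what we ran: bootstraps a brand-new monitor via "ceph-mon --mkfs"
ceph-deploy mon create ceph-osd7

# what we meant to run: joins a new monitor to the existing quorum
ceph-deploy mon add ceph-osd7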
I even pulled from backups, and it has always been:

root@ceph-mon:~/RESTORE/2016-01-02/etc/ceph# cat ceph.conf
[global]
fsid = 6870cff2-6cbc-4e99-8615-c159ba3a0546
mon_initial_members = ceph-mon
mon_host = 10.5.68.69,10.5.68.65,10.5.68.92
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2
public_network = 10.5.68.0/24
cluster_network = 10.7.1.0/24

The cluster seems to be "up", but I'm concerned that I only have two monitors, and I cannot add a third since authentication to the cluster fails:

2016-01-20 16:41:09.544870 7f1f238ed8c0 0 ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3), process ceph-mon, pid 32010
2016-01-20 16:41:09.765043 7f1f238ed8c0 0 mon.ceph-osd6 does not exist in monmap, will attempt to join an existing cluster
2016-01-20 16:41:09.773435 7f1f238ed8c0 0 using public_addr 10.5.68.35:0/0 -> 10.5.68.35:6789/0
2016-01-20 16:41:09.773517 7f1f238ed8c0 0 starting mon.ceph-osd6 rank -1 at 10.5.68.35:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph-osd6 fsid 6870cff2-6cbc-4e99-8615-c159ba3a0546
2016-01-20 16:41:09.774549 7f1f238ed8c0 1 mon.ceph-osd6@-1(probing) e0 preinit fsid 6870cff2-6cbc-4e99-8615-c159ba3a0546
2016-01-20 16:41:10.746413 7f1f1ec65700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch

And yes, we need to increase our PG count (848 PGs * 2 replicas across 58 OSDs works out to the ~29 PGs per OSD in the warning). This cluster has grown from a few 2TB drives to multiple 600GB SAS drives, but I don't want to touch anything else until I can get this figured out. This cluster serves as our OpenStack VM storage, so it is not something we can simply rebuild.

Thanks,
Mike C
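In case it helps with diagnosis, these are the checks I have been running to compare fsids (standard ceph/monmaptool invocations as far as I know; the /tmp paths are just scratch files I picked):

# fsid as reported by the running quorum
ceph fsid

# fetch the current monmap from the quorum and print it (includes the fsid)
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap

# with the new mon daemon stopped, extract the monmap from its local store
ceph-mon -i ceph-osd6 --extract-monmap /tmp/monmap.osd6
monmaptool --print /tmp/monmap.osd6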