Hi all. I'm new to ceph, and after having serious problems in ceph stages 0, 1 and 2 that I managed to solve myself, it now seems I have hit a wall harder than my head. :)
When I run salt-run state.orch ceph.stage.deploy and monitor it, I see it get up to here:

#######
[14/71]   ceph.sysctl on
          node01....................................... ✓ (0.5s)
          node02........................................ ✓ (0.7s)
          node03....................................... ✓ (0.6s)
          node04......................................... ✓ (0.5s)
          node05....................................... ✓ (0.6s)
          node06.......................................... ✓ (0.5s)

[15/71]   ceph.osd on
          node01...................................... ❌ (0.7s)
          node02........................................ ❌ (0.7s)
          node03....................................... ❌ (0.7s)
          node04......................................... ❌ (0.6s)
          node05....................................... ❌ (0.6s)
          node06.......................................... ❌ (0.7s)

Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s

Failures summary:

ceph.osd (/srv/salt/ceph/osd):
  node02:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node02 for cephdisks.list
  node03:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node03 for cephdisks.list
  node01:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node01 for cephdisks.list
  node04:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node04 for cephdisks.list
  node05:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node05 for cephdisks.list
  node06:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node06 for cephdisks.list
#######

Since this is a first attempt on 6 simple test machines, we are putting the mon, osds, etc. on all nodes at first; only the master role is kept on a single machine (node01) for now. As they are simple machines, each has a single hdd, partitioned as follows (the sda4 partition is unmounted and left for the ceph system):

###########
# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 465,8G  0 disk
├─sda1   8:1    0   500M  0 part /boot/efi
├─sda2   8:2    0    16G  0 part [SWAP]
├─sda3   8:3    0  49,3G  0 part /
└─sda4   8:4    0   400G  0 part
sr0     11:0    1   3,7G  0 rom

# salt -I 'roles:storage' cephdisks.list
node01:
node02:
node03:
node04:
node05:
node06:

# salt -I 'roles:storage' pillar.get ceph
node02:
    ----------
    storage:
        ----------
        osds:
            ----------
            /dev/sda4:
                ----------
                format:
                    bluestore
                standalone:
                    True

(and so on for all 6 machines)
##########

Finally, and just in case, my policy.cfg file reads:

#########
#cluster-unassigned/cluster/*.sls
cluster-ceph/cluster/*.sls
profile-default/cluster/*.sls
profile-default/stack/default/ceph/minions/*yml
config/stack/default/global.yml
config/stack/default/ceph/cluster.yml
role-master/cluster/node01.sls
role-admin/cluster/*.sls
role-mon/cluster/*.sls
role-mgr/cluster/*.sls
role-mds/cluster/*.sls
role-ganesha/cluster/*.sls
role-client-nfs/cluster/*.sls
role-client-cephfs/cluster/*.sls
##########

Please, could someone help me and shed some light on this issue?
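In case it helps to show where I intend to poke next: since the exception points at the Salt mine for cephdisks.list, I suppose the first step is to look at the mine data directly, refresh it, and only then retry the stages. This is just a rough sketch of what I have in mind (assuming the standard DeepSea stage names, and using mine.get only to ask node01 for the mine data of the other minions):

#########
# Is the mine actually populated with cephdisks.list data?
salt 'node01*' mine.get '*' cephdisks.list

# Force all minions to refresh their mine data, then check again
salt '*' mine.update
salt 'node01*' mine.get '*' cephdisks.list

# If the mine now looks sane, re-run discovery and configuration
# before attempting the deploy stage again
salt-run state.orch ceph.stage.1
salt-run state.orch ceph.stage.2
salt-run state.orch ceph.stage.deploy
#########

I am not sure this is the right direction, though, given that cephdisks.list itself returns nothing above.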
Thanks a lot in advance,
Regards,
Jones

On Thu, Aug 23, 2018 at 2:46 PM John Spray <jsp...@redhat.com> wrote:
> On Thu, Aug 23, 2018 at 5:18 PM Steven Vacaroaia <ste...@gmail.com> wrote:
> >
> > Hi All,
> >
> > I am trying to enable the prometheus plugin, with no success, due to "no socket could be created".
> >
> > The instructions for enabling the plugin are very straightforward and simple.
> >
> > Note
> > My ultimate goal is to use Prometheus with Cephmetrics.
> > Some of you suggested deploying ceph-exporter, but why do we need to do that when there is a plugin already?
> >
> > How can I troubleshoot this further?
> >
> > Unhandled exception from module 'prometheus' while running on mgr.mon01: error('No socket could be created',)
> > Aug 23 12:03:06 mon01 ceph-mgr: 2018-08-23 12:03:06.615 7fadab50e700 -1 prometheus.serve:
> > Aug 23 12:03:06 mon01 ceph-mgr: 2018-08-23 12:03:06.615 7fadab50e700 -1 Traceback (most recent call last):
> > Aug 23 12:03:06 mon01 ceph-mgr:   File "/usr/lib64/ceph/mgr/prometheus/module.py", line 720, in serve
> > Aug 23 12:03:06 mon01 ceph-mgr:     cherrypy.engine.start()
> > Aug 23 12:03:06 mon01 ceph-mgr:   File "/usr/lib/python2.7/site-packages/cherrypy/process/wspbus.py", line 250, in start
> > Aug 23 12:03:06 mon01 ceph-mgr:     raise e_info
> > Aug 23 12:03:06 mon01 ceph-mgr: ChannelFailures: error('No socket could be created',)
>
> The things I usually check if a process can't create a socket are:
>  - is there anything on the same node already listening on that port?
>  - are there security policies (e.g. selinux) that might be preventing it?
>
> John
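For what it is worth, John's two checks translate roughly into the commands below, run on the mgr host. This is only a sketch: 9283 is assumed to be the prometheus module's default port, and the mgr id mon01 is taken from the log above, so adjust both if your setup differs.

#########
# 1) Is anything already listening on the port the module wants to bind?
ss -tlnp | grep 9283

# 2) Is SELinux enforcing, and has it denied anything recently?
getenforce
ausearch -m avc -ts recent

# Once whatever turns up is fixed, restart the mgr so the module retries
systemctl restart ceph-mgr@mon01
#########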
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com