The one issue I'm seeing, and probably the root of my problem, is that cephadm 
uses uid 167 for the 'ceph' user...it's something else entirely on my system 
(perhaps because it's an older Luminous cluster built with ceph-deploy).

However, even after changing the ceph uid to what cephadm/docker expects (167), 
something keeps changing the perms on /dev/dm-1.

And I got it working using the udev rules you provided!  So, for my whole 
issue, I'll need to make sure the uid & gid for the ceph user are set to 
167 (not sure why they were something else, but the fix is easy enough) and have 
udev rules in place to properly set the perms on /dev/dm-X accordingly.
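
For anyone else hitting this, the uid/gid change itself looked roughly like the 
following on my hosts (a sketch only; the paths are from my setup, uid/gid 167 
has to be free, and I stopped the daemons first):

# stop the ceph daemons on this host before touching the ids
systemctl stop ceph.target
OLD_UID=$(id -u ceph); OLD_GID=$(id -g ceph)
usermod -u 167 ceph
groupmod -g 167 ceph
# re-own anything still owned by the old uid/gid
find /var/lib/ceph /var/log/ceph -uid "$OLD_UID" -exec chown -h ceph {} +
find /var/lib/ceph /var/log/ceph -gid "$OLD_GID" -exec chgrp -h ceph {} +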

Thanks!

________________________________________
From: Andre Goree <ago...@staff.atlantic.net>
Sent: Tuesday, December 28, 2021 6:40 PM
To: Mazzystr
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Converting to cephadm from ceph-deploy

Thank you!  I did figure that it maybe should be a soft link, and in fact I 
tried to fix it by linking everything properly, but as you've shown with your 
'ls' example of that directory, I certainly missed a few things.  This helps 
immensely.

Oddly enough, however, even the dir '/var/lib/ceph/osd/ceph-X' itself does not 
exist, and if I'm not mistaken, it's copied to '/var/lib/ceph/$FSID/osd-X'.  Easy 
enough to determine how that needs to be symlinked, and inside 'osd-X' I see 
the relevant 'block' link, so it does appear that everything's there.  The perms 
are another aspect I hadn't considered.  I'm going to try to work this out and 
report back, thanks!
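
For reference, the checks I'm running against the adopted dir look roughly like 
this ('osd.0' and $FSID stand in for my actual values, and I'm assuming cephadm's 
dotted naming here):

ls -la /var/lib/ceph/$FSID/osd.0/
ls -la "$(readlink -f /var/lib/ceph/$FSID/osd.0/block)"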

________________________________________
From: Mazzystr <mazzy...@gmail.com>
Sent: Tuesday, December 28, 2021 5:10 PM
To: Andre Goree
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Converting to cephadm from ceph-deploy



/var/lib/ceph/osd/ceph-X/block is a soft link.  Track the soft-link chain down 
to the device-mapper device and make sure ceph:ceph owns it.

Example:
blah:/var/lib/ceph/osd/ceph-0 # ls -la block*
total 44
lrwxrwxrwx 1 ceph ceph 23 Apr 11  2019 block -> /dev/mapper/ceph-0block
lrwxrwxrwx 1 ceph ceph 20 Apr 11  2019 block.db -> /dev/mapper/ceph-0db
lrwxrwxrwx 1 ceph ceph 21 Apr 11  2019 block.wal -> /dev/mapper/ceph-0wal

blah:/var/lib/ceph/osd/ceph-0 # ls -la /dev/mapper/ceph-0block
lrwxrwxrwx 1 root root 8 Dec 28 12:41 /dev/mapper/ceph-0block -> ../dm-30

blah:/var/lib/ceph/osd/ceph-0 # ls -la /dev/dm-30
brw-rw---- 1 ceph ceph 254, 30 Dec 28 14:05 /dev/dm-30
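
If it helps, the whole chain can be resolved and re-owned in one shot, something 
like this (a sketch using the same osd-0 paths as the example above):

chown ceph:ceph "$(readlink -f /var/lib/ceph/osd/ceph-0/block)"
chown ceph:ceph "$(readlink -f /var/lib/ceph/osd/ceph-0/block.db)"
chown ceph:ceph "$(readlink -f /var/lib/ceph/osd/ceph-0/block.wal)"

That chown doesn't survive a reboot or a device re-map, though, which is where 
the udev rule comes in.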


I land a udev rule on the host to correct the ownership problem:

cat > /etc/udev/rules.d/99-ceph-osd-${OSD_ID}.rules << EOF
ENV{DM_NAME}=="ceph-${OSD_ID}", OWNER="ceph", GROUP="ceph", MODE="0660"
ENV{DM_NAME}=="ceph-${OSD_ID}wal", OWNER="ceph", GROUP="ceph", MODE="0660"
ENV{DM_NAME}=="ceph-${OSD_ID}db", OWNER="ceph", GROUP="ceph", MODE="0660"
ENV{DM_NAME}=="ceph-${OSD_ID}block", OWNER="ceph", GROUP="ceph", MODE="0660"
EOF
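
To get the rule applied without a reboot I reload udev and retrigger the block 
devices (assuming nothing else rewrites the device nodes afterward):

udevadm control --reload-rules
udevadm trigger --subsystem-match=block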



On Tue, Dec 28, 2021 at 12:42 PM Andre Goree <ago...@staff.atlantic.net> wrote:
First off, I made a similar post on 12/11/21 but had not explicitly signed up 
for the new mailing list (this email address is a remnant from when the list was 
run with mailman), so I didn't get a reply here and couldn't reply myself.  I 
have to post this again; I apologize for the noise.


Hello all.  I'm upgrading a cluster from Luminous (Ubuntu 16.04) to Pacific, 
going through Nautilus (18.04) and then Octopus (20.04) along the way.  The 
cluster ran flawlessly throughout that upgrade process, which I'm very happy 
about.

I'm now at the point of converting the cluster to cephadm (it was built with
ceph-deploy), but I'm running into trouble.  I've followed this doc:
https://docs.ceph.com/en/latest/cephadm/adoption/

3 MON nodes
4 OSD nodes

The trouble is two-fold:  (1) once I've adopted the MON & MGR daemons, the MON 
on the local host never shows up in "ceph orch ps"; only the two other MON nodes 
are listed:

#### On MON node ####
root@cephmon01test:~# ceph orch ps
NAME               HOST           PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.cephmon02test  cephmon02test         running (21h)  8m ago     21h  365M     -        16.2.5   6933c2a0b7dd  e08de388b92e
mgr.cephmon03test  cephmon03test         running (21h)  6m ago     21h  411M     -        16.2.5   6933c2a0b7dd  d358b697e49b
mon.cephmon02test  cephmon02test         running (21h)  8m ago     -    934M     2048M    16.2.5   6933c2a0b7dd  f349d7cc6816
mon.cephmon03test  cephmon03test         running (21h)  6m ago     -    923M     2048M    16.2.5   6933c2a0b7dd  64880b0659cc

root@cephmon01test:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr              2/0  8m ago     -    <unmanaged>
mon              2/0  8m ago     -    <unmanaged>


All of the 'cephadm adopt' commands for the MONs and MGRs were run from the 
above
node.
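
One thing I haven't ruled out (purely a guess on my part) is whether the 
orchestrator even knows about the local host, so I plan to check that along 
these lines:

root@cephmon01test:~# ceph orch host ls
root@cephmon01test:~# ceph orch host add cephmon01test    # only if it turns out to be missing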

My second issue is that when I proceed to adopt the OSDs (again, following
https://docs.ceph.com/en/latest/cephadm/adoption/), they seem to drop out of 
the cluster:

### on OSD node ###
root@cephosd01test:~# cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "osd.0",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit":
&quot;ceph-4cfa6467-6647-41e9-8184-1cacc408265c(a)osd.0&quot;sd.0",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T00:19:24.799615Z",
        "configured": null
    },
    {
        "style": "cephadm:v1",
        "name": "osd.1",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit":
&quot;ceph-4cfa6467-6647-41e9-8184-1cacc408265c(a)osd.1&quot;sd.1",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T21:20:02.170515Z",
        "configured": null
    }
]

Ceph health snippet:
  services:
    mon: 3 daemons, quorum cephmon02test,cephmon03test,cephmon01test (age 21h)
    mgr: cephmon03test(active, since 21h), standbys: cephmon02test
    osd: 8 osds: 6 up (since 39m), 8 in
         flags noout

Is there a specific way to get those cephadm-adopted OSDs to show up properly in 
the cluster and in the ceph orchestrator?
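
In the meantime, I've been pulling whatever logs I can from the failed units, 
roughly like this ($FSID here is just a placeholder for my cluster's fsid):

root@cephosd01test:~# cephadm logs --name osd.0
root@cephosd01test:~# journalctl -u ceph-$FSID@osd.0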

I asked the same question elsewhere and was asked whether I could see my 
containers running; here is what I found:

Further background info: this cluster was built with 'ceph-deploy' on 12.2.4. 
I'm not sure whether that's an issue _specifically_ for the conversion to 
cephadm, but I've been able to upgrade from Ubuntu Xenial & Luminous to Ubuntu 
Focal & Pacific; it's just this conversion to cephadm that I'm having issues 
with.  This cluster is _only_ used for RBD devices (via Libvirt).

When I run "bash -x /var/lib/ceph/$FSID/osd.0/unit.run" I find that it's 
failing after looking for a block device that doesn't exist, namely 
/var/lib/ceph/osd/ceph-0.  That path was accurate for the ceph-deploy-built 
OSDs, but after 'cephadm adopt' has been run, the correct block device is 
'/dev/dm-1', if I'm not mistaken.
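
To pin down which device actually backs osd.0 now, I've been leaning on 
ceph-volume and lsblk from the host (assuming the LVM metadata is still visible 
there after adoption):

root@cephosd01test:~# ceph-volume lvm list
root@cephosd01test:~# lsblk -o NAME,MAJ:MIN,OWNER,GROUP,MODE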

Looking at the cephadm logs, this appears to be by design as far as cephadm is 
concerned; however, it is clearly the wrong device, and so the containers fail 
to start.

debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  1 bluestore(/var/lib/ceph/osd/ceph-0) _mount path /var/lib/ceph/osd/ceph-0
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  0 bluestore(/var/lib/ceph/osd/ceph-0) _open_db_and_around read-only:0 repair:0
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  1 bdev(0x5642f6a9a400 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bdev(0x5642f6a9a400 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 osd.0 0 OSD:init: unable to mount object store
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1  ** ERROR: osd init failed: (13) Permission denied
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io