Hi Robert,

This is a bit less trivial than it might look right now. The ceph user is 
usually created by installing the package ceph-common, and by default it gets 
UID 167. If a ceph user already exists, I would assume the package keeps the 
existing user so that an operator can avoid UID collisions (in case 167 is 
already taken).
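
You can check what the host currently has with standard tools, nothing 
ceph-specific assumed:

# UID/GID the host resolves for the symbolic name ceph
id ceph
# the corresponding passwd and group entries
getent passwd ceph
getent group ceph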

If you use docker, the ceph UID on the host and inside the container should 
match (or needs to be translated). If they don't, you will have a lot of fun 
re-owning things all the time, because deployments use the symbolic name ceph, 
which in your case resolves to different UIDs on the host and inside the 
container.
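
You can compare the two directly. Any running ceph container will do, since 
mgr/OSD use the same image; the container name below is just a placeholder:

# UID/GID of ceph as seen by the host
id ceph
# UID/GID of ceph as seen inside a running ceph container (replace the name)
docker exec <some-running-ceph-container> id ceph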

I would recommend removing this discrepancy as soon as possible:

1) Find out why there was a ceph user with a UID different from 167 before 
the installation of ceph-common.
   Did you create it by hand? Was UID 167 allocated already?
2) If you can safely change the GID and UID of ceph to 167, just do 
groupmod+usermod with the new GID and UID (see the sketch after this list).
3) If 167 is already used by another service, you will have to map the UIDs 
between host and container.
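
For 2), a rough sketch (stop the ceph containers/daemons first; the paths 
below are only the usual suspects, re-own whatever the old IDs actually own on 
your host):

# switch group and user to the canonical ceph IDs
groupmod -g 167 ceph
usermod -u 167 -g 167 ceph
# re-own files still belonging to the old IDs
chown -R 167:167 /var/lib/ceph /var/log/ceph /etc/ceph
# more thorough: catch leftovers by the old IDs (replace OLDUID/OLDGID)
find /var /etc \( -user OLDUID -o -group OLDGID \) -exec chown -h 167:167 {} +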

To prevent Ansible from deploying dockerized ceph with a mismatching user ID 
for ceph, add these tasks to an appropriate part of your deployment (general 
host preparation or similar):

- name: "Create group 'ceph'."
  group:
    name: ceph
    gid: 167
    local: yes
    state: present
    system: yes

- name: "Create user 'ceph'."
  user:
    name: ceph
    password: "!"
    comment: "ceph-container daemons"
    uid: 167
    group: ceph
    shell: "/sbin/nologin"
    home: "/var/lib/ceph"
    create_home: no
    local: yes
    state: present
    system: yes

This should fail if a group and user ceph already exist with IDs different 
from 167.
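
Afterwards, a quick ad-hoc check across your hosts could look like this (the 
host pattern 'all' is just an example, use your own group):

ansible all -b -m command -a 'id ceph'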

Best regards,

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Robert 
LeBlanc <rob...@leblancnet.us>
Sent: 28 August 2019 23:23:06
To: ceph-users
Subject: Re: [ceph-users] Failure to start ceph-mon in docker

Turns out /var/lib/ceph was ceph:ceph and not 167:167; chowning it made things 
work. I guess only the monitor needs that permission, rgw/mgr/osd are all happy 
without needing it to be 167:167.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Aug 28, 2019 at 1:45 PM Robert LeBlanc 
<rob...@leblancnet.us<mailto:rob...@leblancnet.us>> wrote:
We are trying to set up a new Nautilus cluster using ceph-ansible with 
containers. We got things deployed, but I couldn't run `ceph -s` on the host, so 
I decided to `apt install ceph-common` and installed the Luminous version from 
Ubuntu 18.04. For some reason the docker container that was running the monitor 
restarted and now won't start. I added the repo for Nautilus and upgraded 
ceph-common, but the problem persists. The Manager and OSD docker containers 
don't seem to be affected at all. I see this in the journal:

Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Starting Ceph Monitor...
Aug 28 20:40:55 sun-gcs02-osd01 docker[2926]: Error: No such container: 
ceph-mon-sun-gcs02-osd01
Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Started Ceph Monitor.
Aug 28 20:40:55 sun-gcs02-osd01 docker[2949]: WARNING: Your kernel does not 
support swap limit capabilities or the cgroup is not mounted. Memory limited 
without swap.
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:40:56  
/opt/ceph-container/bin/entrypoint.sh: Existing mon, trying to rejoin cluster...
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: warning: line 41: 
'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03  
/opt/ceph-container/bin/entrypoint.sh: /etc/ceph/ceph.conf is already memory 
tuned
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03  
/opt/ceph-container/bin/entrypoint.sh: SUCCESS
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: PID 368: spawning 
/usr/bin/ceph-mon --cluster ceph --default-log-to-file=false 
--default-mon-cluster-log-to-file=false --setuser ceph --setgroup ceph -d 
--mon-cluster-log-to-stderr --log-stderr-prefix=debug  -i sun-gcs02-osd01 
--mon-data /var/lib/ceph/mon/ceph-sun-gcs02-osd01 --public-addr 10.65.101.21
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: Waiting 368 to quit
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: warning: line 41: 
'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 
7f401283c180  0 set uid:gid to 167:167 (ceph:ceph)
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 
7f401283c180  0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) 
nautilus (stable), process ceph-mon, pid 368
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 
7f401283c180 -1 stat(/var/lib/ceph/mon/ceph-sun-gcs02-osd01) (13) Permission 
denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 
7f401283c180 -1 error accessing monitor data directory at 
'/var/lib/ceph/mon/ceph-sun-gcs02-osd01': (13) Permission denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: managing teardown after 
SIGCHLD
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Waiting PID 368 to 
terminate
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Process 368 is 
terminated
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Bye Bye, container will 
die with return code -1
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: if you don't want me to 
die and have access to a shell to debug this situation, next time run me with 
'-e DEBUG=stayalive'
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service: 
Main process exited, code=exited, status=255/n/a
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service: 
Failed with result 'exit-code'.

The directories for the monitor are owned by 167:167, which matches the 
UID:GID that the container reports.

root@sun-gcs02-osd01:~# ls -lhd /var/lib/ceph/
drwxr-x--- 14 ceph ceph 4.0K Jul 30 22:15 /var/lib/ceph/
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/
total 56K
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mds
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mgr
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-osd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd-mirror
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rgw
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mds
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mgr
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mon
drwxr-xr-x  14 167 167 4.0K Jul 30 22:28 osd
drwxr-xr-x   4 167 167 4.0K Aug  1 23:36 radosgw
drwxr-xr-x 254 167 167  12K Aug 28 20:44 tmp
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/
total 4.0K
drwxr-xr-x 3 167 167 4.0K Jul 30 22:16 ceph-sun-gcs02-osd01
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/ceph-sun-gcs02-osd01/
total 16K
-rw------- 1 167 167   77 Jul 30 22:15 keyring
-rw-r--r-- 1 167 167    8 Jul 30 22:15 kv_backend
-rw-r--r-- 1 167 167    3 Jul 30 22:16 min_mon_release
drwxr-xr-x 2 167 167 4.0K Aug 28 19:16 store.db
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/ceph-sun-gcs02-osd01/store.db/
total 149M
-rw-r--r-- 1 167 167 1.7M Aug 28 19:16 050225.log
-rw-r--r-- 1 167 167  65M Aug 28 19:16 050227.sst
-rw-r--r-- 1 167 167  45M Aug 28 19:16 050228.sst
-rw-r--r-- 1 167 167   16 Aug 16 07:40 CURRENT
-rw-r--r-- 1 167 167   37 Jul 30 22:15 IDENTITY
-rw-r--r-- 1 167 167    0 Jul 30 22:15 LOCK
-rw-r--r-- 1 167 167 1.3M Aug 28 19:16 MANIFEST-027846
-rw-r--r-- 1 167 167 4.7K Aug  1 23:38 OPTIONS-002825
-rw-r--r-- 1 167 167 4.7K Aug 16 07:40 OPTIONS-027849

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
