Hi all,

Two days ago we upgraded our cluster from octopus to pacific. Everything went 
well and we see lots of improvements. Thanks for releasing the last stable 
version with all its fixes. I do have some questions, though, and this hiccup 
is the first:

After the upgrade to pacific we started getting the error message 
"admin_socket: exception getting command descriptions: [Errno 2] No such file 
or directory" when using the ceph daemon command. Here is the output of a full 
session:

[root@ceph-adm:ceph-26 ~]# ceph daemon mon.ceph-26 version | jq .release 
"pacific"

[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon mon.ceph-26 version | jq .release
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"

[root@ceph-adm:ceph-26 ~]# ceph daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"

We observe that the ceph daemon command cannot be used in its simple form 
whenever an --id argument is present. This, unfortunately, creates an 
unnecessary restriction: we can't use non-admin users any more. Here is why 
it fails:

[root@ceph-adm:ceph-26 ~]# strace ceph daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.ceph-26.asok", {st_mode=S_IFSOCK|0755, st_size=0, ...}) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0

[root@ceph-adm:ceph-26 ~]# strace ceph --id admin daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.admin.asok", 0x7fffa65e9f00) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.admin.asok"}, 35) = -1 ENOENT (No such file or directory)

As you can see, the daemon name "ceph-26" was replaced with the user name 
"admin" passed as the argument to --id. As a result, the command looks for a 
non-existent socket file. Passing the full path "fixes" this. This is clearly 
a bug and I wonder if there is a way out, for example, by setting an explicit 
daemon path template in the config.
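
As a stopgap, the full-path form that works in the session above can be 
wrapped in a small shell helper. This is only a sketch: the function names 
asok_path and ceph_daemon_as are made up for illustration, and the directory 
/var/run/ceph and the "ceph-" file name prefix are assumptions taken from the 
strace output on this one host, so they may differ on other setups.

```shell
# Hypothetical helper: map a daemon name such as "mon.ceph-26" to the
# socket path seen in the strace output. The directory and the "ceph-"
# prefix are assumptions based on this host, not a general rule.
asok_path() {
    printf '/var/run/ceph/ceph-%s.asok\n' "$1"
}

# Hypothetical wrapper: run "ceph --id USER daemon" with the full socket
# path instead of the daemon name, which is the form that worked above.
ceph_daemon_as() {
    if [ "$#" -lt 3 ]; then
        echo "usage: ceph_daemon_as USER DAEMON COMMAND [ARGS...]" >&2
        return 2
    fi
    cda_user="$1"
    cda_daemon="$2"
    shift 2
    ceph --id "$cda_user" daemon "$(asok_path "$cda_daemon")" "$@"
}
```

With this in place, "ceph_daemon_as admin mon.ceph-26 version | jq .release" 
should issue the same call as the full-path command shown earlier. It does not 
address the underlying name substitution, it only avoids typing the path.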

I will open a tracker issue if a user on quincy or newer confirms that this is 
present in newer versions as well. I wonder if this is fallout from 
https://docs.ceph.com/en/latest/releases/pacific/#id39 Point 3: "$pid expansion 
in config paths like admin_socket will now properly expand to the daemon pid 
for commands like ceph-mds or ceph-osd. Previously only ceph-fuse/rbd-nbd 
expanded $pid with the actual daemon pid."

Thanks for any pointers on how to work around this issue.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14