I'm not certain what the correct behavior should be in this case, so
maybe it is not a bug, but here is what is happening:

When an OSD becomes full, a process fails; we then unmount the rbd and
attempt to remove the lock that the process held on it.  The unmount
works fine, but removing the lock is failing right now because the
list_lockers() function call never returns.
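For context, the lock-removal step looks roughly like the sketch below. The helper and function names here are illustrative, not our exact code; break_lock() is the librbd binding for removing a lock held by another client:

```python
def lockers_to_break(lockers_info):
    """Pull (client, cookie) pairs out of a list_lockers() result dict."""
    return [(client, cookie)
            for client, cookie, addr in lockers_info.get('lockers', [])]

def remove_image_locks(ioctx, image_name):
    """Break every lock currently held on an image."""
    import rbd  # imported here so lockers_to_break() is usable without a cluster
    with rbd.Image(ioctx, image_name) as image:
        info = image.list_lockers()  # this is the call that hangs for us
        for client, cookie in lockers_to_break(info):
            image.break_lock(client, cookie)
```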

Here is a code snippet I tried with a fake rbd lock on a test cluster:

import rados
import rbd

with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
    with cluster.open_ioctx('rbd') as ioctx:
        with rbd.Image(ioctx, 'msd1') as image:
            image.list_lockers()  # hangs here

The call never returns, even after the ceph cluster returns to a
healthy state.  The only indication of a problem is this line in
/var/log/messages:

Jul 11 23:25:05 node-172-16-0-13 python: 2013-07-11 23:25:05.826793
7ffc66d72700  0 client.6911.objecter  FULL, paused modify
0x7ffc687c6050 tid 2

Any help would be greatly appreciated.

ceph version:

ceph version 0.61.4 (1669132fcfc27d0c0b5e5bb93ade59d147e23404)
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
