On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster
<daniel.vanders...@cern.ch> wrote:
> Hi,
> We just finished debugging RBD-backed Glance image creation failures and
> thought our workaround would be useful to others. We found that during an
> image upload, librbd on the Glance API server was consuming a very large
> number of processes, eventually hitting the 1024 nproc limit for non-root
> users in RHEL. The failure occurred when uploading to pools with 2048 PGs,
> but not when uploading to pools with 512 PGs. (We're guessing that librbd
> opens one thread per accessed PG and doesn't close those threads until the
> whole process completes.)
>
> If you hit the same problem (and you run RHEL like us), you'll need to
> modify at least /etc/security/limits.d/90-nproc.conf (adding the non-root
> user that should be allowed more than 1024 procs), and possibly also run
> ulimit -u in the init script of your client process. Ubuntu probably has
> similar limits.
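
A rough way to check the limit described above from inside a client process
is sketched below in Python, using the standard `resource` module. The helper
names `nproc_headroom` and `raise_nproc_soft_limit` are hypothetical, and the
sketch assumes Linux, where RLIMIT_NPROC is the limit that `ulimit -u`
reports and threads count against it. It is an illustration, not part of the
original workaround.

```python
import resource


def nproc_headroom(expected_threads):
    """Compare the current RLIMIT_NPROC soft limit with an expected thread count.

    Returns (soft, hard, ok), where ok is True if the soft limit is unlimited
    or at least expected_threads.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    ok = soft == resource.RLIM_INFINITY or soft >= expected_threads
    return soft, hard, ok


def raise_nproc_soft_limit(target):
    """Raise the soft nproc limit toward target, never above the hard limit.

    An unprivileged process may only raise its soft limit up to the hard
    limit; raising the hard limit itself needs the limits.d change (or root).
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NPROC)[0]


if __name__ == "__main__":
    # 2048 is only an illustrative target, not a recommended value.
    print(nproc_headroom(2048))
```

If `ok` comes back False for the thread count you expect, the soft limit has
to be raised before the client starts, either with `ulimit -u` in the init
script or with a call like `raise_nproc_soft_limit` early in the process.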

Did your pools with 2048 PGs have a significantly larger number of
OSDs in them? Or are both pools on a cluster with a lot of OSDs relative
to the PG counts?
The PG count shouldn't matter for this directly, but RBD (and other
clients) will create a couple of messenger threads for each OSD they talk
to, and while those threads eventually shut down when idle, the client
doesn't proactively close them. I'd expect this to become a problem at
around 500 OSDs.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
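
The arithmetic behind the "around 500 OSDs" estimate can be sketched in
Python as follows. This assumes "a couple" means exactly two messenger
threads per OSD and ignores every other thread the client owns; both are
assumptions, and the function names are hypothetical.

```python
def estimated_messenger_threads(num_osds, threads_per_osd=2):
    """Estimate messenger threads for a client that talks to num_osds OSDs.

    threads_per_osd=2 is an assumed reading of "a couple" per OSD.
    """
    return num_osds * threads_per_osd


def osds_until_limit(nproc_limit=1024, threads_per_osd=2, other_threads=0):
    """Estimate how many OSDs a client can talk to before reaching nproc_limit.

    other_threads is a placeholder for threads that are not per-OSD
    messenger threads; it defaults to 0 for simplicity.
    """
    return max(0, (nproc_limit - other_threads) // threads_per_osd)


print(estimated_messenger_threads(500))  # 1000, just under the 1024 limit
print(osds_until_limit())                # 512
```

Under these assumptions a client that has talked to roughly 500 OSDs sits
just below the default 1024 nproc limit, which matches the estimate above.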
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com