/etc/libvirt/qemu.conf:
max_files=XXXX

I expect this should always work, even on systemd-b0rked systems... Only solves the problem for QEMU, not for other librbd users.
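A minimal sketch of the change (32768 is an arbitrary example value, not a
tested recommendation; scale it to your OSD count and volumes per VM):

  # raise the per-guest fd limit for QEMU processes spawned by libvirt
  echo 'max_files = 32768' >> /etc/libvirt/qemu.conf
  systemctl restart libvirtd   # running guests must be stopped/started to pick it up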
Jan

> On 03 Sep 2015, at 14:48, Vasiliy Angapov <anga...@gmail.com> wrote:
>
> And what should those of us with systemd do? Because systemd totally
> ignores limits.conf and manages limits on a per-service basis...
> Which services should actually be tuned WRT LimitNOFILE?
> Or should DefaultLimitNOFILE be increased in /etc/systemd/system.conf?
>
> Thanks in advance!
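For the per-service route, a minimal sketch of a drop-in (assuming libvirtd
is the service to tune; 65536 is an arbitrary example value):

  mkdir -p /etc/systemd/system/libvirtd.service.d
  printf '[Service]\nLimitNOFILE=65536\n' \
      > /etc/systemd/system/libvirtd.service.d/limits.conf
  systemctl daemon-reload
  systemctl restart libvirtd

Whether the QEMU children actually inherit this depends on how libvirt
launches them, which is why the qemu.conf max_files route above is the more
direct fix. The global alternative is DefaultLimitNOFILE in
/etc/systemd/system.conf, which needs a reboot (or systemctl daemon-reexec)
to take effect.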
> 2015-09-03 17:46 GMT+08:00 Jan Schermer <j...@schermer.cz>:
> You're like the 5th person here (including me) that has been hit by this.
>
> Could I get some input from someone using Ceph with RBD and thousands of
> OSDs? How high did you have to go?
>
> I only have ~200 OSDs and I had to bump the limit up to 10000 for VMs that
> have multiple volumes attached; this doesn't seem right. I understand this
> is the effect of striping a volume across multiple PGs, but shouldn't this
> be more limited or somehow garbage-collected?
>
> And to get deeper - I suppose there will be one connection from QEMU to an
> OSD for each NCQ queue? Or how does this work? blk-mq will likely be
> different again... Or is it decoupled from the virtio side of things by
> the RBD cache, if that's enabled?
>
> Anyway, out of the box, at least on OpenStack installations:
> 1) anyone with more than a few OSDs should really bump this up by default.
> 2) librbd should handle this situation gracefully by recycling
> connections, instead of hanging.
> 3) at the very least we should get a warning somewhere (in the
> libvirt/qemu log) - I don't think there's anything when the issue hits.
>
> Should I make tickets for this?
>
> Jan
>
>> On 03 Sep 2015, at 02:57, Rafael Lopez <rafael.lo...@monash.edu> wrote:
>>
>> Hi Jan,
>>
>> Thanks for the advice, you hit the nail on the head.
>>
>> I checked the limits and watched the number of fds, and as it reached the
>> soft limit (1024) that's when the transfer came to a grinding halt and
>> the VM started locking up.
>>
>> After your reply I also did some more googling and found another old
>> thread:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-December/026187.html
>>
>> I increased max_files in qemu.conf and restarted libvirtd and the VM (as
>> per Dan's solution in the thread above), and now it seems to be happy
>> copying files of any size to the rbd. I confirmed the fd count is going
>> past the previous soft limit of 1024 as well.
>>
>> Thanks again!!
>> Raf
>>
>> On 2 September 2015 at 18:44, Jan Schermer <j...@schermer.cz> wrote:
>> 1) Take a look at the number of file descriptors the QEMU process is
>> using; I think you are over the limits.
>>
>> pid=$(pgrep -f qemu | head -1)   # or however you identify the qemu process
>>
>> cat /proc/$pid/limits
>> echo /proc/$pid/fd/* | wc -w
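To watch the count against the limit live while reproducing the hang (same
$pid as above, plain procfs):

  grep 'open files' /proc/$pid/limits               # soft and hard limits
  while sleep 5; do ls /proc/$pid/fd | wc -l; done  # live fd count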
>>
>> 2) Jumbo frames may be the cause: are they enabled on the rest of the
>> network? In any case, get rid of NetworkManager ASAP and set it manually,
>> though it looks like your NIC might not support them.
>>
>> Jan
>>
>> > On 02 Sep 2015, at 01:44, Rafael Lopez <rafael.lo...@monash.edu> wrote:
>> >
>> > Hi ceph-users,
>> >
>> > Hoping to get some help with a tricky problem. I have a rhel7.1 VM
>> > guest (the host machine is also rhel7.1) with its root disk presented
>> > from ceph 0.94.2-0 (rbd) using libvirt.
>> >
>> > The VM also has a second rbd for storage, presented from the same ceph
>> > cluster, also using libvirt.
>> >
>> > The VM boots fine, with no apparent issues on the OS root rbd. I am
>> > able to mount the storage disk in the VM and create a file system. I
>> > can even transfer small files to it. But when I try to transfer
>> > moderately sized files, e.g. greater than 1GB, it slows to a grinding
>> > halt and eventually locks up the whole system, generating the kernel
>> > messages below.
>> >
>> > I have googled some *similar* issues, but haven't come across any solid
>> > advice/fix. So far I have tried modifying the libvirt disk cache
>> > settings, the latest mainline kernel (4.2+), and different file systems
>> > (ext4, xfs, zfs); all produce similar results. I suspect it may be
>> > network related: when I was using the mainline kernel and transferring
>> > some files to the storage disk, this message came up and the transfer
>> > seemed to stop at the same time:
>> >
>> > Sep 1 15:31:22 nas1-rds NetworkManager[724]: <error> [1441085482.078646] [platform/nm-linux-platform.c:2133] sysctl_set(): sysctl: failed to set '/proc/sys/net/ipv6/conf/eth0/mtu' to '9000': (22) Invalid argument
>> >
>> > I think the key piece of troubleshooting info is that it seems to be OK
>> > for files under 1GB.
>> >
>> > Any ideas would be appreciated.
>> >
>> > Cheers,
>> > Raf
>> >
>> > Sep 1 16:04:15 nas1-rds kernel: INFO: task kworker/u8:1:60 blocked for more than 120 seconds.
>> > Sep 1 16:04:15 nas1-rds kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> > Sep 1 16:04:15 nas1-rds kernel: kworker/u8:1 D ffff88023fd93680 0 60 2 0x00000000
>> > Sep 1 16:04:15 nas1-rds kernel: Workqueue: writeback bdi_writeback_workfn (flush-252:80)
>> > Sep 1 16:04:15 nas1-rds kernel: ffff880230c136b0 0000000000000046 ffff8802313c4440 ffff880230c13fd8
>> > Sep 1 16:04:15 nas1-rds kernel: ffff880230c13fd8 ffff880230c13fd8 ffff8802313c4440 ffff88023fd93f48
>> > Sep 1 16:04:15 nas1-rds kernel: ffff880230c137b0 ffff880230fbcb08 ffffe8ffffd80ec0 ffff88022e827590
>> > Sep 1 16:04:15 nas1-rds kernel: Call Trace:
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff8160955d>] io_schedule+0x9d/0x130
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812b8d5f>] bt_get+0x10f/0x1a0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812b90ef>] blk_mq_get_tag+0xbf/0xf0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812b4f3b>] __blk_mq_alloc_request+0x1b/0x1f0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812b68a1>] blk_mq_map_request+0x181/0x1e0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812b7a1a>] blk_sq_make_request+0x9a/0x380
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812aa28f>] ? generic_make_request_checks+0x24f/0x380
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812aa4a2>] generic_make_request+0xe2/0x130
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff812aa561>] submit_bio+0x71/0x150
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffffa01ddc55>] ext4_io_submit+0x25/0x50 [ext4]
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffffa01dde09>] ext4_bio_write_page+0x159/0x2e0 [ext4]
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffffa01d4f6d>] mpage_submit_page+0x5d/0x80 [ext4]
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffffa01d5232>] mpage_map_and_submit_buffers+0x172/0x2a0 [ext4]
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffffa01da313>] ext4_writepages+0x733/0xd60 [ext4]
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff81162b6e>] do_writepages+0x1e/0x40
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff811efe10>] __writeback_single_inode+0x40/0x220
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff811f0b0e>] writeback_sb_inodes+0x25e/0x420
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff811f0d6f>] __writeback_inodes_wb+0x9f/0xd0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff811f15b3>] wb_writeback+0x263/0x2f0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff811f2aec>] bdi_writeback_workfn+0x1cc/0x460
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff8108f0ab>] process_one_work+0x17b/0x470
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff8108fe8b>] worker_thread+0x11b/0x400
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff8109726f>] kthread+0xcf/0xe0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
>> > Sep 1 16:04:15 nas1-rds kernel: [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
>>
>> --
>> Rafael Lopez
>> Data Storage Administrator
>> Servers & Storage (eSolutions)
>> +61 3 990 59118
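On the NetworkManager MTU error quoted above: a sketch of setting the MTU
by hand (eth0/9000 taken from that log line; only worth trying if the NIC
and every switch port in the path actually support jumbo frames):

  ip link set dev eth0 mtu 9000
  ip link show eth0   # check that the mtu value actually stuck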
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com