> On 26 April 2016 at 22:31, Andrei Mikhailovsky <and...@arhont.com> wrote:
>
> Hi Wido,
>
> Thanks for your reply. We have a very simple Ceph network: a single 40 Gbit/s
> InfiniBand switch to which the OSD servers and hosts are connected. There are
> no default gateways on the storage network. The IB is used only for Ceph;
> everything else goes over Ethernet.
>
> I've checked the stats on the IB interfaces of the OSD servers and there are
> no errors. The IPoIB interface has a very small number of dropped packets
> (0.0003%).
>
> What kind of network tests would you suggest I run? What do you mean by
> "I would suggest that you check if the network towards clients is also OK."?
> By clients do you mean the host servers?
>

With clients I mean that you verify whether the hosts talking to the Ceph
cluster can reach each machine running OSDs. In my case there was packet loss
from certain clients, which caused the issues to occur.

Wido
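
A rough way to run that check is a short loop from every client/hypervisor
towards the OSD hosts; the host names below are placeholders, and the
sub-second interval may require root:

    # report packet loss from this client towards each OSD host
    for host in ceph-osd1 ceph-osd2 ceph-osd3; do
        echo "== $host =="
        ping -q -c 1000 -i 0.1 "$host" | grep 'packet loss'
    done

Running the same loop in the other direction, from each OSD server back towards
the hypervisors, helps catch asymmetric problems such as a broken return route.
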
> Many thanks
>
> Andrei
>
> ----- Original Message -----
> > From: "Wido den Hollander" <w...@42on.com>
> > To: "ceph-users" <ceph-users@lists.ceph.com>, "Andrei Mikhailovsky" <and...@arhont.com>
> > Sent: Tuesday, 26 April, 2016 21:17:59
> > Subject: Re: [ceph-users] Hammer broke after adding 3rd osd server
> >
> >> On 26 April 2016 at 17:52, Andrei Mikhailovsky <and...@arhont.com> wrote:
> >>
> >> Hello everyone,
> >>
> >> I've recently performed a hardware upgrade on our small two-OSD-server Ceph
> >> cluster, which seems to have broken the Ceph cluster. We are using Ceph for
> >> CloudStack RBD images for VMs. All of our servers run Ubuntu 14.04 LTS with
> >> the latest updates and kernel 4.4.6 from the Ubuntu repo.
> >>
> >> Previous hardware:
> >>
> >> 2 x OSD servers, each with 9 SAS OSDs, 32 GB RAM, a 12-core Intel 2620 CPU
> >> @ 2 GHz and 2 consumer SSDs for journals. InfiniBand 40 Gbit/s networking
> >> using IPoIB.
> >>
> >> The following things were upgraded:
> >>
> >> 1. Journal SSDs were upgraded from consumer SSDs to Intel 3710 200 GB. We
> >>    now have 5 OSDs per single SSD.
> >> 2. Added an additional OSD server with 64 GB RAM, 10 OSDs and an Intel 2670
> >>    CPU @ 2.6 GHz.
> >> 3. Upgraded the RAM on the existing OSD servers to 64 GB.
> >> 4. Installed an additional OSD disk to have 10 OSDs per server.
> >>
> >> After adding the third OSD server and finishing the initial sync, the
> >> cluster worked okay for 1-2 days. No issues were noticed. On the third day
> >> my monitoring system started reporting a bunch of issues from the Ceph
> >> cluster as well as from our virtual machines. This tends to happen between
> >> 7:20am and 7:40am and lasts for about 2-3 hours before things become normal
> >> again. I've checked the OSD servers and there is nothing that I could find
> >> in cron or otherwise that starts around 7:20am.
> >>
> >> The problem is as follows: the new OSD server's load goes to 400+ with
> >> ceph-osd processes consuming all CPU resources. "ceph -w" shows a high
> >> number of slow requests which relate to OSDs belonging to the new OSD
> >> server. The log files show the following:
> >>
> >> 2016-04-20 07:39:04.346459 osd.7 192.168.168.200:6813/2650 2 : cluster [WRN]
> >> slow request 30.032033 seconds old, received at 2016-04-20 07:38:34.314014:
> >> osd_op(client.140476549.0:13203438 rbd_data.2c9de71520eedd1.0000000000000621
> >> [stat,set-alloc-hint object_size 4194304 write_size 4194304,write 2572288~4096]
> >> 5.6c3bece2 ack+ondisk+write+known_if_redirected e83912) currently waiting for
> >> subops from 22
> >> 2016-04-20 07:39:04.346465 osd.7 192.168.168.200:6813/2650 3 : cluster [WRN]
> >> slow request 30.031878 seconds old, received at 2016-04-20 07:38:34.314169:
> >> osd_op(client.140476549.0:13203439 rbd_data.2c9de71520eedd1.0000000000000621
> >> [stat,set-alloc-hint object_size 4194304 write_size 4194304,write 1101824~8192]
> >> 5.6c3bece2 ack+ondisk+write+known_if_redirected e83912) currently waiting for
> >> rw locks
> >>
> >> Practically every OSD is involved in the slow requests, and they tend to be
> >> between the old two OSD servers and the new one. There were no issues as far
> >> as I can see between the old two servers.
> >>
> >> The first thing I checked was the networking. No issue was identified from
> >> running "ping -i .1 <servername>" or from using hping3 for TCP connection
> >> checks. The network tests were running for over a week and not a single
> >> packet was lost. The slow requests took place while the network tests were
> >> running.
> >>
> >> I've also checked the OSD and SSD disks and was not able to identify
> >> anything problematic.
> >>
> >> Stopping all OSDs on the new server causes no issues between the old two
> >> OSD servers. I've left the new server disconnected for a few days and had
> >> no issues with the cluster.
> >>
> >> I am a bit lost on what else to try and how to debug the issue. Could
> >> someone please help me?
> >>
> >
> > I would still say this is a network issue.
> >
> > "currently waiting for rw locks" is usually a network problem.
> >
> > I found this out myself a few weeks ago:
> > http://blog.widodh.nl/2016/01/slow-requests-with-ceph-waiting-for-rw-locks/
> >
> > The problem there was a wrong gateway on some machines.
> >
> > In that situation the OSDs could talk just fine, but they had problems with
> > sending traffic back to the clients, which led to buffers filling up.
> >
> > I would suggest that you check if the network towards clients is also OK.
> >
> > Wido
> >
> >> Many thanks
> >>
> >> Andrei
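
For the wrong-gateway situation Wido describes above, a quick sanity check is
to confirm, on each OSD server, which route would carry reply traffic back to a
client. The client address and the "ib0" interface name below are placeholders
for this cluster's actual values:

    # run on each OSD server; 192.168.168.50 stands in for a hypervisor's storage IP
    ip route get 192.168.168.50   # reply path should be the directly connected IPoIB route, no gateway hop
    ip -s link show ib0           # RX/TX error and drop counters on the IPoIB port

If the reply path unexpectedly goes out via a gateway or a different interface,
that matches the buffer-filling behaviour described in the blog post above.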