One more thing I forgot to mention: all of the failures I've seen happen
while a deep scrubbing process is running.
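
If the correlation with deep scrubbing holds, one way to test it might be
to pause deep scrubs for a day or two and see whether the osds still get
marked down (a rough sketch using the standard cluster flags; worth
double-checking the flag names on this release):

    # temporarily stop new deep scrubs cluster-wide
    ceph osd set nodeep-scrub

    # ... watch whether osds still get marked down ...

    # re-enable deep scrubbing afterwards
    ceph osd unset nodeep-scrub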

Andrei 
----- Original Message -----

> From: "Andrei Mikhailovsky" <and...@arhont.com>
> To: sj...@redhat.com
> Cc: ceph-users@lists.ceph.com
> Sent: Thursday, 27 November, 2014 1:05:30 PM
> Subject: Re: [ceph-users] Giant upgrade - stability issues

> Sam,

> I've done more network testing, this time over two days, and I believe I
> have enough evidence to conclude that the osd disconnects are not
> caused by the network. I ran about 140 million TCP connections against
> each osd and host server over the course of about two days, generating
> about 800-900 connections per second. I did not see a single error or
> dropped packet, and the latency and standard deviation were minimal.

> While the tests were running I did see a number of osds being marked
> down by other osds. According to the logs this happened at least three
> times over the two days. However, this time the cluster IO remained
> available; the osds simply rejoined with the message that they had been
> wrongly marked down.
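
> It may be worth correlating the timestamps of those down events with the
> scrub activity in the cluster log, something like the following (a rough
> sketch assuming the default log location; the exact message wording can
> vary between releases):

>     # when were osds reported down or wrongly marked down?
>     grep -E "marked down|wrongly marked" /var/log/ceph/ceph.log
>     # were deep scrubs running around those times?
>     grep "deep-scrub" /var/log/ceph/ceph.log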

> I was not able to enable full debug logging on the cluster as it would
> have filled the available disk space in less than 30 minutes, so I am
> not really sure how to debug this particular problem.
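
> One compromise might be to raise the levels Sam suggested on just a
> couple of the suspect osds, and only for a short window around a
> failure, then drop them again (a sketch using injectargs; the osd id and
> the reset levels below are just examples):

>     # raise logging on one suspect osd for a short window
>     ceph tell osd.12 injectargs '--debug-osd 20 --debug-ms 20 --debug-filestore 20'
>     # ... wait for the osd to be marked down again, grab the log ...
>     # then drop the levels back down (check your own defaults)
>     ceph tell osd.12 injectargs '--debug-osd 0/5 --debug-ms 0/5 --debug-filestore 1/3'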

> What I have done is reboot both osd servers, and so far I have not seen
> any osd disconnects; the servers have been up for three days already.
> Perhaps the problem comes down to kernel stability, but if that were the
> case I would have expected similar issues on Firefly, which I did not
> see. Not sure what to think now.

> Andrei
> ----- Original Message -----

> > From: "Andrei Mikhailovsky" <and...@arhont.com>
> 
> > To: sj...@redhat.com
> 
> > Cc: ceph-users@lists.ceph.com
> 
> > Sent: Thursday, 20 November, 2014 4:50:21 PM
> 
> > Subject: Re: [ceph-users] Giant upgrade - stability issues
> 

> > Thanks, I will try that.
> 

> > Andrei
> 
> > ----- Original Message -----
> 

> > From: "Samuel Just" <sam.j...@inktank.com>
> 
> > To: "Andrei Mikhailovsky" <and...@arhont.com>
> 
> > Cc: ceph-users@lists.ceph.com
> 
> > Sent: Thursday, 20 November, 2014 4:26:00 PM
> 
> > Subject: Re: [ceph-users] Giant upgrade - stability issues
> 

> > You can try to capture logging at
> 

> > debug osd = 20
> 
> > debug ms = 20
> 
> > debug filestore = 20
> 

> > while an osd is misbehaving.
> 
> > -Sam
> 

> > On Thu, Nov 20, 2014 at 7:34 AM, Andrei Mikhailovsky
> > <and...@arhont.com> wrote:
> > > Sam,
> > >
> > > Further to your email, I have done the following:
> > >
> > > 1. Upgraded both osd servers with the latest updates and restarted
> > > each server in turn.
> > >
> > > 2. Fired up the nping utility to generate TCP connections (3-way
> > > handshakes) from each of the osd servers as well as from the host
> > > servers. In total I ran 5 tests. The nping utility was establishing
> > > connections on port 22 (as all servers have this port open) with a
> > > delay of 1ms. The command used to generate the traffic was as follows:
> > >
> > > nping --tcp-connect -p 22 --delay 1ms <hostname> -v2 -c 36000000 | gzip >/root/nping-hostname-output.gz
> > >
> > > The tests took just over 12 hours to complete. The results did not
> > > show any problems as far as I can see. Here is the tailed output of
> > > one of the runs:
> > >
> > > SENT (37825.7303s) Starting TCP Handshake > arh-ibstorage1-ib:22 (192.168.168.200:22)
> > > RECV (37825.7303s) Handshake with arh-ibstorage1-ib:22 (192.168.168.200:22) completed
> > >
> > > Max rtt: 4.447ms | Min rtt: 0.008ms | Avg rtt: 0.008ms
> > > TCP connection attempts: 36000000 | Successful connections: 36000000 | Failed: 0 (0.00%)
> > > Tx time: 37825.72833s | Tx bytes/s: 76138.65 | Tx pkts/s: 951.73
> > > Rx time: 37825.72939s | Rx bytes/s: 38069.33 | Rx pkts/s: 951.73
> > > Nping done: 1 IP address pinged in 37844.55 seconds
> > >
> > > As you can see from the above, there were no failed connections at
> > > all out of the 36 million established connections. The average delay
> > > was 0.008ms, and it was sending on average almost 1000 packets per
> > > second. I got the same results from the other servers.
> > >
> > > Unless you have other tests in mind, I think there are no issues with
> > > the network.
> > >
> > > I will fire up another test, for 24 hours this time, to see if it
> > > makes a difference.
> > >
> > > Thanks
> > >
> > > Andrei
> > >
> > >
> > > ________________________________
> > > From: "Samuel Just" <sam.j...@inktank.com>
> 
> > > To: "Andrei Mikhailovsky" <and...@arhont.com>
> 
> > > Cc: ceph-users@lists.ceph.com
> 
> > > Sent: Wednesday, 19 November, 2014 9:45:40 PM
> 
> > >
> 
> > > Subject: Re: [ceph-users] Giant upgrade - stability issues
> 
> > >
> 
> > > Well, the heartbeats are failing due to networking errors
> > > preventing
> 
> > > the heartbeats from arriving. That is causing osds to go down,
> > > and
> 
> > > that is causing pgs to become degraded. You'll have to work out
> > > what
> 
> > > is preventing the tcp connections from being stable.
> 
> > > -Sam
> 
> > >
> 
> > > On Wed, Nov 19, 2014 at 1:39 PM, Andrei Mikhailovsky
> > > <and...@arhont.com>
> 
> > > wrote:
> > >>
> > >>> You indicated that osd 12 and 16 were the ones marked down, but it
> > >>> looks like only 0,1,2,3,7 were marked down in the ceph.log you sent.
> > >>> The logs for 12 and 16 did indicate that they had been partitioned
> > >>> from the other nodes. I'd bet that you are having intermittent
> > >>> network trouble since the heartbeats are intermittently failing.
> > >>> -Sam
> > >>
> > >> AM: I will check the logs further for osds 12 and 16. Perhaps I've
> > >> missed something, but the ceph osd tree output was showing 12 and 16
> > >> as down.
> > >>
> > >> Regarding the failure of the heartbeats, Wido has suggested that I
> > >> should investigate the reason for their failure. The obvious thing to
> > >> look at is the network, and this is what I've initially done. However,
> > >> I do not see any signs of network issues. There are no errors on the
> > >> physical interface, and ifconfig is showing a very small number of TX
> > >> dropped packets (0.00006%) and 0 errors:
> > >>
> > >> # ifconfig ib0
> > >> ib0    Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
> > >>        inet addr:192.168.168.200  Bcast:192.168.168.255  Mask:255.255.255.0
> > >>        inet6 addr: fe80::223:7dff:ff94:e2a5/64 Scope:Link
> > >>        UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
> > >>        RX packets:1812895801 errors:0 dropped:52 overruns:0 frame:0
> > >>        TX packets:1835002992 errors:0 dropped:1037 overruns:0 carrier:0
> > >>        collisions:0 txqueuelen:2048
> > >>        RX bytes:6252740293262 (6.2 TB)  TX bytes:11343307665152 (11.3 TB)
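> > >>
> > >> Interface counters do not always show TCP-level trouble, so I could
> > >> also look at the kernel's TCP retransmission statistics on each osd
> > >> host (a generic check, not ceph-specific):
> > >>
> > >>     # segments retransmitted since boot; run a few times and compare
> > >>     netstat -s | grep -i retrans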
> > >>
> > >> How would I investigate what is happening with the heartbeats and the
> > >> reason for their failures? I have a suspicion that solving this will
> > >> also solve the issues with the frequent reporting of degraded PGs on
> > >> the cluster and the intermittent high levels of IO wait on the vms.
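> > >>
> > >> One place to start looking might be the osd logs themselves, since
> > >> the osds that report a peer down also log which heartbeats went
> > >> unanswered. A sketch (the exact log path and message text may differ
> > >> slightly on this version):
> > >>
> > >>     # on each osd server, look for heartbeat failures in the osd logs
> > >>     grep -i "heartbeat_check: no reply" /var/log/ceph/ceph-osd.*.log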
> > >>
> > >> And also, as I've previously mentioned, the issues only started to
> > >> happen after the upgrade to Giant. I've not had these problems with
> > >> the Firefly, Emperor or Dumpling releases on the same hardware and
> > >> with the same cluster load.
> > >>
> > >> Thanks
> > >>
> > >> Andrei
> > >>
> > >>
> > >> On Tue, Nov 18, 2014 at 3:34 PM, Andrei Mikhailovsky <and...@arhont.com> wrote:
> > >>> Sam,
> > >>>
> > >>> Pastebin or similar will not take tens of megabytes worth of logs. If
> > >>> we are talking about the debug_ms 10 setting, I've got about 7GB of
> > >>> logs generated every half an hour or so. Not really sure what to do
> > >>> with that much data. Anything more constructive?
> > >>>
> > >>> Thanks
> > >>> ________________________________
> > >>> From: "Samuel Just" <sam.j...@inktank.com>
> > >>> To: "Andrei Mikhailovsky" <and...@arhont.com>
> > >>> Cc: ceph-users@lists.ceph.com
> > >>> Sent: Tuesday, 18 November, 2014 8:53:47 PM
> > >>>
> > >>> Subject: Re: [ceph-users] Giant upgrade - stability issues
> > >>>
> > >>> Pastebin or something, probably.
> > >>> -Sam
> > >>>
> > >>> On Tue, Nov 18, 2014 at 12:34 PM, Andrei Mikhailovsky <and...@arhont.com> wrote:
> > >>>> Sam, the logs are rather large. Where should I post them?
> > >>>>
> > >>>> Thanks
> > >>>> ________________________________
> > >>>> From: "Samuel Just" <sam.j...@inktank.com>
> > >>>> To: "Andrei Mikhailovsky" <and...@arhont.com>
> > >>>> Cc: ceph-users@lists.ceph.com
> > >>>> Sent: Tuesday, 18 November, 2014 7:54:56 PM
> > >>>> Subject: Re: [ceph-users] Giant upgrade - stability issues
> > >>>>
> > >>>> Ok, why is ceph marking osds down? Post your ceph.log from one of
> > >>>> the problematic periods.
> > >>>> -Sam
> > >>>>
> > >>>> On Tue, Nov 18, 2014 at 1:35 AM, Andrei Mikhailovsky <and...@arhont.com> wrote:
> > >>>>> Hello cephers,
> > >>>>>
> > >>>>> I need your help and suggestions on what is going on with my
> > >>>>> cluster. A few weeks ago I upgraded from Firefly to Giant. I've
> > >>>>> previously written about having issues with Giant where, in a
> > >>>>> two-week period, the cluster's IO froze three times after ceph
> > >>>>> marked two osds down. I have in total just 17 osds across two osd
> > >>>>> servers, plus 3 mons. The cluster is running on Ubuntu 12.04 with
> > >>>>> the latest updates.
> > >>>>>
> > >>>>> I've got zabbix agents monitoring the osd servers and the cluster,
> > >>>>> and I get alerts about any issues, such as problems with PGs, etc.
> > >>>>> Since upgrading to Giant, I am frequently seeing emails alerting me
> > >>>>> that the cluster has degraded PGs. I am getting around 10-15 such
> > >>>>> emails per day. The number of degraded PGs varies from a couple of
> > >>>>> PGs to over a thousand. After several minutes the cluster repairs
> > >>>>> itself. The total number of PGs in the cluster is 4412 across all
> > >>>>> the pools.
> > >>>>>
> > >>>>> I am also seeing more alerts from vms reporting high IO wait, and
> > >>>>> I am seeing hung tasks as well. Some vms report over 50% IO wait.
> > >>>>>
> > >>>>> This has not happened on Firefly or the previous releases of ceph.
> > >>>>> Not much has changed in the cluster since the upgrade to Giant.
> > >>>>> The networking and hardware are still the same, and it is still
> > >>>>> running the same version of the Ubuntu OS. The cluster load hasn't
> > >>>>> changed either. Thus, I think the issues above are related to the
> > >>>>> upgrade of ceph to Giant.
> > >>>>>
> > >>>>> Here is the ceph.conf that I use:
> > >>>>>
> > >>>>> [global]
> > >>>>> fsid = 51e9f641-372e-44ec-92a4-b9fe55cbf9fe
> > >>>>> mon_initial_members = arh-ibstorage1-ib, arh-ibstorage2-ib, arh-cloud13-ib
> > >>>>> mon_host = 192.168.168.200,192.168.168.201,192.168.168.13
> > >>>>> auth_supported = cephx
> > >>>>> osd_journal_size = 10240
> > >>>>> filestore_xattr_use_omap = true
> > >>>>> public_network = 192.168.168.0/24
> > >>>>> rbd_default_format = 2
> > >>>>> osd_recovery_max_chunk = 8388608
> > >>>>> osd_recovery_op_priority = 1
> > >>>>> osd_max_backfills = 1
> > >>>>> osd_recovery_max_active = 1
> > >>>>> osd_recovery_threads = 1
> > >>>>> filestore_max_sync_interval = 15
> > >>>>> filestore_op_threads = 8
> > >>>>> filestore_merge_threshold = 40
> > >>>>> filestore_split_multiple = 8
> > >>>>> osd_disk_threads = 8
> > >>>>> osd_op_threads = 8
> > >>>>> osd_pool_default_pg_num = 1024
> > >>>>> osd_pool_default_pgp_num = 1024
> > >>>>> osd_crush_update_on_start = false
> > >>>>>
> > >>>>> [client]
> > >>>>> rbd_cache = true
> > >>>>> admin_socket = /var/run/ceph/$name.$pid.asok
> > >>>>>
> > >>>>> I would like to get to the bottom of these issues. I am not sure
> > >>>>> whether they could be fixed by changing some settings in ceph.conf
> > >>>>> or whether I need a full downgrade back to Firefly. Is a downgrade
> > >>>>> even possible on a production cluster?
> > >>>>>
> > >>>>> Thanks for your help
> > >>>>>
> > >>>>> Andrei
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
