Good day! Thank you, but it's not clear to me what the bottleneck is here.
The candidates I can see are:

- hardware node: load average, disk IO;
- underlying file system problem on the OSD, or a bad disk;
- Ceph journal problem.

The Ceph OSD partition is part of a block device which has practically no load:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              12,00         0,00         0,12          0          0

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              12,00         0,00         0,14          0          0

The disk with the OSD is good; I just checked it, and it shows good read/write speed with appropriate IOPS and latency. But the hardware node is working hard and has a high load average. I fear that the ceph-osd process lacks resources.

Is there any way to fix it? Maybe raise some kind of timeout when syncing, give this OSD less weight, or something similar? Or is it better to move this OSD to another server? (A rough command sketch is appended after the quoted thread below.)

Regards, Artem Silenkov, 2GIS TM.
---
2GIS LLC
http://2gis.ru
a.silenkov at 2gis.ru
gtalk: artem.silenkov at gmail.com
cell: +79231534853

2013/6/5 Gregory Farnum <g...@inktank.com>

> This would be easier to see with a log than with all the GDB stuff, but
> the reference in the backtrace to "SyncEntryTimeout::finish(int)" tells
> me that the filesystem is taking too long to sync things to disk. Either
> this disk is bad or you're somehow subjecting it to a much heavier load
> than the others.
> -Greg
>
> On Wednesday, June 5, 2013, Artem Silenkov wrote:
>
>> Good day!
>>
>> I tried to nullify this OSD and reinject it, with no success. It works
>> for a little while, then crashes again.
>>
>> Regards, Artem Silenkov, 2GIS TM.
>> ---
>> 2GIS LLC
>> http://2gis.ru
>> a.silen...@2gis.ru
>> gtalk:artem.silen...@gmail.com
>> cell:+79231534853
>>
>>
>> 2013/6/5 Artem Silenkov <artem.silen...@gmail.com>
>>
>> Hello!
>> We have a simple setup as follows:
>>
>> Debian GNU/Linux 6.0 x64
>> Linux h08 2.6.32-19-pve #1 SMP Wed May 15 07:32:52 CEST 2013 x86_64
>> GNU/Linux
>>
>> ii  ceph            0.61.2-1~bpo60+1  distributed storage and file system
>> ii  ceph-common     0.61.2-1~bpo60+1  common utilities to mount and interact with a ceph storage cluster
>> ii  ceph-fs-common  0.61.2-1~bpo60+1  common utilities to mount and interact with a ceph file system
>> ii  ceph-fuse       0.61.2-1~bpo60+1  FUSE-based client for the Ceph distributed file system
>> ii  ceph-mds        0.61.2-1~bpo60+1  metadata server for the ceph distributed file system
>> ii  libcephfs1      0.61.2-1~bpo60+1  Ceph distributed file system client library
>> ii  libc-bin        2.11.3-4          Embedded GNU C Library: Binaries
>> ii  libc-dev-bin    2.11.3-4          Embedded GNU C Library: Development binaries
>> ii  libc6           2.11.3-4          Embedded GNU C Library: Shared libraries
>> ii  libc6-dev       2.11.3-4          Embedded GNU C Library: Development Libraries and Header Files
>>
>> All programs are running fine except osd.2, which is crashing repeatedly.
>> All the other nodes have the same operating system and a practically
>> identical system environment.
>>
>> #cat /etc/ceph/ceph.conf
>> [global]
>>     pid file = /var/run/ceph/$name.pid
>>     auth cluster required = none
>>     auth service required = none
>>     auth client required = none
>>     max open files = 65000
>>
>> [mon]
>> [mon.0]
>>     host = h01
>>     mon addr = 10.1.1.3:6789
>> [mon.1]
>>     host = h07
>>     mon addr = 10.1.1.10:6789
>> [mon.2]
>>     host = h08
>>     mon addr = 10.1.1.11:6789
>>
>> [mds]
>> [mds.3]
>>     host = h09
>> [mds.4]
>>     host = h06
>>
>> [osd]
>>     osd journal size = 10000
>>     osd journal = /var/lib/ceph/journal/$cluster-$id/journal
>>     osd mkfs type = xfs
>>
>> [osd.0]
>>     host = h01
>>     addr = 10.1.1.3
>>     devs = /dev/sda3
>> [osd.1]
>>     host = h07
>>     addr = 10.1.1.10
>>     devs = /dev/sda3
>> [osd.2]
>>     host = h08
>>     addr = 10.1.1.11
>>     devs = /dev/sda3
>> [osd.3]
>>     host = h09
>>     addr = 10.1.1.12
>>     devs = /dev/sda3
>> [osd.4]
>>     host = h06
>>     addr = 10.1.1.9
>>     devs = /dev/sda3
>>
>> ~#ceph osd tree
>>
>> # id    weight  type name               up/down reweight
>> -1      5       root default
>> -3      5               rack unknownrack
>> -2      1                       host h01
>> 0       1                               osd.0   up      1
>> -4      1                       host h07
>> 1       1                               osd.1   up      1
>> -5      1                       host h08
>> 2       1                               osd.2   down    0
>> -6      1                       host h09
>> 3       1                               osd.3   up      1
>> -7      1                       host h06
>> 4       1                               osd.4   up      1
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com
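P.S. To make the question concrete, here is a rough sketch of the two knobs I
have in mind, assuming (based on the SyncEntryTimeout backtrace) that the
relevant setting is "filestore commit timeout"; the values below are examples
only, not tested recommendations:

    # Give osd.2 less CRUSH weight so that less data and traffic is placed on it
    # (0.5 is just an example value)
    ceph osd crush reweight osd.2 0.5

    # Or temporarily mark it out so its PGs are remapped elsewhere while the
    # node is investigated
    ceph osd out 2

    # Raise the sync/commit timeout for this OSD in ceph.conf and restart it
    # (the default is 600 seconds, if I am not mistaken; 1200 is an example)
    [osd.2]
        filestore commit timeout = 1200

    # Or try injecting it at runtime while the daemon is up:
    ceph tell osd.2 injectargs '--filestore-commit-timeout 1200'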