Cristian, I'm not sure offhand what's up, but can you increase the logging levels and then rerun the test:
http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/

See the "Runtime" section for injecting the logging arguments after the daemons have started, or change the {cluster}.conf (e.g. /etc/ceph/ceph.conf) settings with a stanza like:

    [osd]
        debug osd = 20/20
        debug journal = 20/20
        debug monc = 20/20
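For the runtime route, something along these lines should do it - a rough sketch only, assuming osd.1 is the daemon that keeps getting marked down (swap in whatever id applies on your cluster):

    # bump the debug levels on the running OSD without restarting it
    ceph tell osd.1 injectargs '--debug-osd 20/20 --debug-journal 20/20 --debug-monc 20/20'

Remember to drop the levels back down once you've caught the failure - 20/20 logging is very chatty and can eat disk quickly.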
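Also, a hunch about the error itself: the monitor marks an OSD down when it hasn't received PG stat reports from it for longer than the report timeout (900 seconds by default, if I remember right), and your 904 seconds is just past that - so osd.1 was most likely hung or badly stalled for those 15 minutes rather than cleanly crashed. While you rerun the test it's worth watching the OSD's own log and the kernel log on that node - a rough checklist, assuming the default paths:

    # on the node hosting osd.1 - follow the OSD's own log
    tail -f /var/log/ceph/ceph-osd.1.log

    # is the ceph-osd process still alive while the mon reports it down?
    ps aux | grep ceph-osd

    # disk errors, controller resets or OOM kills often show up here
    dmesg | tail -n 100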
~~shane

On 6/21/15, 8:22 AM, "ceph-users on behalf of Cristian Falcas" <ceph-users-boun...@lists.ceph.com on behalf of cristi.fal...@gmail.com> wrote:

These are the logs from that moment from "ceph -w":

2015-06-21 14:09:34.542891 mon.0 [INF] pgmap v172617: 512 pgs: 512 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 0 B/s rd, 12695 B/s wr, 3 op/s
2015-06-21 14:09:39.544302 mon.0 [INF] pgmap v172618: 512 pgs: 512 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 103 kB/s rd, 9419 B/s wr, 22 op/s
2015-06-21 14:09:44.544762 mon.0 [INF] pgmap v172619: 512 pgs: 512 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 209 kB/s rd, 9009 B/s wr, 28 op/s
2015-06-21 14:09:38.489980 osd.0 [INF] 8.4c scrub starts
2015-06-21 14:09:39.143548 osd.0 [INF] 8.4c scrub ok
2015-06-21 14:09:39.490283 osd.0 [INF] 8.4d scrub starts
2015-06-21 14:09:40.170572 osd.0 [INF] 8.4d scrub ok
2015-06-21 14:09:41.490652 osd.0 [INF] 8.4e scrub starts
2015-06-21 14:09:42.269054 osd.0 [INF] 8.4e scrub ok
2015-06-21 14:09:44.491206 osd.0 [INF] 8.4f scrub starts
2015-06-21 14:09:45.213658 osd.0 [INF] 8.4f scrub ok
2015-06-21 14:09:49.629596 mon.0 [INF] pgmap v172620: 512 pgs: 512 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 104 kB/s rd, 8528 B/s wr, 7 op/s
2015-06-21 14:09:54.630316 mon.0 [INF] pgmap v172621: 512 pgs: 512 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 0 B/s rd, 20306 B/s wr, 5 op/s
2015-06-21 14:09:55.443987 mon.0 [INF] osd.1 marked down after no pg stats for 904.221819seconds
2015-06-21 14:09:55.453660 mon.0 [INF] osdmap e122: 2 osds: 1 up, 2 in
2015-06-21 14:09:55.458644 mon.0 [INF] pgmap v172622: 512 pgs: 128 stale+active+clean, 384 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 0 B/s rd, 28136 B/s wr, 8 op/s
2015-06-21 14:09:47.491759 osd.0 [INF] 8.50 scrub starts
2015-06-21 14:09:48.574902 osd.0 [INF] 8.50 scrub ok
2015-06-21 14:09:48.575136 osd.0 [INF] 8.50 scrub starts
2015-06-21 14:09:48.678662 osd.0 [INF] 8.50 scrub ok
2015-06-21 14:09:52.575940 osd.0 [INF] 8.51 scrub starts
2015-06-21 14:09:53.314203 osd.0 [INF] 8.51 scrub ok
2015-06-21 14:09:59.650334 mon.0 [INF] pgmap v172623: 512 pgs: 128 stale+active+clean, 384 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 2359 kB/s rd, 12272 B/s wr, 69 op/s
2015-06-21 14:10:04.633154 mon.0 [INF] pgmap v172624: 512 pgs: 128 stale+active+clean, 384 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 1286 kB/s rd, 20523 B/s wr, 41 op/s
2015-06-21 14:10:00.578299 osd.0 [INF] 8.52 scrub starts
2015-06-21 14:10:01.172525 osd.0 [INF] 8.52 scrub ok
2015-06-21 14:10:02.578690 osd.0 [INF] 8.53 scrub starts
2015-06-21 14:10:03.178836 osd.0 [INF] 8.53 scrub ok
2015-06-21 14:10:09.634306 mon.0 [INF] pgmap v172625: 512 pgs: 128 stale+active+clean, 384 active+clean; 502 GB data, 183 GB used, 2279 GB / 2469 GB avail; 0 B/s rd, 24171 B/s wr, 4

On Sun, Jun 21, 2015 at 6:19 PM, Cristian Falcas <cristi.fal...@gmail.com> wrote:

Hello,

When doing a fio test on a VM, after some time the OSD goes down with this error:

osd.1 marked down after no pg stats for 904.221819seconds

Can anyone help me with this error? I can't find any errors on the physical machine at that time. Only one VM is running, the one with the fio test. This is also repeatable: if I reboot the VM and restart the test, the OSD goes down again after a while.

Version used:

# rpm -qa | grep ceph | sort
ceph-0.94.2-0.el7.centos.x86_64
ceph-common-0.94.2-0.el7.centos.x86_64
libcephfs1-0.94.2-0.el7.centos.x86_64
python-ceph-compat-0.94.2-0.el7.centos.x86_64
python-cephfs-0.94.2-0.el7.centos.x86_64

I don't know if it matters, but the physical machine is an all-in-one ceph + OpenStack installation.

Thank you,
Cristian Falcas