Ilya, I will try doing that once again tonight, as this is a production cluster: when the dds trigger that dmesg error, the cluster's I/O becomes very bad and I have to reboot the server to get things back on track. Most of my VMs start showing 70-90% iowait until that server is rebooted.
I actually checked what you asked for the last time I ran the test. When I run 4 dds concurrently, nothing appears in the dmesg output. No messages at all. The kern.log file that I sent last time is what I got about a minute after I started 8 dds. I pasted the full output. The 8 dds did actually complete, but it took a rather long time. I was getting about 6MB/s per dd process, compared to around 70MB/s per dd process when 4 dds were running.

Do you still want me to run this, or is the information I've provided enough?

Cheers

Andrei

----- Original Message -----
> From: "Ilya Dryomov" <ilya.dryo...@inktank.com>
> To: "Andrei Mikhailovsky" <and...@arhont.com>
> Cc: "ceph-users" <ceph-users@lists.ceph.com>, "Gregory Farnum" <g...@gregs42.com>
> Sent: Monday, 1 December, 2014 8:22:08 AM
> Subject: Re: [ceph-users] Giant + nfs over cephfs hang tasks
>
> On Mon, Dec 1, 2014 at 12:30 AM, Andrei Mikhailovsky <and...@arhont.com> wrote:
> >
> > Ilya, further to your email I have switched back to the 3.18 kernel that
> > you've sent and I got similar looking dmesg output as I had on the 3.17
> > kernel. Please find it attached for your reference. As before, this is the
> > command I've ran on the client:
> >
> > time dd if=/dev/zero of=4G00 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G11 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G22 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G33 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G44 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G55 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G66 bs=4M count=5K oflag=direct &
> > time dd if=/dev/zero of=4G77 bs=4M count=5K oflag=direct &
>
> Can you run that command again - on 3.18 kernel, to completion - and paste
>
> - the entire dmesg
> - "time" results for each dd
>
> ?
>
> Compare those to your results with four dds (or any other number which
> doesn't trigger page allocation failures).
>
> Thanks,
>
> Ilya
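
For reference, the run requested above (N concurrent dds to completion, per-dd timing, then the full dmesg) could be scripted roughly as follows. This is only a sketch, not something posted in the thread: the function name `run_dds`, the `DD_FLAGS` override, and the log file names are all made up for illustration. It uses `date +%s` instead of the shell's `time` keyword so that it also works in a plain POSIX sh, which means the timing is in whole seconds only.

```shell
#!/bin/sh
# Sketch: run N concurrent dd writers, record elapsed time per dd, then save dmesg.
# run_dds, DD_FLAGS and the log file names are hypothetical, not from the thread.

run_dds() {
    n=$1        # number of concurrent dd processes, e.g. 4 or 8
    count=$2    # blocks per dd, e.g. 5K as in the original command
    bs=$3       # block size, e.g. 4M
    dir=$4      # directory on the mount under test

    # Default to direct I/O as in the original command; set DD_FLAGS to override.
    flags=${DD_FLAGS-oflag=direct}

    i=0
    while [ "$i" -lt "$n" ]; do
        (
            start=$(date +%s)
            # $flags is intentionally unquoted so an empty value adds no argument.
            dd if=/dev/zero of="$dir/4G$i$i" bs="$bs" count="$count" $flags 2> "$dir/dd$i.log"
            end=$(date +%s)
            echo "dd$i elapsed_s=$((end - start))" >> "$dir/dd$i.log"
        ) &
        i=$((i + 1))
    done

    # Block until every background dd has finished.
    wait

    # Capture the kernel log after the run; reading it may require root.
    dmesg > "$dir/dmesg-$n-dds.txt" 2>/dev/null \
        || echo "could not read dmesg (may need root)" >&2
}
```

Something like `run_dds 8 5K 4M .` followed by `run_dds 4 5K 4M .` would give the two result sets to compare. Note that bs=4M with count=5K writes 20 GiB per file, and that this sketch writes its logs into the same directory as the test files, so on a hanging mount it may be safer to point the logs at local disk instead.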
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com