On Wed, Mar 5, 2014 at 8:53 AM, Peter Lieven <p...@kamp.de> wrote:
> On 05.03.2014 16:20, Marcus wrote:
>> I think this is a more generic sysadmin problem. I've seen the same
>> thing in the past with simply snapshotting a logical volume or ZFS
>> zvol and copying it off somewhere: the page cache bloats and the
>> system starts swapping. To avoid it, we wrote a small C program
>> that calls FADV_DONTNEED on a file, and in our backup scripts we
>> fork off a process that calls it on the source file every X seconds.
> I do not call FADV_DONTNEED on the whole file, but only
> on the block that has just been read.
Yes, I suppose that's one of the advantages of having it integrated
into the reader. Rough sketches of both the per-block pattern and the
standalone helper we use are appended at the end of this mail.

>>
>> It's a little strange to me to have qemu-img do this, just like it
>> would be strange if 'cp' did it, but I can see it as a very useful
>> shortcut if it's an optional flag. qemu-img to me is just an admin
>> tool, and the admin should decide if they want their tool's reads
>> cached. Some additional things that come to mind:
>>
>> * If you are running qemu-img on a running VM's source file,
>> FADV_DONTNEED may ruin the cache you wanted if the VM is not
>> running cache=none.
> You would normally not run it on the source directly. In my case
> I run it on a snapshot of a logical volume, but I see your point.

That totally depends on the situation; I just thought it was worth
considering.

>
> So you can confirm my observations and would be happy if
> this behaviour could be toggled with a cmdline switch?

Yes, I've seen the same behavior you mention just with 'cp'. It was
with a version of the CentOS 6.2 kernel, at least, before we added
FADV_DONTNEED to the backup scripts.

>>
>> * O_DIRECT I think will cause unexpected problems, for example the
>> zfsonlinux guys (and tmpfs as mentioned) don't yet support it. If
>> it is used, there has to be a fallback or a way to turn it off.
> I don't use O_DIRECT. It's an option for the destination file only
> at the moment. You can set it with -t none as a qemu-img argument.

I just mentioned it because setting it on the source was suggested
originally and subsequently discussed.

>
> Peter
>>
>> On Wed, Mar 5, 2014 at 7:44 AM, Peter Lieven <p...@kamp.de> wrote:
>>> On 04.03.2014 10:24, Stefan Hajnoczi wrote:
>>>> On Mon, Mar 03, 2014 at 01:20:21PM +0100, Peter Lieven wrote:
>>>>> On 03.03.2014 13:03, Stefan Hajnoczi wrote:
>>>>>> So what is the actual performance problem you are trying to
>>>>>> solve, and what benchmark output are you getting when you
>>>>>> compare with FADV_DONTNEED against without FADV_DONTNEED?
>>>>> I found the performance to be identical. For the problem, see
>>>>> below please.
>>>>>> I think there's a danger that the discussion will go around in
>>>>>> circles. Please post the performance results that kicked off
>>>>>> this whole effort and let's focus on the data. That way it's
>>>>>> much easier to evaluate which changes to QEMU are a win and
>>>>>> which are not necessary.
>>>>> I found that under memory-pressure situations the growing
>>>>> buffers lead to vserver memory being swapped out. This caused
>>>>> trouble, especially in overcommit scenarios (where all memory is
>>>>> backed by swap).
>>>> I think the general idea is that qemu-img should not impact
>>>> running guests, even on a heavily loaded machine. But again, this
>>>> needs to be discussed using concrete benchmarks, with
>>>> configurations and results posted to the list.
>>> Sure, this is why I started to look at this. I found that under
>>> high memory pressure a backup (local storage -> NFS) causes
>>> swapping. I started to use libnfs as the destination to avoid any
>>> influence of the kernel NFS client, but I saw that the buffers
>>> still increase while a backup is running. With the patch I
>>> proposed recently,
>>>
>>> [PATCH] block: introduce BDRV_O_SEQUENTIAL
>>>
>>> I don't see this behaviour, and I have not yet observed a
>>> performance penalty.
>>>
>>> Peter
>>>> Stefan
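
As promised above, here is a minimal sketch of the per-block pattern
Peter describes: read a block, hand it to the consumer, then tell the
kernel those pages won't be needed again. The 1 MiB block size, the
plain read() loop, and the function name are my own choices for
illustration, not his actual code:

/* Per-block cache dropping: only the pages just read are evicted,
 * so the rest of the page cache is left alone. */
#define _XOPEN_SOURCE 600       /* for posix_fadvise() */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE (1024 * 1024)        /* 1 MiB, arbitrary */

int copy_and_drop(int fd)
{
    char *buf = malloc(BLOCK_SIZE);
    off_t off = 0;
    ssize_t n;

    if (!buf)
        return -1;

    while ((n = read(fd, buf, BLOCK_SIZE)) > 0) {
        /* ... hand buf/n to the writer or consumer here ... */

        /* Drop only the range that was just read. */
        posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
        off += n;
    }

    free(buf);
    return n < 0 ? -1 : 0;
}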
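
And for completeness, the small helper I mentioned is essentially the
following. This is a from-memory sketch, not the exact program; we
run something like "while :; do ./dropcache $SRC; sleep 10; done &"
from the backup scripts:

/* Drop all of a file's cached pages; roughly the helper we fork off
 * every X seconds during backups. Reconstructed from memory. */
#define _XOPEN_SOURCE 600       /* for posix_fadvise() */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd, ret;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* offset 0, len 0 means "the whole file" */
    ret = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (ret)
        fprintf(stderr, "posix_fadvise failed: %d\n", ret);

    close(fd);
    return ret ? 1 : 0;
}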