Hi again,
I've been looking around the backup/restore code a bit; I'm focused on
restore acceleration on Ceph RBD right now.
Sorry if I've got something wrong, I have never developed for Proxmox/QEMU before.
In line 563 of
https://git.proxmox.com/?p=pve-qemu-kvm.git;a=blob;f=debian/patches/pve/0011-introduce-new-vma-archive-format.patch;h=1c26209648c210f3b18576abc2c5a23768fd7c7b;hb=HEAD
I see the function restore_write_data; it calls full_write (for
direct-to-file restore) and bdrv_write (which I suppose is QEMU's
block device abstraction).
This is called from restore_extents, where a comment says "try
to write whole clusters to speedup restore". That means we are writing
chunks of 64KB (minus 8 bytes), which gives Ceph RBD a hard time because
it results in lots of ~64KB write requests.
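Just to put rough numbers on it, a quick back-of-the-envelope count of write
requests (the 32GB disk size is only an arbitrary example):

/* rough request-count estimate: one write per 64KB cluster vs. one per
 * 4MB buffer; 32GB disk size picked arbitrarily for illustration */
#include <stdio.h>

int main(void)
{
    long long disk = 32LL * 1024 * 1024 * 1024;               /* 32GB disk */
    printf("64KB writes: %lld\n", disk / (64 * 1024));        /* 524288 requests */
    printf("4MB writes:  %lld\n", disk / (4 * 1024 * 1024));  /* 8192 requests */
    return 0;
}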
So, I suggest the following solution for your consideration:
- Create a write buffer on startup (let's assume 4MB, for example - a
size Ceph RBD will like much better than 64KB). This could even be made
configurable, skipping the buffer altogether if buffer_size == cluster_size.
- Wrap the current "restore_write_data" in a
"restore_write_data_with_buffer" that copies data into the 4MB buffer
and only calls "restore_write_data" when the buffer is full.
- Create a new "flush_restore_write_data_buffer" to flush the write
buffer when reading a device's restore data is complete (rough sketch below).
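Roughly what I have in mind - just an untested sketch. The wrapper and flush
names are the ones I proposed above, but the simplified restore_write_data
signature and the globals here are only illustrative; the real function in
the patch takes the VmaReader, dev_id, BlockDriverState and so on:

#include <stdint.h>
#include <string.h>

#define RESTORE_BUF_SIZE (4 * 1024 * 1024)   /* 4MB, could be made configurable */
#define SECTOR_SIZE 512

/* stand-in for the existing restore_write_data() in the patch, which ends
 * up in full_write()/bdrv_write(); here it just pretends the write worked */
static int restore_write_data(int64_t sector_num, unsigned char *buf,
                              int nb_sectors)
{
    (void)sector_num; (void)buf; (void)nb_sectors;
    return 0;
}

static unsigned char restore_buf[RESTORE_BUF_SIZE];
static size_t restore_buf_used = 0;      /* bytes currently buffered */
static int64_t restore_buf_start = -1;   /* first sector covered by the buffer */

/* flush whatever is buffered with a single large write */
static int flush_restore_write_data_buffer(void)
{
    if (restore_buf_used == 0) {
        return 0;
    }
    int res = restore_write_data(restore_buf_start, restore_buf,
                                 restore_buf_used / SECTOR_SIZE);
    restore_buf_used = 0;
    restore_buf_start = -1;
    return res;
}

/* accumulate cluster writes; only hit the storage when the buffer is full
 * or the new data is not contiguous with what is already buffered */
static int restore_write_data_with_buffer(int64_t sector_num,
                                          unsigned char *buf, size_t len)
{
    int64_t next_sector = restore_buf_start + restore_buf_used / SECTOR_SIZE;

    if (restore_buf_used &&
        (sector_num != next_sector ||
         restore_buf_used + len > RESTORE_BUF_SIZE)) {
        int res = flush_restore_write_data_buffer();
        if (res < 0) {
            return res;
        }
    }

    if (restore_buf_used == 0) {
        restore_buf_start = sector_num;
    }
    memcpy(restore_buf + restore_buf_used, buf, len);  /* len is at most one 64KB cluster */
    restore_buf_used += len;
    return 0;
}

The contiguity check is there because, if I read restore_extents right,
clusters with missing blocks are written block by block, so offsets aren't
always sequential; in that case we flush first instead of merging. The final
flush_restore_write_data_buffer() call would go wherever each device's
restore finishes.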
Do you think this is a good idea? If so, I will find time to implement
and test it to check whether restore time improves.
Thanks a lot
Eneko
On 20/07/16 at 08:24, Eneko Lacunza wrote:
On 16/02/16 at 15:52, Stefan Priebe - Profihost AG wrote:
On 16.02.2016 at 15:50, Dmitry Petuhov wrote:
16.02.2016 13:20, Dietmar Maurer wrote:
The storage backend is Ceph using 2x 10Gbit/s, and I'm able to read
from it at 500-1500MB/s. See below for an example.
The backup process reads 64KB blocks, and it seems this slows down
Ceph.
This is a known behavior, but I have found no solution to speed it up.
I've just written a script to speed up my backups from Ceph. It simply
does (well, actually a little more than):
rbd snap create $SNAP
rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
rbd snap rm $SNAP
for every image in selected pools.
When exporting to a file, it's faster than my temporary HDD can write
(about 120MB/s). But exporting to STDOUT ('-' instead of a filename, with
or without compression) noticeably decreases the speed to qemu's levels
(20-30MB/s). That's a little strange.
This method is incompatible with PVE's backup/restore tools, but it is
good enough for manual disaster recovery from the CLI.
Right - that's working for me too, but only at night, and not when a
single user wants a backup (incl. config) RIGHT now.
Do we have any improvement related to this in the pipeline? Yesterday
our 9-OSD 3-node cluster restored a backup at 6MB/s... it was very
boring, painful and expensive to wait for it :) (I decided to buy a
new server to replace our 7.5-year-old IBM while waiting ;) )
Our backups are slow too, but we do those during the weekend... we
usually want to restore fast, though... :)
Dietmar, I haven't looked at the backup/restore code, but do you
think we could do something to read/write to storage in larger chunks
than the current 64KB? I'm coming out of a high-workload period and
could maybe look at this issue this summer.
Thanks
Eneko
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es
_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel