>IMO we'd better use a backend-storage-optimized approach to access
>remote images from compute nodes instead of using iSCSI only. And from
>my experience, I'm sure iSCSI is short on stability under heavy I/O
>workload in production environments; it can cause either the VM
>filesystem to be marked read-only or a VM kernel panic.

Yes, in this situation the problem lies in the backend storage, so no
other protocol will perform better. However, P2P transferring will
greatly reduce the workload on the backend storage, and thereby
increase responsiveness.

>As I said, currently Nova already has an image caching mechanism, so
>in this case P2P is just an approach that could be used for
>downloading or preheating images for the cache.

Nova's image caching is file-level, while VMThunder's is block-level.
And VMThunder is designed to work in conjunction with Cinder, not
Glance. VMThunder currently uses Facebook's flashcache to realize
caching; dm-cache and bcache are also options for the future.

>I think P2P transferring/pre-caching sounds a good way to go, as I
>mentioned as well, but actually for this area I'd like to see
>something like zero-copy + CoR. On one hand we can leverage the
>capability of downloading image bits on demand via the zero-copy
>approach; on the other hand we can avoid reading data from the remote
>image every time by CoR.

Yes, on-demand transferring is what you mean by "zero-copy", and
caching is something close to CoR. In fact, we are working on a kernel
module called foolcache that realizes a true CoR. See
https://github.com/lihuiba/dm-foolcache.
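To make the CoR idea concrete, here is a minimal user-space sketch in
Python of what a block-level copy-on-read cache does: fetch each block
from the remote image on first access and persist it locally, so
repeated reads never touch the remote again. (dm-foolcache itself is a
kernel device-mapper target; the block size and the remote object's
read_block() interface below are illustrative assumptions, not its
actual API.)

    BLOCK = 4096  # illustrative block size, not dm-foolcache's actual one

    class CopyOnReadCache:
        """Serve reads from a local cache file, copying each block
        from the remote image only on first access (copy-on-read)."""

        def __init__(self, remote, cache_path, size):
            self.remote = remote   # assumed to offer read_block(offset, length)
            self.cached = set()    # block offsets already copied locally
            self.cache = open(cache_path, "w+b")
            self.cache.truncate(size)

        def read(self, offset, length):
            out, end = b"", offset + length
            while offset < end:
                base = offset - offset % BLOCK
                if base not in self.cached:
                    # First touch: pull the whole block from the remote
                    # image and persist it (the "copy" in copy-on-read).
                    block = self.remote.read_block(base, BLOCK)
                    self.cache.seek(base)
                    self.cache.write(block)
                    self.cached.add(base)
                n = min(end, base + BLOCK) - offset
                self.cache.seek(offset)
                out += self.cache.read(n)
                offset += n
            return out

Once every block has been touched, the remote image is never read
again; that is the property which distinguishes a true CoR from a pure
on-demand mapping.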
National Key Laboratory for Parallel and Distributed Processing,
College of Computer Science, National University of Defense Technology,
Changsha, Hunan Province, P.R. China 410073


At 2014-04-17 17:11:48, "Zhi Yan Liu" <lzy....@gmail.com> wrote:
>On Thu, Apr 17, 2014 at 4:41 PM, lihuiba <magazine.lihu...@163.com> wrote:
>>>IMHO, zero-copy approach is better
>> VMThunder's "on-demand transferring" is the same thing as your
>> "zero-copy approach". VMThunder uses iSCSI as the transferring
>> protocol, which is option #b of yours.
>>
>
>IMO we'd better use a backend-storage-optimized approach to access
>remote images from compute nodes instead of using iSCSI only. And from
>my experience, I'm sure iSCSI is short on stability under heavy I/O
>workload in production environments; it can cause either the VM
>filesystem to be marked read-only or a VM kernel panic.
>
>>>Under #b approach, my former experience from our previous similar
>>>Cloud deployment (not OpenStack) was that: under 2 PC server storage
>>>nodes (general *local SAS disk*, without any storage backend) +
>>>2-way/multi-path iSCSI + 1G network bandwidth, we could provision 500
>>>VMs in a minute.
>> Suppose booting one instance requires reading 300 MB of data; 500
>> instances then require 150 GB. Each of the two storage servers needs
>> to send data at a rate of 150 GB / 2 / 60 s = 1.25 GB/s on average.
>> This is a heavy burden even for high-end storage appliances. In
>> production systems, such a request (booting 500 VMs in one shot)
>> would significantly disturb other running instances accessing the
>> same storage nodes.
>>
>> VMThunder eliminates this problem by P2P transferring and
>> on-compute-node caching. Even a PC server with one 1 Gb NIC (this is
>> a true PC server!) can boot 500 VMs in a minute with ease. For the
>> first time, VMThunder makes bulk provisioning of VMs practical for
>> production cloud systems. This is the essential value of VMThunder.
>>
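(A quick back-of-the-envelope check of the figures above, in Python;
the 300 MB per boot is the assumption used in this thread, not a
measured constant:

    vms, image_mb, servers, seconds = 500, 300, 2, 60
    total_gb = vms * image_mb / 1000.0    # 150 GB in total (decimal GB)
    rate = total_gb / servers / seconds   # 1.25 GB/s per server
    print("%.0f GB total, %.2f GB/s per server" % (total_gb, rate))

For comparison, a fully saturated 10 GbE port delivers at most about
1.25 GB/s, so each storage node would need a dedicated 10 GbE link for
this burst alone, before disk throughput is even considered.)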
>As I said, currently Nova already has an image caching mechanism, so
>in this case P2P is just an approach that could be used for
>downloading or preheating images for the cache.
>
>I think P2P transferring/pre-caching sounds a good way to go, as I
>mentioned as well, but actually for this area I'd like to see
>something like zero-copy + CoR. On one hand we can leverage the
>capability of downloading image bits on demand via the zero-copy
>approach; on the other hand we can avoid reading data from the remote
>image every time by CoR.
>
>zhiyan
>
>> ===================================================
>> From: Zhi Yan Liu <lzy....@gmail.com>
>> Date: 2014-04-17 0:02 GMT+08:00
>> Subject: Re: [openstack-dev] [Nova][blueprint] Accelerate the booting
>> process of a number of vms via VMThunder
>> To: "OpenStack Development Mailing List (not for usage questions)"
>> <openstack-dev@lists.openstack.org>
>>
>> Hello Yongquan Fu,
>>
>> My thoughts:
>>
>> 1. Currently Nova already supports an image caching mechanism. It
>> caches the image on the compute host that a VM was provisioned from,
>> so the next provisioning (booting the same image) doesn't need to
>> transfer it again, as long as the cache manager hasn't cleared it.
>> 2. P2P transferring and prefetching are still based on a copy
>> mechanism; IMHO, a zero-copy approach is better, and even
>> transferring/prefetching could be optimized by such an approach. (I
>> have not checked the "on-demand transferring" of VMThunder, but it is
>> a kind of transferring as well, at least judging from its literal
>> meaning.)
>> And btw, IMO, we have two ways to follow the zero-copy idea:
>> a. When Nova and Glance use the same backend storage, we could use a
>> storage-specific CoW/snapshot approach to prepare the VM disk instead
>> of copying/transferring image bits (through HTTP/network or local
>> copy).
>> b. Without "unified" storage, we could attach a volume/LUN to the
>> compute node from the backend storage as a base image, then do the
>> CoW/snapshot on it to prepare the root/ephemeral disk of the VM. This
>> is just like boot-from-volume, but the difference is that we do the
>> CoW/snapshot on the Nova side instead of the Cinder/storage side.
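(To illustrate option #b: once the base LUN is attached to the compute
node, a local CoW overlay can be laid on top of it, e.g. a qcow2 file
whose backing device is the LUN. A minimal sketch in Python; the device
path and helper name are placeholders, not Nova's actual image-handler
code:

    import subprocess

    def make_cow_disk(base_lun, overlay_path):
        # Create a qcow2 overlay backed by the attached LUN: writes go
        # to the local overlay, reads of untouched blocks hit the LUN.
        subprocess.check_call([
            "qemu-img", "create", "-f", "qcow2",
            "-o", "backing_file=%s,backing_fmt=raw" % base_lun,
            overlay_path,
        ])

    # e.g. make_cow_disk("/dev/mapper/base-image-lun",
    #                    "/var/lib/nova/instances/<uuid>/disk")

The overlay is created in milliseconds regardless of image size, which
is what makes the approach "zero-copy": no image bits move until the
guest actually reads them.)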
>>
>> For option #a, we have already got some progress:
>> https://blueprints.launchpad.net/nova/+spec/image-multiple-location
>> https://blueprints.launchpad.net/nova/+spec/rbd-clone-image-handler
>> https://blueprints.launchpad.net/nova/+spec/vmware-clone-image-handler
>>
>> Under the #b approach, my former experience from our previous similar
>> Cloud deployment (not OpenStack) was that: under 2 PC server storage
>> nodes (general *local SAS disk*, without any storage backend) +
>> 2-way/multi-path iSCSI + 1G network bandwidth, we could provision 500
>> VMs in a minute.
>>
>> As for the VMThunder topic, I think it sounds like a good idea; IMO
>> P2P and prefetching are valuable optimizations for image
>> transferring.
>>
>> zhiyan
>>
>> On Wed, Apr 16, 2014 at 9:14 PM, yongquan Fu <quanyo...@gmail.com> wrote:
>>>
>>> Dear all,
>>>
>>> We would like to present an extension to the VM-booting
>>> functionality of Nova for the case when a number of homogeneous VMs
>>> need to be launched at the same time.
>>>
>>> The motivation for our work is to increase the speed of provisioning
>>> VMs for large-scale scientific computing and big data processing. In
>>> such cases, we often need to boot tens or hundreds of virtual
>>> machine instances at the same time.
>>>
>>> Currently, under OpenStack, we found that creating a large number of
>>> virtual machine instances is very time-consuming. The reason is that
>>> the booting procedure is a centralized operation that involves
>>> performance bottlenecks. Before a virtual machine can actually be
>>> started, OpenStack either copies the image file (Swift) or attaches
>>> the image volume (Cinder) from the storage server to the compute
>>> node via the network. Booting a single VM needs to read a large
>>> amount of image data from the image storage server, so creating a
>>> large number of virtual machine instances causes a significant
>>> workload on those servers. The servers become quite busy, even
>>> unavailable, during the deployment phase, and it takes a very long
>>> time before the whole virtual machine cluster is usable.
>>>
>>> Our extension is based on our work on VMThunder, a novel mechanism
>>> for accelerating the deployment of large numbers of virtual machine
>>> instances. It is written in Python and can be integrated with
>>> OpenStack easily. VMThunder addresses the problem described above
>>> with the following improvements: on-demand transferring (network
>>> attached storage), compute-node caching, P2P transferring, and
>>> prefetching. VMThunder is a scalable and cost-effective accelerator
>>> for bulk provisioning of virtual machines.
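(To see why P2P transferring relieves the storage servers, consider a
simple model in which every node that already holds the image can
upload one full copy per round, so the number of holders doubles each
round. This is purely illustrative; VMThunder's actual transfer
scheduling is more elaborate:

    import math

    def rounds_to_seed(nodes, seeds=1):
        # Rounds until every node holds the image, if each holder
        # serves one new node per round (holders double every round).
        r = 0
        while seeds < nodes:
            seeds *= 2
            r += 1
        return r

    print(rounds_to_seed(500))        # -> 9
    print(math.ceil(math.log2(500)))  # -> 9, the closed form

Under this model the origin server uploads at most one copy per round,
nine in total for 500 compute nodes, instead of streaming all 500
copies itself.)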
>>>
>>> We hope to receive your feedback. Any comments are extremely
>>> welcome. Thanks in advance.
>>>
>>> PS:
>>>
>>> VMThunder enhanced Nova blueprint:
>>> https://blueprints.launchpad.net/nova/+spec/thunderboost
>>> VMThunder standalone project: https://launchpad.net/vmthunder
>>> VMThunder prototype: https://github.com/lihuiba/VMThunder
>>> VMThunder etherpad: https://etherpad.openstack.org/p/vmThunder
>>> VMThunder portal: http://www.vmthunder.org/
>>> VMThunder paper:
>>> http://www.computer.org/csdl/trans/td/preprint/06719385.pdf
>>>
>>> Regards,
>>>
>>> vmThunder development group
>>> PDL
>>> National University of Defense Technology
>>
>> --
>> Yongquan Fu
>> PhD, Assistant Professor,
>> National Key Laboratory for Parallel and Distributed Processing,
>> College of Computer Science, National University of Defense
>> Technology, Changsha, Hunan Province, P.R. China 410073

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev