Anthony,

This is one of the reasons that I'm working on VM snapshot on PS (instead of volume snapshot).
I don't think it's easy to improve vdi-copy, considering it needs to coalesce incremental snapshots and verify the result.

mice

-----Original Message-----
From: Anthony Xu [mailto:xuefei...@citrix.com]
Sent: 2012-12-4 (Tuesday) 3:08
To: cloudstack-dev@incubator.apache.org
Subject: RE: XenServer & VM Snapshots

You are right, vdi-copy is slow. We have reported this to the XenServer team and they are working on it, but no timeline or roadmap has been provided so far.

Anthony

> -----Original Message-----
> From: Mice Xia [mailto:mice_...@tcloudcomputing.com]
> Sent: Monday, December 03, 2012 11:05 AM
> To: cloudstack-dev@incubator.apache.org
> Subject: Re: XenServer & VM Snapshots
>
> Taking a volume snapshot is slow if your volume is huge. The reason is
> vdi-copy, which is used to back up the snapshot to SS and has a
> performance problem.
>
> You can't speed it up much for a full snapshot. Perhaps you can try
> increasing dom0 memory, or adjust the ratio between full and incremental
> snapshots to reduce the number of full snapshots.
>
> Mice
>
> -----Original Message-----
> From: Matthew Hartmann [mailto:mhartm...@tls.net]
> Sent: 2012-12-4 (Tuesday) 2:31
> To: cloudstack-us...@incubator.apache.org
> Cc: 'Cloudstack Developers'
> Subject: RE: XenServer & VM Snapshots
>
> Anthony:
>
> Thank you for the prompt and informative reply.
>
> > I'm pretty sure mount and copy are using the same XenServer host.
>
> The behavior I have witnessed with CS 3.0.2 is that it doesn't always do
> the mount & copy on the same host. Out of the 12 tests I've performed,
> only once was the mount & copy performed on the same host that the VM was
> running on.
>
> > I think the issue is the backup takes a long time because the data
> > volume is big and network rate is low.
> > You can increase "BackupSnapshotWait" in global configuration table to
> > let the backup operation finish.
>
> I increased this in global settings from the default of 9 hours to 16
> hours.
> The snapshot still doesn't complete on time; on average it copies about
> 460G before it times out. I'm pretty confident the network rate isn't the
> bottleneck, as ISOs and imported VHDs install quickly. We have the
> Secondary Storage server set as the only internal site allowed to host
> files. I upload my ISO or VHD to the Secondary Storage server and install
> using the SSVM, which completes in a very timely manner. With a 1Gb
> network link, 1TB should copy in roughly 2 hours (if the link is saturated
> by the copy process); I've only found snapshotting (template creation
> appears to work flawlessly) to take an insanely long time to complete.
>
> Is there anything else I can do to increase performance, or logs I should
> check?
>
> Cheers,
>
> Matthew
>
> Matthew Hartmann
> Systems Administrator | V: 812.378.4100 x 850 | E: mhartm...@tls.net
>
> TLS.NET, Inc.
> http://www.tls.net
>
> -----Original Message-----
> From: Anthony Xu [mailto:xuefei...@citrix.com]
> Sent: Monday, December 03, 2012 12:31 PM
> To: Cloudstack Users
> Cc: Cloudstack Developers
> Subject: RE: XenServer & VM Snapshots
>
> Hi Matthew,
>
> Your analysis is correct except for the following:
>
> > I must mention that the same Compute Node that ran sparse_dd or mounted
> > Secondary Storage is not always the same. It appears the Management
> > Server is simply round-robining through the list of Compute Nodes and
> > using the first one that is available.
>
> I'm pretty sure mount and copy are using the same XenServer host.
>
> I think the issue is that the backup takes a long time because the data
> volume is big and the network rate is low.
> You can increase "BackupSnapshotWait" in the global configuration table to
> let the backup operation finish.
>
> Since CS takes advantage of the XenServer image format VHD, using VHD to
> do snapshot and clone, it requires the snapshot to be backed up through a
> XenServer host.
> The ideal solution for this issue might be to leverage storage snapshot
> and clone functionality. Snapshot backup would then be executed by the
> storage host, relieving some of the limitations.
> Currently CS doesn't support this. It should not be hard to support after
> Edison finishes the storage frame change; it should be just another
> storage plug-in.
> When CS uses storage server snapshot and clone functions, CS needs to
> consider the storage server's limits on the number of snapshots and the
> number of volumes.
>
> Anthony
>
> From: Matthew Hartmann [mailto:mhartm...@tls.net]
> Sent: Monday, December 03, 2012 9:08 AM
> To: Cloudstack Users
> Cc: Cloudstack Developers
> Subject: XenServer & VM Snapshots
>
> Hello! I'm hoping someone can help me troubleshoot the following issue:
>
> I have a client who has a 960G data volume which contains their VM's
> Exchange Data Store. When starting a snapshot, I found that a process is
> started on one of my Compute Nodes titled "sparse_dd". I found that this
> process is then sending the output of "sparse_dd" through another Compute
> Node's xapi before placing it into the "snapshot store" on Secondary
> Storage. It appears that this is part of the bottleneck, as all of our
> systems are connected via gigabit link and it should not take 15+ hours to
> create a snapshot. The following is the behavior that I have analyzed from
> within my environment:
>
> 1) Snapshot is started (either via Manual or Scheduled).
>
> 2) Compute Node 1 "processes the snapshot" by exposing the VDI, from
>    which "sparse_dd" then creates a "thin provisioned" snapshot.
>
> 3) The output of sparse_dd is delivered over HTTP to xapi on Compute
>    Node 2, where the Management Server mounted Secondary Storage.
>
> 4) Compute Node 2 (receiving the snapshot via xapi) stores the snapshot
>    in the Secondary Storage mount point.
>
> Based on the behavior, I have devised the following logic that I believe
> CloudStack is utilizing:
>
> 1) CloudStack creates a "snapshot VDI" via the XenServer Pool Master's
>    API.
>
> 2) CloudStack finds a Compute Node that can mount Secondary Storage.
>
> 3) CloudStack finds a Compute Node that can run "sparse_dd".
>
> 4) CloudStack uses an available Compute Node to output the VDI to xapi on
>    the Compute Node that mounted Secondary Storage.
>
> I must mention that the same Compute Node that ran sparse_dd or mounted
> Secondary Storage is not always the same. It appears the Management Server
> is simply round-robining through the list of Compute Nodes and using the
> first one that is available.
>
> Does anyone have any input on the issue I'm having or analysis of how
> CloudStack/XenServer snapshots operate?
>
> Thanks!
>
> Cheers,
>
> Matthew
>
> Matthew Hartmann
> Systems Administrator | V: 812.378.4100 x 850 | E: mhartm...@tls.net
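
As a rough check on the figures quoted in this thread (1TB over a 1Gb link in roughly 2 hours, versus about 460G copied before the timeout), the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: it assumes decimal units, ignores protocol overhead, and assumes the 460G figure was reached at the 16-hour "BackupSnapshotWait" setting, which the thread implies but does not state outright. The function names are made up for illustration.

```python
# Back-of-the-envelope transfer estimates for the numbers quoted in the thread.
# Assumptions (not from the thread): decimal units (1 TB = 10**12 bytes,
# 1 Gb/s = 10**9 bits/s) and no protocol overhead.

GIGABIT_LINK = 1e9  # bits per second


def transfer_hours(size_bytes: float, link_bits_per_s: float) -> float:
    """Hours needed to move size_bytes over a fully saturated link."""
    return size_bytes * 8 / link_bits_per_s / 3600


def effective_mb_per_s(copied_bytes: float, elapsed_hours: float) -> float:
    """Average throughput in MB/s for copied_bytes moved in elapsed_hours."""
    return copied_bytes / (elapsed_hours * 3600) / 1e6


# Ideal case: 1 TB and the 960G volume over a saturated gigabit link.
ideal_1tb_hours = transfer_hours(1e12, GIGABIT_LINK)
ideal_960g_hours = transfer_hours(960e9, GIGABIT_LINK)

# Observed case: about 460G copied, assuming the timeout hit at 16 hours.
observed_mb_per_s = effective_mb_per_s(460e9, 16)

# Link capacity expressed in the same unit for comparison.
link_mb_per_s = GIGABIT_LINK / 8 / 1e6

print(f"1 TB at line rate:   {ideal_1tb_hours:.2f} h")
print(f"960G at line rate:   {ideal_960g_hours:.2f} h")
print(f"Observed throughput: {observed_mb_per_s:.1f} MB/s "
      f"of {link_mb_per_s:.0f} MB/s "
      f"({observed_mb_per_s / link_mb_per_s:.0%} of the link)")
```

Under these assumptions the snapshot backup runs at a small fraction of what the link could carry, which is consistent with the view in the thread that the copy path, not the network, is the limiting factor.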