If I may, we've observed very poor performance when executing snapshots.

We suspect it's due to XenServer's API. I don't know exactly how or why, but
the API is very slow and appears to run one task at a time (if it does any
parallelization at all, the effect is negligible).

Do you know if there's a way to improve I/O rates on the XenServer side?

thx.



On Mon, Dec 3, 2012 at 8:07 PM, Matthew Hartmann <mhartm...@tls.net> wrote:

> Thank you Anthony! :)
>
> Cheers,
>
> Matthew
>
>
>
> Matthew Hartmann
> Systems Administrator | V: 812.378.4100 x 850 | E: mhartm...@tls.net
>
> TLS.NET, Inc.
> http://www.tls.net
>
>
> -----Original Message-----
> From: Anthony Xu [mailto:xuefei...@citrix.com]
> Sent: Monday, December 03, 2012 1:59 PM
> To: 'Cloudstack Developers'; cloudstack-us...@incubator.apache.org
> Subject: RE: XenServer & VM Snapshots
>
> CS 3.0.2 is a very old version.
>
> I'm pretty sure mount & copy happen on the same host in 3.0.4 and 3.0.5.
> If mount & copy can land on different hosts, the issue is very likely to
> occur.
> I haven't heard of this issue from QA or users.
>
> I just checked the vmopsSnapshot plug-in for XenServer, at /etc/xapi.d/plugins,
> which mounts secondary storage just before running sparse-dd.
>
> I recommend you upgrade to a newer version.
>
> If you still see the issue,
> please post the related management server log and /var/log/SMlog from the
> XenServer host.
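For reference, the mount-then-copy sequence described above (secondary storage mounted just before sparse-dd runs) can be sketched roughly as follows. This only builds the command lines; the NFS export, mount point, device path, and sparse_dd flags are illustrative assumptions, not the actual vmopsSnapshot plug-in code:

```python
# Illustrative sketch of the mount-then-copy sequence: mount secondary
# storage over NFS, then stream the VDI to it with sparse_dd. All paths
# and flags below are assumptions for illustration only.

def build_backup_commands(nfs_server, nfs_export, mount_point,
                          vdi_device, vhd_name):
    """Return the (mount, copy) command lines as argument lists."""
    mount_cmd = ["mount", "-t", "nfs",
                 "%s:%s" % (nfs_server, nfs_export), mount_point]
    copy_cmd = ["sparse_dd",
                "-src", vdi_device,
                "-dest", "%s/%s" % (mount_point, vhd_name)]
    return mount_cmd, copy_cmd

mount_cmd, copy_cmd = build_backup_commands(
    "10.0.0.5", "/export/secondary", "/var/run/sr-mount/backup",
    "/dev/sm/backend/vdi-1", "snapshot-1.vhd")
print(" ".join(mount_cmd))
print(" ".join(copy_cmd))
```

The point of the ordering is that the copy destination only exists once the mount succeeds, which is why a mount on a *different* host than the one running sparse_dd would force the extra network hop discussed in this thread.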
>
>
> Anthony
>
>
> > -----Original Message-----
> > From: Matthew Hartmann [mailto:mhartm...@tls.net]
> > Sent: Monday, December 03, 2012 10:31 AM
> > To: cloudstack-us...@incubator.apache.org
> > Cc: 'Cloudstack Developers'
> > Subject: RE: XenServer & VM Snapshots
> >
> > Anthony:
> >
> > Thank you for the prompt and informative reply.
> >
> > > I'm pretty sure mount and copy are using the same XenServer host.
> >
> > The behavior I have witnessed with CS 3.0.2 is that it doesn't always
> > do the mount & copy on the same host. Out of the 12 tests I've
> > performed, only once was the mount & copy performed on the same host
> > that the VM was running on.
> >
> > > I think the issue is the backup takes a long time because the data
> > > volume is big and network rate is low.
> > > You can increase "BackupSnapshotWait" in global configuration table
> > > to let the backup operation finish.
> >
> > I increased this in global settings from the default of 9 hours to 16
> > hours.
> > The snapshot still doesn't complete in time; on average it copies about
> > 460 GB before it times out. I'm pretty confident the network rate isn't
> > the bottleneck, as ISOs and imported VHDs install quickly. We have the
> > Secondary Storage server set as the only internal site allowed to host
> > files. I upload my ISO or VHD to the Secondary Storage server and
> > install using the SSVM, which completes in a very timely manner. With a
> > 1 Gb network link, 1 TB should copy in roughly 2 hours (if the link is
> > saturated by the copy process); I've only found snapshotting (template
> > creation appears to work flawlessly) to take an insanely long time to
> > complete.
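The arithmetic behind the "roughly 2 hours" estimate, and the effective rate implied by ~460 GB in 16 hours, checks out as follows (using round decimal units; the sizes and link speed are the figures reported above):

```python
# Back-of-the-envelope check of the transfer-time claims above,
# using round decimal units (1 GB = 1e9 bytes, 1 Gb/s = 1e9 bits/s).

LINK_BPS = 1e9 / 8          # 1 Gb/s link in bytes per second (125 MB/s)

# Time to move 1 TB at full line rate.
full_rate_hours = 1e12 / LINK_BPS / 3600
print("1 TB at line rate: %.1f hours" % full_rate_hours)   # ~2.2 hours

# Effective rate actually observed: ~460 GB before the 16-hour timeout.
effective_mbps = 460e9 / (16 * 3600) / 1e6
print("observed rate: ~%.0f MB/s" % effective_mbps)        # ~8 MB/s
```

An effective ~8 MB/s against a ~125 MB/s line rate is more than an order of magnitude below saturation, which supports the claim that the network itself is not the bottleneck.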
> >
> > Is there anything else I can do to increase performance or logs I
> > should
> > check?
> >
> > Cheers,
> >
> > Matthew
> >
> >
> > Matthew Hartmann
> > Systems Administrator | V: 812.378.4100 x 850 | E: mhartm...@tls.net
> >
> > TLS.NET, Inc.
> > http://www.tls.net
> >
> >
> > -----Original Message-----
> > From: Anthony Xu [mailto:xuefei...@citrix.com]
> > Sent: Monday, December 03, 2012 12:31 PM
> > To: Cloudstack Users
> > Cc: Cloudstack Developers
> > Subject: RE: XenServer & VM Snapshots
> >
> > Hi Matthew,
> >
> > Your analysis is correct except for the following:
> >
> > > I must mention that the Compute Node that ran sparse_dd and the one
> > > that mounted Secondary Storage are not always the same. It appears
> > > the Management Server is simply round-robining through the list of
> > > Compute Nodes and using the first one that is available.
> >
> > I'm pretty sure mount and copy are using the same XenServer host.
> >
> > I think the issue is that the backup takes a long time because the data
> > volume is big and the network rate is low.
> > You can increase "BackupSnapshotWait" in the global configuration table
> > to let the backup operation finish.
> >
> >
> > Since CS takes advantage of the XenServer image format VHD and uses VHD
> > to do snapshot and clone, it requires snapshots to be backed up through
> > a XenServer host.
> > The ideal solution for this issue might be to leverage the storage
> > array's own snapshot and clone functionality; the snapshot backup would
> > then be executed by the storage host, relieving some of this limitation.
> > Currently CS doesn't support this, but it should not be hard to support
> > after Edison finishes the storage framework change; it should be just
> > another storage plug-in.
> > When CS uses the storage server's snapshot and clone functions, CS
> > needs to consider the storage server's limits on the number of
> > snapshots and number of volumes.
> >
> >
> > Anthony
> >
> >
> > From: Matthew Hartmann [mailto:mhartm...@tls.net]
> > Sent: Monday, December 03, 2012 9:08 AM
> > To: Cloudstack Users
> > Cc: Cloudstack Developers
> > Subject: XenServer & VM Snapshots
> >
> > Hello! I'm hoping someone can help me troubleshoot the following issue:
> >
> > I have a client with a 960 GB data volume which contains their VM's
> > Exchange Data Store. When a snapshot starts, a process titled
> > "sparse_dd" is launched on one of my Compute Nodes. This process then
> > sends the output of "sparse_dd" through another Compute Node's xapi
> > before placing it into the "snapshot store" on Secondary Storage. This
> > appears to be part of the bottleneck, as all of our systems are
> > connected via gigabit links and should not take 15+ hours to create a
> > snapshot. The following is the behavior that I have observed in my
> > environment:
> >
> >
> > 1)     Snapshot is started (either manually or on a schedule).
> >
> > 2)     Compute Node 1 "processes the snapshot" by exposing the VDI,
> > from which "sparse_dd" then creates a "thin provisioned" snapshot.
> >
> > 3)     The output of sparse_dd is delivered over HTTP to xapi on
> > Compute Node 2, where the Management Server mounted Secondary Storage.
> >
> > 4)     Compute Node 2 (receiving the snapshot via xapi) stores the
> > snapshot in the Secondary Storage mount point.
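The two-hop path in the steps above means the end-to-end rate is bounded by the slowest stage. A toy model makes this concrete (the per-hop rates are made-up numbers purely for illustration, not measurements from this environment):

```python
# Toy model of the observed copy path: a streaming copy through several
# serial hops runs at the rate of its slowest stage. The per-hop rates
# below are made-up numbers for illustration, not measurements.

def end_to_end_rate(hop_rates_mbps):
    """A streaming pipeline is throttled to its slowest hop."""
    return min(hop_rates_mbps)

path = {
    "sparse_dd read on Compute Node 1": 120.0,
    "HTTP to xapi on Compute Node 2":    90.0,
    "NFS write to Secondary Storage":    10.0,   # one slow stage dominates
}
rate = end_to_end_rate(path.values())
hours_for_960_gb = 960e3 / rate / 3600   # 960 GB expressed in MB
print("effective rate: %.0f MB/s, 960 GB in %.1f hours" % (rate, hours_for_960_gb))
```

With any single stage limited to ~10 MB/s, a 960 GB volume takes over a day, which is the right order of magnitude for the 15+ hour snapshots described here; identifying which hop is the slow one is the diagnostic question.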
> >
> > Based on this behavior, I have devised the following logic that I
> > believe CloudStack is using:
> >
> >
> > 1)     CloudStack creates a "snapshot VDI" via the XenServer Pool
> > Master's API.
> >
> > 2)     CloudStack finds a Compute Node that can mount Secondary Storage.
> >
> > 3)     CloudStack finds a Compute Node that can run "sparse_dd".
> >
> > 4)     CloudStack uses the available Compute Node to output the VDI to
> > xapi on the Compute Node that mounted Secondary Storage.
> >
> > I must mention that the Compute Node that runs sparse_dd and the one
> > that mounts Secondary Storage are not always the same. It appears the
> > Management Server is simply round-robining through the list of Compute
> > Nodes and using the first one that is available.
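If the Management Server really does pick hosts round-robin (an inference from the observations above, not confirmed behavior), the selection would behave roughly like this. The host names and the availability check are illustrative assumptions:

```python
# Sketch of the inferred round-robin host selection. That CloudStack
# actually does this is the poster's hypothesis; host names and the
# availability check here are illustrative assumptions.
from itertools import cycle

def pick_available_host(rotation, is_available, max_tries):
    """Return the first available host from a round-robin rotation."""
    for _ in range(max_tries):
        host = next(rotation)
        if is_available(host):
            return host
    return None

hosts = ["compute-1", "compute-2", "compute-3"]
rotation = cycle(hosts)

# With compute-2 temporarily busy, successive picks skip over it.
busy = {"compute-2"}
picks = [pick_available_host(rotation, lambda h: h not in busy, len(hosts))
         for _ in range(3)]
print(picks)   # ['compute-1', 'compute-3', 'compute-1']
```

Note that two *independent* picks of this kind (one for the mount, one for sparse_dd) would naturally land on different hosts most of the time, matching the 1-in-12 same-host result reported earlier in the thread.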
> >
> > Does anyone have any input on the issue I'm having or analysis of how
> > CloudStack/XenServer snapshots operate?
> >
> > Thanks!
> >
> > Cheers,
> >
> > Matthew
> >
> >
> >
> > Matthew Hartmann
> > Systems Administrator | V: 812.378.4100 x 850 | E: mhartm...@tls.net
> >
>
>
>
