Re: [DISCUSS] Split Marvin to its own repository

2016-07-02 Thread Wido den Hollander

> On 28 June 2016 at 13:40, Rohit Yadav wrote:
> 
> 
> All,
> 
> 
> I've made a few changes which allow for a standalone Marvin. We currently have 
> Marvin in the CloudStack repository because of the 'cloudstackAPI' code 
> generation that is done during build time. Marvin allows 'cloudstackAPI' to 
> be generated at runtime if a URL end-point is provided.
> 
> 
> Some of the known build/test environments such as Travis, Trillian (upcoming) 
> [1], bubble [2] etc. have Marvin dependency that is tied to the 
> repository/branch. By splitting Marvin apart, build/test environments such as 
> Trillian can simply get Marvin, generate cloudstackAPI against a running mgmt 
> server and then run Marvin-based tests. This is useful as Trillian, bubble 
> etc. perform tests on CloudStack packages (rpm/deb) instead of the maven build.
> 
> 
> The only con of this approach is that the Marvin library (if split) will need to 
> be backward compatible wrt tests and CloudStack versions, and looking at the 
> core Marvin library history it seems Marvin from the latest master branch is 
> backward compatible to at least 4.5.2 (tested today). (Note: integration 
> tests may still be tied to branch/version).
> 
> 
> As an experiment, in the following PR the marvin library is not installed 
> from the maven build but rather after CloudStack mgmt server runs, we use the 
> Marvin codegenerator to generate `cloudstackAPI` from the API end-point 
> (unauthenticated, on port 8096): 
> https://github.com/apache/cloudstack/pull/1599
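> 
> For illustration, a minimal Java sketch of the first step such runtime 
> generation depends on: fetching the API metadata via the real listApis 
> command from the unauthenticated integration endpoint (the localhost host 
> name is an assumption of this example; Marvin itself does this in Python):
> 
>     import java.io.BufferedReader;
>     import java.io.InputStreamReader;
>     import java.net.HttpURLConnection;
>     import java.net.URL;
> 
>     public class ListApisProbe {
>         public static void main(String[] args) throws Exception {
>             // Assumed: mgmt server with the unauthenticated integration
>             // API enabled on port 8096 (integration.api.port).
>             URL url = new URL("http://localhost:8096/client/api"
>                     + "?command=listApis&response=json");
>             HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>             try (BufferedReader in = new BufferedReader(
>                     new InputStreamReader(conn.getInputStream()))) {
>                 String line;
>                 while ((line = in.readLine()) != null) {
>                     System.out.println(line); // JSON describing every API
>                 }
>             }
>         }
>     }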
> 
> 
> As an example, here is the marvin library, with git history, usable as a 
> separate repository: github.com/rhtyd/marvin
> 
> 
> Thoughts, comments?
> 

I would say it's a good thing. This way you can also have PRs for Marvin go 
through a different review process than the primary CloudStack code.

I'd say +1 for splitting it into its own repo.

Wido

> 
> Regards.
> 
> 
> [1] https://github.com/shapeblue/Trillian
> 
> [2] https://github.com/MissionCriticalCloud/bubble-toolkit
> 
> 
> rohit.ya...@shapeblue.com 
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue


Re: Ceph RBD related host agent segfault

2016-07-02 Thread Wido den Hollander

> On 30 June 2016 at 18:29, Aaron Hurt wrote:
> 
> 
> While preparing to roll out a new platform built on 4.8 with a Ceph storage 
> backend, we've been encountering segfaults that appear to be related to 
> snapshot operations via rados-java (librbd) on the host agent. We've been 
> able to isolate this to two possible places in the code:
> 
> lines ~866-875 in 
> plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java
> 
> for (RbdSnapInfo snap : snaps) {
>     if (image.snapIsProtected(snap.name)) {
>         s_logger.debug("Unprotecting snapshot " + pool.getSourceDir()
>                 + "/" + uuid + "@" + snap.name);
>         image.snapUnprotect(snap.name);
>     } else {
>         s_logger.debug("Snapshot " + pool.getSourceDir() + "/" + uuid
>                 + "@" + snap.name + " is not protected.");
>     }
>     s_logger.debug("Removing snapshot " + pool.getSourceDir()
>             + "/" + uuid + "@" + snap.name);
>     image.snapRemove(snap.name);
> }
> 
> Should we be checking if the unprotect actually failed/succeeded before 
> attempting to remove the snapshot?
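> 
> One possible guard (a sketch only, not a tested patch) is to let a failed 
> unprotect skip the removal of that snapshot. It assumes the surrounding 
> LibvirtStorageAdaptor context (image, pool, uuid, s_logger) and rados-java's 
> com.ceph.rbd.RbdException:
> 
>     for (RbdSnapInfo snap : snaps) {
>         try {
>             if (image.snapIsProtected(snap.name)) {
>                 s_logger.debug("Unprotecting snapshot " + pool.getSourceDir()
>                         + "/" + uuid + "@" + snap.name);
>                 image.snapUnprotect(snap.name);
>             }
>         } catch (RbdException e) {
>             // Could not verify or clear protection (e.g. the snapshot
>             // still has child images); skip removal instead of pressing on.
>             s_logger.warn("Failed to unprotect snapshot " + snap.name
>                     + ", skipping removal: " + e.getMessage());
>             continue;
>         }
>         s_logger.debug("Removing snapshot " + pool.getSourceDir()
>                 + "/" + uuid + "@" + snap.name);
>         image.snapRemove(snap.name);
>     }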
> 
> Code from PR #1230 (https://github.com/apache/cloudstack/pull/1230) 
> duplicates some of this functionality, and there doesn't seem to be any 
> protection preventing deletePhysicalDisk and the cleanup routine from 
> running simultaneously.
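> 
> For illustration, one hypothetical way to serialize the two paths would be a 
> shared per-volume lock that both deletePhysicalDisk and the cleanup routine 
> take before touching a volume's snapshots (VolumeLocks is an invented helper, 
> not code from the PR):
> 
>     import java.util.concurrent.ConcurrentHashMap;
>     import java.util.concurrent.locks.ReentrantLock;
> 
>     public class VolumeLocks {
>         private static final ConcurrentHashMap<String, ReentrantLock> LOCKS =
>                 new ConcurrentHashMap<>();
> 
>         // Both code paths would lock the volume uuid before snapshot
>         // operations, so they can no longer interleave.
>         public static ReentrantLock forVolume(String uuid) {
>             return LOCKS.computeIfAbsent(uuid, k -> new ReentrantLock());
>         }
>     }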
> 
> 
> To Reproduce (with ceph/rbd primary storage)
> 
> 1.  Set global concurrent.snapshots.threshold.perhost to the default NULL 
> value
> 2.  Set global snapshot.poll.interval and storage.cleanup.interval to a low 
> interval, e.g. 10 seconds (see the sketch after these steps)
> 3.  Restart management server
> 4.  Deploy several VMs from templates
> 5.  Destroy+expunge the VMs after they are running
> 6.  Observe segfaults from the host agent
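> 
> For steps 1-3, a hedged Java sketch against the integration API 
> (updateConfiguration is a real API command; the host and the unauthenticated 
> port being enabled are assumptions of this example setup):
> 
>     import java.net.HttpURLConnection;
>     import java.net.URL;
>     import java.net.URLEncoder;
> 
>     public class TuneIntervals {
>         static void set(String name, String value) throws Exception {
>             // Assumed: mgmt server reachable on integration.api.port=8096
>             URL url = new URL("http://localhost:8096/client/api"
>                     + "?command=updateConfiguration"
>                     + "&name=" + URLEncoder.encode(name, "UTF-8")
>                     + "&value=" + URLEncoder.encode(value, "UTF-8")
>                     + "&response=json");
>             HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>             System.out.println(name + " -> HTTP " + conn.getResponseCode());
>         }
> 
>         public static void main(String[] args) throws Exception {
>             set("snapshot.poll.interval", "10");
>             set("storage.cleanup.interval", "10");
>             // Then restart the management server (step 3).
>         }
>     }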
> 
> 
> Workaround
> 
We've been able to eliminate the host agent segfaults in our testing simply 
by setting concurrent.snapshots.threshold.perhost to 1, even with the 
decreased poll intervals.
> 
> Segfault Logs
> 
> https://slack-files.com/T0RJECUV7-F1M39K4F5-f9c6b3986d 
> 
> 
> https://slack-files.com/T0RJECUV7-F1KCTRNNN-8d36665b56 
> 
> 
> We would really appreciate any feedback and/or confirmation from the 
> community on the above issues. I'd also be happy to provide any 
> additional information needed to get this addressed.

What seems to be happening is that it failed to unprotect the snapshot of the 
volume. This could have various causes, for example a child image cloned from 
the snapshot. I don't think that's the case here, however.

It could still be that it tries to remove the master/golden image of the 
template while it still has children attached to that snapshot.

I'm not sure if this is due to rados-java or a bug in librados. The Java code 
should just throw an exception and not crash the JVM completely. This happens 
lower down in the stack, not in Java.

The assert shows this also happens when Java is talking to libvirt. My guess 
is a librados bug, but I'm not completely sure.

Wido

> 
> — Aaron


[GitHub] cloudstack issue #1600: Support Backup of Snapshots for Managed Storage

2016-07-02 Thread syed
Github user syed commented on the issue:

https://github.com/apache/cloudstack/pull/1600
  
Good catch @mike-tutkowski. I will fix it.




[GitHub] cloudstack issue #1600: Support Backup of Snapshots for Managed Storage

2016-07-02 Thread mike-tutkowski
Github user mike-tutkowski commented on the issue:

https://github.com/apache/cloudstack/pull/1600
  
Can you return the new locationtype parameter in the listSnapshots API call?

