marvin_refactor: docs for design and internals
Project: http://git-wip-us.apache.org/repos/asf/cloudstack/repo
Commit: http://git-wip-us.apache.org/repos/asf/cloudstack/commit/f1eb7235
Tree: http://git-wip-us.apache.org/repos/asf/cloudstack/tree/f1eb7235
Diff: http://git-wip-us.apache.org/repos/asf/cloudstack/diff/f1eb7235

Branch: refs/heads/marvin_refactor
Commit: f1eb72359e4624b780e4cbe9c344cb2fb92f480d
Parents: c4f9855
Author: Prasanna Santhanam <t...@apache.org>
Authored: Wed Oct 2 11:26:55 2013 +0530
Committer: Prasanna Santhanam <t...@apache.org>
Committed: Thu Oct 31 13:54:25 2013 +0530

----------------------------------------------------------------------
 tools/marvin/docs/DESIGN.markdown | 447 +++++++++++++++++++++++++++++++++
 tools/marvin/docs/errata.markdown | 29 +++
 tools/marvin/docs/errata.md | 22 --
 3 files changed, 476 insertions(+), 22 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/f1eb7235/tools/marvin/docs/DESIGN.markdown
----------------------------------------------------------------------
diff --git a/tools/marvin/docs/DESIGN.markdown b/tools/marvin/docs/DESIGN.markdown
new file mode 100644
index 0000000..b88c43c
--- /dev/null
+++ b/tools/marvin/docs/DESIGN.markdown
@@ -0,0 +1,447 @@
+# Marvin Refactor
+The Marvin test framework will undergo some key improvements as part of this
+refactor:
+
+1. All CloudStack resources modelled as entities, which are more object-oriented
+2. Data modelled as factories that form basic building blocks
+3. DSL support for assertions
+
+## Introduction
+Marvin, which has been used thus far for testing, has undergone several
+significant changes in this refactor. Many of these changes were driven by the
+need for succinctly describing a test scenario in a few lines of code. This
+document describes the changes and the reasons behind this refactor. While this
+makes the framework simple to use, the internals of marvin have become a bit
+complex.
+For this reason we will cover some of the internal workings as part of
+this document.
+
+## Rationale
+Two main reasons were responsible for this refactor:
+
+1. Brittle nature of the integration library
+2. Separating data from the test
+
+### Integration library
+Previously, to write a test case the author was expected to
+know (in advance) all the APIs he was going to call to complete his scenario.
+With the growing list of APIs, their parameters and optional arguments, it
+often becomes tedious to compose a single API call. To overcome this the
+integration libraries were written. These libraries (`integration.lib.base`,
+`integration.lib.common` etc) present a list of resources or entities - eg:
+VirtualMachine, VPC, VLAN to the library user. Each entity can perform a set of
+operations that in turn transform into an API call.
+
+```python
+class VirtualMachine(object):
+    def deploy(self, apiclient, service, template, zone):
+        cmd = deployVirtualMachine.deployVirtualMachineCmd()
+        cmd.serviceofferingid = service
+        cmd.templateid = template
+        ...
+        ...
+    def list(self, apiclient):
+        cmd = listVirtualMachines.listVirtualMachinesCmd()
+        return apiclient.listVirtualMachines(cmd)
+```
+This makes the library usage more object-oriented. So in the testcase the
+author only has to make a call to the VirtualMachine class when
+creating/destroying/starting/stopping virtualmachine instances.
+
+The disadvantage of this approach is that the integration library is
+hand-written and brittle. When changes are made several tests are affected in
+the process. There are also inconsistencies caused by mixing the data required
+for the API call with the arguments of the operation being performed. eg:
+
+```python
+class VirtualMachine(object):
+....
+    @classmethod
+    def create(cls, apiclient, services, templateid=None, accountid=None,
+               domainid=None, zoneid=None, networkids=None, serviceofferingid=None,
+               securitygroupids=None, projectid=None, startvm=None,
+               diskofferingid=None, affinitygroupnames=None, group=None,
+               hostid=None, keypair=None, mode='basic', method='GET'):
+        ....
+        ....
+```
+In this call, every value is looked up either in the services dictionary
+or in the method's arguments, which complicates the body of the create(..)
+call. Also the naming and the size of the API call is daunting for anyone using
+the library.
+
+### Data vs Test
+Another major disadvantage of the previous approach was that the data required
+for the test was mixed with the test itself. This made it difficult to generate
+new data from existing data objects. Data being highly coupled with the test
+reduces readability.
+
+Additionally, due to the strict structure of this data, it would impose itself
+onto the implementation of a resource's methods in the integration library.
+
+However all of the data is reusable by other tests if presented as factories.
+The refactor will address this using factories that act as building blocks for
+creating reusable data. The document also describes how these blocks are extended.
+
+## CloudStack API Generation
+The process of API module generation remains the same as before. CloudStack
+expresses its API in XML and JSON via the ApiDiscovery plugin.
+For instance, the createFirewallRule API looks as follows (some fields removed
+for brevity):
+
+```json
+  "api": [
+    {
+      "name": "createFirewallRule",
+      "description": "Creates a firewall rule for a given ip address",
+      "isasync": true,
+      "params": [
+        {
+          "name": "cidrlist",
+          "description": "the cidr list to forward traffic from",
+          "type": "list",
+          "length": 255,
+          "required": false
+        },
+        {
+          "name": "icmpcode",
+        },
+        {
+          "name": "icmptype",
+        },
+        {
+          "name": "type",
+        },
+      ],
+      "response": [
+        {
+          "name": "state",
+          "description": "the state of the rule",
+          "type": "string"
+        },
+        {
+          "name": "endport",
+        },
+        {
+          "name": "protocol",
+        },
+      ],
+      "entity": "Firewall"
+    }
+  ]
+```
+
+This JSON/XML can be used to create a binding in your favorite language and for
+Marvin's purpose this will be python. An API module named
+createFirewallRule.py with two classes (request and response) -
+createFirewallRuleCmd and createFirewallRuleResponse represents the creation of
+firewall rules.
+
+### Changes to API Discovery
+Generated API modules now include the `entity` attribute from the listApis
+response. The API discovery plugin has been enhanced to include the type of
+entity that an API is acting upon. For instance when doing createFirewallRule
+the entity that the user is dealing with is the `Firewall`. We do not
+intuitively guess what entity an API acts upon but depend on the CloudStack
+endpoint to tell us this information.
+This is mostly because we cannot always predict
+the entity an API acts upon from the name of the API.
+
+eg: dedicatePublicIpRange
+
+```json
+{
+  "listapisresponse": {
+    "count": 1,
+    "api": [
+      {
+        "name": "dedicatePublicIpRange",
+        "description": "Dedicates a Public IP range to an account",
+        "isasync": false,
+        "related": "listVlanIpRanges",
+        "params": [],
+        "response": [],
+        "entity": "VlanIpRange"
+      }
+    ]
+  }
+}
+```
+
+This transforms into the following Marvin entity class through auto-generation:
+
+```python
+class VlanIpRange(CloudStackEntity):
+
+    def dedicate(self, apiclient, account, domainid, **kwargs):
+        cmd = dedicatePublicIpRange.dedicatePublicIpRangeCmd()
+        cmd.id = self.id
+        cmd.account = account
+        cmd.domainid = domainid
+        [setattr(cmd, key, value) for key,value in kwargs.iteritems()]
+        publiciprange = apiclient.dedicatePublicIpRange(cmd)
+        return publiciprange if publiciprange else None
+
+```
+
+> kwargs represents all the optional arguments for dedicatePublicIpRange
+
+The use of the entity in generating a higher level model for the CloudStack API
+is described in the next section.
+
+## Entity and Factory Generation
+Marvin now includes a new module named `generate` that contains all the code
+generators.
+
+1. `xmltoapi.py` - this module is responsible for converting the JSON/XML
+response to a python binding. Previously this was the `codegenerator.py`
+2. `apitoentity.py` - this module is responsible for grouping actions on a
+given entity into a single module and defining all its actions as methods on the
+entity object.
+3. `entity.py` - is the base entity creator that transforms an API into a
+`CloudStackEntity`
+4. `factory.py` - is the base factory creator that transforms an API into a
+factory
+
+For example, in the method createFirewallRule the `entity` is the Firewall and
+the `action` being performed on the entity is `create`.
+
+So our entity becomes
+
+```python
+class Firewall(CloudStackEntity):
+    def create(self, apiclient, **kwargs):
+        # the generated body issues the createFirewallRule API call
+        ...
+```
+
+Almost all APIs are transformed naturally into this model but there are a few
+exceptions. These exceptions are dealt with by the `linguist.py` module in
+which APIs that don't split this way are broken down using special
+transformers.
+
+### Required and Optional Arguments
+All required arguments to an API will be available in the API operation
+
+```python
+Entity.verb(reqd1=None, reqd2=None, ..., **kwargs)
+```
+
+Here the `Entity` (eg: Firewall) can perform an operation `verb()` (eg: create)
+using the arguments `[reqd1, reqd2]`. The optional arguments (if any) will be
+passed as key, value pairs to the keyword args `**kwargs`.
+
+All entity classes are autogenerated and placed in the `marvin.entity` module.
+You may want to look at some sample entities like virtualmachine.py or
+network.py. To anyone who has used the previous version of marvin, these will
+look familiar. If you are looking at them for the first time, you will see
+that each entity is a simple class defined with CRUD operations
+that map to the CloudStack API.
+
+1. **Creators**
+A creator of an entity is the API operation that brings the entity into
+existence on the cloud. For instance a firewall rule is created using the
+createFirewallRule API. Or a virtualmachine comes into existence with the
+deployVirtualMachine command. These are our creators for entities firewall and
+virtualmachines respectively. Every entity class's `__init__` method is
+basically a call to its creator.
+
+2. **Enumerators**
+Often it is not necessary to bring an entity into existence since it is already
+present on the cloud infrastructure.
+We simply list these entities with the `list*` APIs and should
+still be able to treat them and use them like entities created using their
+corresponding creator methods. The `list*` APIs become our enumerators for each
+entity.
+
+## Factories
+Factories in cloudstack are implemented using the
+[factory_boy](http://factoryboy.readthedocs.org/en/latest/) framework. The
+factory_boy framework helps cloudstack define complex relationships in its
+model. For example, in order to create a virtualmachine one typically needs a
+service offering, a template and a zone present to be able to launch the VM.
+Factory boy enables traversing these object relationships effectively
+(top-down or bottom-up) to create those objects.
+
+Every entity in the new framework is created using its corresponding factory
+`EntityFactory`. Factories can be thought of as objects that carry necessary
+and sufficient data to satisfy the API call that brings the entity into
+existence. For example in order to create an account the `AccountFactory` will
+carry the `firstname, lastname, email, username` of the Account since these
+are the required arguments to the `createAccount` API.
+
+So the account factory looks as follows:
+
+```python
+import factory
+
+class AccountFactory(factory.Factory):
+
+    FACTORY_FOR = Account
+
+    accounttype = None
+    firstname = None
+    lastname = None
+    email = None
+    username = None
+    password = None
+```
+
+Here the `AccountFactory` is a bare representation with all None fields. These
+are the default factories. The default factories are simply base classes for
+defining hierarchical data using inheritance. For instance we have three
+types of accounts in cloudstack - DomainAdmin, Admin and User.
+
+Each of these accounttypes represents an inheritance from the AccountFactory.
+And for each factory we have a specific value for the `accounttype`.
+In fact we
+don't have to repeat ourselves when defining a factory for each type of account:
+
+> UserAccount(AccountFactory)
+
+> AdminAccount(UserAccount) with (accounttype=1)
+
+> DomainAdminAccount(UserAccount) with (accounttype=2)
+
+By simply altering the accounttype and having Admin and DomainAdmin inherit
+from User we have defined factories for all types of accounts in cloudstack.
+
+In order to create accounts in our tests all we have to do is the following:
+
+```python
+class TestAccounts(cloudstackTestCase):
+
+    def setUp(self):
+        self.apiclient = getApiClient()
+        # accounts created by a test, deleted again in tearDown
+        self.accounts = []
+
+    def test_AccountForUser(self):
+        user = UserAccount(self.apiclient)
+        self.accounts.append(user)
+        # assert that user is valid
+
+    def test_AccountForAdmin(self):
+        admin = AdminAccount(self.apiclient)
+        self.accounts.append(admin)
+        # assert that admin is valid
+
+    def test_AccountForDomainAdmin(self):
+        domadmin = DomainAdminAccount(self.apiclient)
+        self.accounts.append(domadmin)
+        # assert that domadmin is active
+
+    def tearDown(self):
+        for account in self.accounts:
+            account.delete()
+```
+
+## Basic tools for extending factories
+
+### Sequences
+Sequences are provided by factory boy to randomize the object generated by each
+call to the factory. Typically these are incremented integers but for the
+CloudStack objects each distinguishing attribute is randomized to prevent
+collisions and duplicate objects.
+
+To define an attribute as a sequence we simply call the factory.Sequence(..)
+method with a lambda function defining said sequence.
+
+eg:
+
+```python
+class SharedNetworkOffering(NetworkOfferingFactory):
+    name = factory.Sequence(lambda n: 'SharedOffering' + my_random_generator_function(n))
+    ...
+```
+
+### SubFactory
+SubFactories are an important factory_boy building block for creating factories
+that depend on other factories.
+
+For example, in order to create a SharedNetwork a networkofferingid of a
+SharedNetworkOffering is required. So we first call on the factory of
+SharedNetworkOffering using the factory.SubFactory(..)
+and use the id to create
+the SharedNetwork using the SharedNetwork's factory:
+
+```python
+class SharedNetwork(NetworkFactory):
+    name = factory.Sequence(...)
+    networkoffering = \
+        factory.SubFactory(
+            SharedNetworkOffering,
+            attr1=val1
+        )
+    # resolved lazily, once the sub-factory has produced the offering
+    networkofferingid = factory.LazyAttribute(lambda n: n.networkoffering.id)
+```
+
+RelatedFactory is similar to SubFactory except that the related object is
+created after the object of the enclosing factory has been created.
+
+SubFactories are a powerful way to chain many factories together to compose
+complex objects in cloudstack.
+
+### PostGeneration Hooks
+In many cases additional hooks are used to simplify working with cloud
+resources. For instance, when creating a virtual machine in an advanced zone it
+is useful to associate a NAT rule to be able to SSH into the virtual machine
+for post processing, such as testing the virtualmachine's connectivity
+to the internet. PostGeneration hooks work after factories have
+been created to perform such special functions. For an example, check the
+`marvin.factory.data.vm` module for the VirtualMachineWithStaticNat factory
+where we create a static nat rule allowing SSH access to the created VM.
+
+## Guidelines for defining new factories
+All factories are auto-generated and there is no need to define the default
+factories. Test case authors will mostly be creating data factories inherited
+from the default factories. All the data factories are defined in
+`marvin.factory.data`. Currently implementations are provided for often used
+data objects:
+
+1. networkoffering
+2. networks
+3. service and disk offerings
+4. security groups
+5. virtualmachine
+6. vpcoffering
+7. vpcvirtualmachine
+8. firewallrules
+9. ingress and egress rules
+
+These and many more implementations should serve as examples for defining new
+data objects.
+
+Factory naming convention is simple. Any data inheriting from default factory
+`EntityFactory` should be named without the suffix `Factory`.
+The data should
+take the name of the purpose of the factory. Use simple connecting words
+(Of, And, With etc) to combine words. For instance: VirtualMachineWithStaticNat
+or VirtualMachineInIsolatedNetwork. Naming the data clearly aids its widespread
+use. A badly named factory will likely not be used in more than one test.
+
+## Should DSL assertions
+The typical assertion capabilities of unittest are enough to express all
+validation but they do not read naturally. Should_dsl is a library that makes
+the assertions read like natural language. This is now installed by default with
+marvin, enabling all test cases to write assertions using simple dsl
+statements.
+
+eg:
+
+```python
+vm = VirtualMachineIsolatedNetwork(apiclient)
+vm.state | should | equal_to('Running')
+vm.nic | should_not | be(None)
+```
+
+## Utilities
+All the pre-existing utilities from the previous `util.py` are still available
+with enhancements in the `util.py` module. The legacy `util.py` module is
+deprecated but retained since older tests refer to this module. All new changes
+should go to the `util.py` under marvin/
+
+## unittest2 and nose2
+Marvin was earlier coupled with Python 2.7 since python's unittest did not have
+the same capabilities in versions older than 2.7. With unittest2 all features
+are now backported to older python implementations. Marvin has also switched to
+unittest2 so that we don't have to depend on the specific version of python to
+be able to install and use marvin for testing. This change is internal and
+should not be felt by the test case writer.
+
+> There are plans to move to nose2 as well but this is separated from factory
+> work at the moment.
+
+## Legacy Libraries and Tests
+In order to not disrupt the running of existing tests all the older libraries
+in `base.py`, `common.py` and `util.py` are moved to the legacy module. Any new
+tests should be written using factories.
+Older libraries are retained to be
+able to run our existing tests whose imports will be switched as part of this
+refactor.

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/f1eb7235/tools/marvin/docs/errata.markdown
----------------------------------------------------------------------
diff --git a/tools/marvin/docs/errata.markdown b/tools/marvin/docs/errata.markdown
new file mode 100644
index 0000000..4890d3d
--- /dev/null
+++ b/tools/marvin/docs/errata.markdown
@@ -0,0 +1,29 @@
+## Marvin Refactor
+
+### Bugs
+- marvin build now requires inflect, should-dsl, unittest2 which will cause -Pdeveloper profile to break for the first time
+- Entities should include @docstring for optional arguments in their actions() methods. `**kwargs` is confusing
+- Handle APIs that need parameters but don't have a required args list because multiple sets of args form a required list
+    - eg: disableAccount (either provide id (account) or accountname and domainid)
+- Better sync functionality
+- Bump up version to 0.2.0/Versioning based on cloudmonkey/cloudstack
+- Improved cleanup support using unittest2.addCleanup()
+- If setUp() fails how to handle tearDown()
+
+### Features
+- Export deployment to JSON [CLOUDSTACK-4590](https://issues.apache.org/jira//browse/CLOUDSTACK-4590)
+- nose2 support [CLOUDSTACK-4591](https://issues.apache.org/jira//browse/CLOUDSTACK-4591)
+- Python pip repository for cloudstack-marvin
+- Docs from readthedocs.org using sphinx
+- support for correlating test with cloud resources
+
+### Future
+- DSL for marvin using Behave [CLOUDSTACK-1952](https://issues.apache.org/jira/browse/CLOUDSTACK-1952)
+
+### Fixed
+- marvin.sync and xml compilation produce different versions of cloudstackAPI
+- Dissociate the grammar list to make it extensible via a properties file
+- XML precache required for factory and base generation [CLOUDSTACK-4589](https://issues.apache.org/jira//browse/CLOUDSTACK-4589)
+- Remove marvin dependency with apidoc build.
+  Provide precache json [CLOUDSTACK-4589](https://issues.apache.org/jira//browse/CLOUDSTACK-4589)
+- unittest2 support added with [CLOUDSTACK-4591](https://issues.apache.org/jira//browse/CLOUDSTACK-4591)
+- Use distutils

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/f1eb7235/tools/marvin/docs/errata.md
----------------------------------------------------------------------
diff --git a/tools/marvin/docs/errata.md b/tools/marvin/docs/errata.md
deleted file mode 100644
index f626069..0000000
--- a/tools/marvin/docs/errata.md
+++ /dev/null
@@ -1,22 +0,0 @@
-## Idea Stack
-
-### Bugs
-
-- **marvin.sync and xml compilation produce different versions of cloudstackAPI**
-- marvin build now requires inflect which will cause -Pdeveloper profile to break for the first time
-- Entities should include @docstring for optional arguments in their actions() methods. **kwargs is confusing
-- Dissociate the grammar list to make it extensible via a properties file
-- Handle APIs that need parameters but dont have a required args list because multiple sets of args form a required list
-    - eg: disableAccount (either provide id (account) or accoutname and domainid)
-- XML precache required for factory and base generation [CLOUDSTACK-4589](https://issues.apache.org/jira//browse/CLOUDSTACK-4589)
-- Remove marvin dependency with apidoc build. Provide precache json [CLOUDSTACK-4589](https://issues.apache.org/jira//browse/CLOUDSTACK-4589)
-- Better sync functionality
-- Bump up version to 0.2.0
-- Improved cleanup support
-
-### Features
-- Export deployment to JSON [CLOUDSTACK-4590](https://issues.apache.org/jira//browse/CLOUDSTACK-4590)
-- nose2 and unittest2 support [CLOUDSTACK-4591](https://issues.apache.org/jira//browse/CLOUDSTACK-4591)
-- Use distutils
-- Python pip repository for cloudstack-marvin
-- DSL for marvin using Behave [CLOUDSTACK-1952](https://issues.apache.org/jira/browse/CLOUDSTACK-1952)
\ No newline at end of file
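
The grouping of APIs into entities that DESIGN.markdown attributes to `apitoentity.py` can be illustrated with a small sketch. This is not the Marvin generator itself: the helper names `split_verb` and `group_by_entity` are hypothetical, the verb is naively taken as the leading lowercase run of the API name (the design notes say the real generator handles exceptions through `linguist.py`), and the `entity` value shown for `listVlanIpRanges` is an assumption made for illustration only.

```python
import re
from collections import defaultdict


def split_verb(api_name):
    """Return the leading lowercase run of an API name (hypothetical helper)."""
    match = re.match(r"[a-z]+", api_name)
    return match.group(0) if match else api_name


def group_by_entity(apis):
    """Group verbs under the entity reported by API discovery (hypothetical helper)."""
    grouped = defaultdict(list)
    for api in apis:
        # The entity is read from the discovery record, never guessed from the name.
        grouped[api["entity"]].append(split_verb(api["name"]))
    return dict(grouped)


# Trimmed discovery records; only "name" and "entity" are kept.
apis = [
    {"name": "createFirewallRule", "entity": "Firewall"},
    {"name": "dedicatePublicIpRange", "entity": "VlanIpRange"},
    # Assumed entity for this API, for illustration only.
    {"name": "listVlanIpRanges", "entity": "VlanIpRange"},
]

actions = group_by_entity(apis)
for entity, verbs in sorted(actions.items()):
    print(entity, verbs)
```

Because the entity comes from the discovery response rather than from the API name, `dedicatePublicIpRange` ends up on `VlanIpRange` even though its name never mentions that entity.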