The Marvin test framework will undergo some key improvements as part of this refactor:
Marvin, which has been used for testing thus far, has undergone several significant changes in this refactor. Many of these changes were driven by the need to describe a test scenario succinctly in a few lines of code. This document describes the changes and the reasons behind them. While the refactor makes the framework simpler to use, the internals of Marvin have become somewhat more complex, so this document also covers some of the internal workings.
Two main reasons were responsible for this refactor:
Previously, to write a test case the author was expected to know (in advance) all the APIs they were going to call to complete the scenario. With the growing list of APIs, their parameters and optional arguments, composing even a single API call often becomes tedious. To overcome this, the integration libraries were written. These libraries (integration.lib.base, integration.lib.common etc) present a list of resources or entities, eg: VirtualMachine, VPC, VLAN, to the library user. Each entity can perform a set of operations, each of which in turn transforms into an API call.
    class VirtualMachine(object):
        def deploy(self, apiclient, service, template, zone):
            cmd = deployVirtualMachine.deployVirtualMachineCmd()
            cmd.serviceofferingid = service
            cmd.templateid = template
            ...
            ...

        def list(self, apiclient):
            cmd = listVirtualMachines.listVirtualMachinesCmd()
            return apiclient.listVirtualMachines(cmd)
This makes the library usage more object-oriented. In the test case the author only has to call the VirtualMachine class when creating, destroying, starting or stopping virtual machine instances.
The disadvantage of this approach is that the integration library is hand-written and brittle. When changes are made, several tests are affected in the process. There are also inconsistencies caused by mixing the data required for the API call with the arguments of the operation being performed. eg:
    class VirtualMachine(object):
        ....
        @classmethod
        def create(cls, apiclient, services, templateid=None, accountid=None,
                   domainid=None, zoneid=None, networkids=None,
                   serviceofferingid=None, securitygroupids=None, projectid=None,
                   startvm=None, diskofferingid=None, affinitygroupnames=None,
                   group=None, hostid=None, keypair=None, mode='basic',
                   method='GET'):
            ....
            ....
In this call, every argument is looked up either in the services dictionary or among the arguments passed in, which complicates the body of the create(..) call. The naming and the sheer size of the call are also daunting for anyone using the library.
Another major disadvantage of the previous approach was that the data required for the test was mixed with the test itself. This made it difficult to generate new data from existing data objects. Data that is highly coupled with the test also reduces readability.
Additionally, the strict structure of this data imposed itself on the implementation of a resource's methods in the integration library. However, all of the data is reusable by other tests if presented as factories. The refactor addresses this using factories that act as building blocks for creating reusable data. The document also describes how these blocks are extended.
The process of API module generation remains the same as before. CloudStack expresses its API in XML and JSON via the ApiDiscovery plugin. For instance the createFirewallRule API looks as follows (some fields removed for brevity)
    "api": [
        {
            "name": "createFirewallRule",
            "description": "Creates a firewall rule for a given ip address",
            "isasync": true,
            "params": [
                {
                    "name": "cidrlist",
                    "description": "the cidr list to forward traffic from",
                    "type": "list",
                    "length": 255,
                    "required": false
                },
                { "name": "icmpcode", },
                { "name": "icmptype", },
                { "name": "type", },
            ],
            "response": [
                {
                    "name": "state",
                    "description": "the state of the rule",
                    "type": "string"
                },
                { "name": "endport", },
                { "name": "protocol", },
            ],
            "entity": "Firewall"
        }
    ]
This JSON/XML can be used to create a binding in your favorite language; for Marvin's purposes this is Python. An API module named createFirewallRule.py with two classes (request and response), createFirewallRuleCmd and createFirewallRuleResponse, represents the creation of firewall rules.
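To make the binding step concrete, here is a minimal sketch of how a request class could be derived from one API description. It is not the generator Marvin ships: the helper name build_cmd_class, the attribute names isAsync and required, and the protocol parameter with its required flag are all assumptions made for illustration.

```python
# Trimmed-down API description, shaped like the discovery output above.
# The "protocol" entry and its "required" flag are illustrative assumptions.
API_SPEC = {
    "name": "createFirewallRule",
    "isasync": True,
    "params": [
        {"name": "cidrlist", "type": "list", "required": False},
        {"name": "protocol", "type": "string", "required": True},
    ],
}


def build_cmd_class(spec):
    """Build a request class whose attributes mirror the API parameters."""
    param_names = [p["name"] for p in spec["params"]]
    required = [p["name"] for p in spec["params"] if p.get("required")]

    def __init__(self):
        # Every parameter starts unset; the caller fills in what it needs.
        for name in param_names:
            setattr(self, name, None)

    attrs = {
        "__init__": __init__,
        "isAsync": spec["isasync"],
        "required": required,
    }
    return type(spec["name"] + "Cmd", (object,), attrs)


createFirewallRuleCmd = build_cmd_class(API_SPEC)
cmd = createFirewallRuleCmd()
cmd.protocol = "tcp"
```

A real generator would write the class out as source code in createFirewallRule.py rather than build it at runtime; building it with type() just keeps the sketch short.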
Generated API modules now include the entity attribute from the listApis response. The API discovery plugin has been enhanced to include the type of entity that an API is acting upon. For instance, when doing createFirewallRule the entity that the user is dealing with is the Firewall. We do not guess what entity an API acts upon but depend on the CloudStack endpoint to give us this information, mostly because we cannot always predict the entity from the name of the API, eg: dedicatePublicIpRange
    {
        listapisresponse: {
            count: 1,
            api: [
                {
                    name: "dedicatePublicIpRange",
                    description: "Dedicates a Public IP range to an account",
                    isasync: false,
                    related: "listVlanIpRanges",
                    params: [],
                    response: [],
                    entity: "VlanIpRange"
                }
            ]
        }
    }
This transforms into the following Marvin entity class through auto-generation:
    class VlanIpRange(CloudStackEntity):
        def dedicate(self, apiclient, account, domainid, **kwargs):
            cmd = dedicatePublicIpRange.dedicatePublicIpRangeCmd()
            cmd.id = self.id
            cmd.account = account
            cmd.domainid = domainid
            [setattr(cmd, key, value) for key,value in kwargs.iteritems()]
            publiciprange = apiclient.dedicatePublicIpRange(cmd)
            return publiciprange if publiciprange else None
Here kwargs represents all the optional arguments for dedicatePublicIpRange.
The use of the entity in generating a higher level model for the CloudStack API is described in the next section.
Marvin now includes a new module named generate that contains all the code generators.
xmltoapi.py - this module is responsible for converting the JSON/XML ... codegenerator.py
apitoentity.py - this module is responsible for grouping actions on an entity
entity.py - is the base entity creator that transforms an API into an entity
factory.py - is the base factory creator that transforms an API into a factory
For example, in the method createFirewallRule the entity is the Firewall and the action being performed on the entity is create. So our entity becomes:
    class Firewall:
        def create(...):
            createFirewallRule()
Almost all APIs are transformed naturally into this model but there are a few exceptions. These exceptions are dealt with by the linguist.py module, in which APIs that don't split this way are broken down using special transformers.
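The split into an action and an entity can be sketched as follows. This is an illustration only, not the code in linguist.py: the function name split_api_name and the SPECIAL_CASES table are hypothetical, and the single entry in the table is there purely to show the shape of an exception.

```python
import re

# Hypothetical table for API names that do not split cleanly on the first word.
SPECIAL_CASES = {
    "dedicatePublicIpRange": ("dedicate", "VlanIpRange"),
}


def split_api_name(api_name, entity=None):
    """Return (verb, entity) for a camelCase API name."""
    if api_name in SPECIAL_CASES:
        return SPECIAL_CASES[api_name]
    match = re.match(r"([a-z]+)([A-Z]\w*)", api_name)
    if match is None:
        raise ValueError("cannot split API name: %s" % api_name)
    verb, remainder = match.groups()
    # Prefer the entity reported by API discovery when it is available.
    return verb, entity or remainder


verb, entity = split_api_name("createFirewallRule", entity="Firewall")
```

Passing the entity reported by API discovery keeps the guesswork out of the split; the remainder of the name is only a fallback.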
All required arguments to an API will be available in the API operation:
    Entity.verb(reqd1=None, reqd2=None, ..., **kwargs)
Here the Entity (eg: Firewall) can perform an operation verb() (eg: create) using the arguments reqd1, reqd2. The optional arguments (if any) will be passed as key, value pairs to the keyword args **kwargs.
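A small sketch of this calling convention, using stand-ins so that it runs without a CloudStack endpoint. FakeCmd and FakeApiClient are hypothetical helpers, and treating ipaddressid and protocol as the required arguments is an assumption made for the example.

```python
class FakeCmd(object):
    """Stand-in for a generated request object such as createFirewallRuleCmd."""


class FakeApiClient(object):
    """Stand-in API client that echoes back the request attributes."""

    def createFirewallRule(self, cmd):
        return dict(vars(cmd))


class Firewall(object):
    def create(self, apiclient, ipaddressid=None, protocol=None, **kwargs):
        cmd = FakeCmd()
        # Required arguments are named parameters of the operation.
        cmd.ipaddressid = ipaddressid
        cmd.protocol = protocol
        # Optional arguments arrive as keyword arguments.
        for key, value in kwargs.items():
            setattr(cmd, key, value)
        return apiclient.createFirewallRule(cmd)


rule = Firewall().create(FakeApiClient(), ipaddressid="ip-1", protocol="tcp",
                         cidrlist=["10.0.0.0/24"])
```

Keeping required arguments in the signature means a missing one is visible at the call site, while rarely used options stay out of the way.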
All entity classes are autogenerated and placed in the marvin.entity module. You may want to look at some sample entities like virtualmachine.py or network.py. To anyone who has used the previous version of Marvin, these will look familiar. If you are looking at them for the first time, you will see that each entity is a simple class defined with CRUD operations that map to the CloudStack API.
Each entity's __init__ method is basically a call to its creator.
Factories in CloudStack are implemented using the factory_boy (http://factoryboy.readthedocs.org/en/latest/) framework. The factory_boy framework helps CloudStack define complex relationships in its model. For example, in order to create a virtual machine one typically needs a service offering, a template and a zone to be present before the VM can be launched. Factory boy enables traversing these object relationships effectively (top-down or bottom-up) to create those objects.
Every entity in the new framework is created using its corresponding factory EntityFactory. Factories can be thought of as objects that carry necessary and sufficient data to satisfy the API call that brings the entity into existence. For example, in order to create an account the AccountFactory will carry the firstname, lastname, email, username of the Account since these are the required arguments to the createAccount API.
So the account factory looks as follows:
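    import factory

    class AccountFactory(factory.Factory):
        # FACTORY_FOR names the entity class this factory builds
        # (newer factory_boy releases use `class Meta: model = Account`)
        FACTORY_FOR = Account

        accounttype = None
        firstname = None
        lastname = None
        email = None
        username = None
        password = None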
Here the AccountFactory is a bare representation with all None fields. These are the default factories. The default factories are simply base classes for defining hierarchical data using inheritance. For instance, we have three types of accounts in CloudStack: DomainAdmin, Admin and User. Each of these account types inherits from the AccountFactory, and each factory has a specific value for the accounttype. In fact we don't have to repeat ourselves when defining a factory for each type of account:
UserAccount(AccountFactory)
AdminAccount(UserAccount) with (accounttype=1)
DomainAdminAccount(UserAccount) with (accounttype=2)
By simply altering the accounttype and having Admin and DomainAdmin inherit from User, we have defined factories for all types of accounts in CloudStack. In order to create accounts in our tests, all we have to do is the following:
    class TestAccounts(cloudstackTestCase):
        def setUp(...):
            apiclient = getApiClient()

        def test_AccountForUser(...):
            user = UserAccount(apiclient)
            assert user is valid

        def test_AccountForAdmin(...):
            admin = AdminAccount(apiclient)
            assert admin is valid

        def test_AccountForDomainAdmin(...):
            domadmin = DomainAdminAccount(apiclient)
            assert domadmin is active

        def tearDown(...):
            user.delete()
            admin.delete()
            domadmin.delete()
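The inheritance idea behind the account factories can be shown with plain Python classes. This is a stand-in that does not use factory_boy or call any API: the attributes() helper and the field values are hypothetical, and accounttype=0 for a regular user is an assumption (the hierarchy above only states the values 1 and 2).

```python
class AccountFactory(object):
    """Stand-in for the generated default factory: every field is unset."""
    accounttype = None
    firstname = None
    lastname = None
    email = None
    username = None

    @classmethod
    def attributes(cls):
        # Collect the data that would be sent to createAccount.
        names = ("accounttype", "firstname", "lastname", "email", "username")
        return {name: getattr(cls, name) for name in names}


class UserAccount(AccountFactory):
    accounttype = 0  # assumed value for a regular user
    firstname = "fname"
    lastname = "lname"
    email = "user@example.org"
    username = "testuser"


class AdminAccount(UserAccount):
    accounttype = 1  # everything else is inherited from UserAccount


class DomainAdminAccount(UserAccount):
    accounttype = 2
```

Only the attribute that differs is restated in each subclass, which is the repetition the factories are meant to remove.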
Sequences are provided by factory boy to randomize the object generated by each call to the factory. Typically these are incremented integers but for the CloudStack objects each distinguishing attribute is randomized to prevent collisions and duplicate objects.
To define an attribute as a sequence we simply call the factory.Sequence(..) method with a lambda function defining said sequence.
eg:
    class SharedNetworkOffering(NetworkOfferingFactory):
        name = factory.Sequence(lambda n: 'SharedOffering' + my_random_generator_function(n))
        ...
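What a sequence buys us can be seen in a small standard-library sketch that combines an incrementing counter with a random suffix. The helper names random_suffix and next_offering_name are hypothetical, and this stands in for, rather than reproduces, what factory.Sequence does.

```python
import itertools
import random
import string

# Shared counter, playing the role of the sequence number n.
_counter = itertools.count()


def random_suffix(length=6):
    """Return a short random lowercase string."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def next_offering_name(prefix="SharedOffering"):
    """Return a name that differs on every call."""
    return "%s-%d-%s" % (prefix, next(_counter), random_suffix())


first = next_offering_name()
second = next_offering_name()
```

The counter alone guarantees distinct names within one run; the random suffix reduces the chance of colliding with objects left behind by earlier runs.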
SubFactories are an important factory_boy building block for creating factories that depend on other factories.
For example, in order to create a SharedNetwork, the networkofferingid of a SharedNetworkOffering is required. So we first call on the factory of SharedNetworkOffering using factory.SubFactory(..) and then use its id to create the SharedNetwork using the SharedNetwork's factory:
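    class SharedNetwork(NetworkFactory):
        name = factory.Sequence(...)
        networkoffering = factory.SubFactory(
            SharedNetworkOffering,
            attr1=val1
        )
        # Resolved lazily: the offering only exists once the factory runs
        networkofferingid = factory.LazyAttribute(
            lambda network: network.networkoffering.id)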
RelatedFactory is a special case of SubFactory in that the related object is created after the object of the enclosing factory has been created.
SubFactories are a powerful way to chain many factories together to compose complex objects in CloudStack.
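The dependency chaining can be illustrated without factory_boy by building the dependency first and feeding its id to the dependent object. The function names and the dictionary fields below are hypothetical stand-ins, not Marvin or factory_boy APIs.

```python
import itertools

# Fake id generator standing in for ids assigned by the server.
_ids = itertools.count(1)


def create_network_offering(**overrides):
    """Create a fake network offering record."""
    offering = {"id": next(_ids), "name": "SharedOffering"}
    offering.update(overrides)
    return offering


def create_shared_network(networkoffering=None, **overrides):
    """Create a fake shared network, building its offering when needed."""
    # Mirror SubFactory behaviour: build the dependency only if none was supplied.
    if networkoffering is None:
        networkoffering = create_network_offering()
    network = {
        "id": next(_ids),
        "name": "SharedNetwork",
        "networkofferingid": networkoffering["id"],
    }
    network.update(overrides)
    return network


offering = create_network_offering()
network = create_shared_network(networkoffering=offering)
standalone = create_shared_network()
```

A caller who already has an offering can pass it in, while a caller who does not care gets one created implicitly.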
In many cases additional hooks are used to simplify working with cloud resources. For instance, when creating a virtual machine in an advanced zone it is useful to associate a NAT rule so that we can SSH into the virtual machine afterwards, for example to test its connectivity to the internet. PostGeneration hooks run after the factory has created its object to perform such special functions. For examples, check the marvin.factory.data.vm module for the VirtualMachineWithStaticNat factory, where we create a static NAT rule allowing SSH access to the created VM.
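The shape of a post-generation hook can be sketched with a plain wrapper that runs extra steps once the object exists. Everything here is a hypothetical stand-in (FakeVirtualMachine, with_post_generation, enable_ssh_nat); the real factory in marvin.factory.data.vm relies on factory_boy and the CloudStack API instead.

```python
class FakeVirtualMachine(object):
    """Stand-in for a deployed virtual machine."""

    def __init__(self, name):
        self.name = name
        self.nat_rules = []


def with_post_generation(create, hooks):
    """Wrap a creator so that each hook runs on the freshly created object."""
    def factory(*args, **kwargs):
        obj = create(*args, **kwargs)
        for hook in hooks:
            hook(obj)
        return obj
    return factory


def enable_ssh_nat(vm):
    # Stand-in for creating a static NAT rule that opens SSH to the VM.
    vm.nat_rules.append({"protocol": "tcp", "port": 22})


VirtualMachineWithStaticNat = with_post_generation(FakeVirtualMachine,
                                                   [enable_ssh_nat])
vm = VirtualMachineWithStaticNat("vm-1")
```

The hook receives the finished object, so it can use values such as ids that are only known after creation.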
All default factories are auto-generated, so there is no need to define them by hand. Test case authors will mostly create data factories that inherit from the default factories. All the data factories are defined in marvin.factory.data. Implementations are currently provided for frequently used data objects, and these should serve as examples when defining new data objects.
The factory naming convention is simple. Any data factory inheriting from a default factory EntityFactory should be named without the suffix Factory. The name should state the purpose of the factory. Use simple prepositions and conjunctions (Of, And, With etc) to combine words, for instance VirtualMachineWithStaticNat or VirtualMachineInIsolatedNetwork. Naming the data clearly aids its widespread use; a badly named factory is unlikely to be used in more than one test.
The typical assertion capabilities of unittest are enough to express all validation, but they do not read naturally. Should_dsl is a library that makes assertions read like natural language. It is now installed by default with Marvin, enabling all test cases to write assertions using simple DSL statements,
eg:
    vm = VirtualMachineIsolatedNetwork(apiclient)
    vm.state | should | equal_to('Running')
    vm.nic | should_not | be(None)
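For comparison, the same two checks written with plain unittest assertions might look like the sketch below. FakeVm is a hypothetical stand-in for the object returned by the factory, so the example runs without a CloudStack endpoint.

```python
import unittest


class FakeVm(object):
    """Stand-in for a deployed virtual machine."""
    state = "Running"
    nic = [{"ipaddress": "10.1.1.10"}]


class TestVmState(unittest.TestCase):
    def test_vm_is_running(self):
        vm = FakeVm()
        # Equivalent of: vm.state | should | equal_to('Running')
        self.assertEqual(vm.state, "Running")
        # Equivalent of: vm.nic | should_not | be(None)
        self.assertIsNotNone(vm.nic)
```

Both styles verify the same conditions; the DSL form simply reads closer to a sentence.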
All the pre-existing utilities from the previous util.py are still available, with enhancements, in the new util.py module. The legacy util.py module is deprecated but retained because older tests refer to it. All new changes should go into the util.py under marvin/.
Marvin was earlier coupled with Python 2.7 since Python's unittest did not have the same capabilities in versions earlier than 2.7. With unittest2 these features are backported to older Python implementations. Marvin has switched to unittest2 so that we don't have to depend on a specific version of Python to install and use Marvin for testing. This change is internal and should not be felt by the test case writer.
There are plans to move to nose2 as well, but this is separate from the factory work at the moment.
In order not to disrupt the running of existing tests, all the older libraries in base.py, common.py and util.py are moved to the legacy module. Any new tests should be written using factories. The older libraries are retained so that our existing tests, whose imports will be switched as part of this refactor, can still run.