The current automation infrastructure needs to be improved. Following are some concerns and proposed solutions.
1.) The documentation for setting up the test infra ("QA Infra from scratch") is not complete; it reads more like a set of guidelines than steps we can implement. The current documentation does not explain in detail how to set up cobbler, puppet, and the other tools included in the driver VM. Although the configuration files are provided, they have to be modified for each environment, which requires an understanding of these tools and hence a learning curve. The link given to download the driver VM does not work, and the driver VM (i.e. the QA cloud) is quite large to distribute.
2.) The repository mentioned for downloading the puppet recipes is old and is missing some files, such as secseeder.sh. These files are not publicly available.
3.) The code needed to understand the build process is not available.
4.) The infrastructure setup process requires some static configuration that is not documented.
Proposed solutions:
1.) Maintain a local repo with all the required files and documentation in one place.
2.) Update the wiki with a detailed description of the setup and of how each component works.
For example, it would have helped to document how puppet works together with DNS to sign certificates and identify nodes.
3.) Automate the infra setup.
Operational Issues with the current automation infrastructure.
1.) The code required for the setup is not properly organised and requires static configuration.
We need to run the management server with a specific hostname for puppet to identify the node and push
the puppet recipe. IMO we should be able to generate the puppet configuration dynamically. This will help when we need
to scale the infra or automate the infra setup.
Currently we fetch the IP and MAC address based on the type of hypervisor; the type of hypervisor and the hardware resource we use for the setup should be decoupled.
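As a sketch of what dynamic puppet configuration could look like, the snippet below appends a node definition to puppet's nodes.pp for a newly created VM instead of relying on a fixed hostname. The hostname and node class names here are made-up examples, not the ones used in the current infra:

```python
import re

# Template for a puppet node definition (class name is illustrative).
NODE_TEMPLATE = """node '{hostname}' {{
    include {node_class}
}}
"""

def add_node(nodes_pp_text, hostname, node_class):
    """Append a puppet node definition for `hostname` to the text of
    nodes.pp, unless one is already present."""
    if re.search(r"node\s+'%s'" % re.escape(hostname), nodes_pp_text):
        return nodes_pp_text  # already registered, nothing to do
    return nodes_pp_text + NODE_TEMPLATE.format(
        hostname=hostname, node_class=node_class)

# Hypothetical usage: register a freshly provisioned management server.
updated = add_node("", "mgmt-01.qa.example.org", "cloudstack_mgmt")
```

The caller would read nodes.pp, pass its contents through add_node, and write the result back, so new machines need no manual edit of the manifest.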
2.) The present implementation does not support parallel execution of tests: each of the Jenkins jobs has to
wait even when we have resources available to run the tests. One way of parallelising is to have multiple jobs, each tied
to a specific hypervisor and type of test, but this is not suitable if we want to open this service up to developers.
3.) The simulator build is not automated. The current way of executing tests on the simulator is to create a simulator setup manually and run tests
by adding this node to Jenkins as a slave.
Proposed solutions:
1.) Reorganise the current code and make it dynamic.
2.) Automate the simulator install and the deployment of test cases.
3.) Allow scheduling of tests through a single interface.
4.) Use a more structured way to store the available hardware info, to make adding and removing resources easier;
provide methods to add and remove hardware resources as and when required.
5.) Allow choosing hardware dynamically rather than using static configuration such as names specific to a test environment.
In phase one we plan to reorganise the code and automate the simulator install. More details
will be posted shortly.
Current progress on automation of the simulator install.
Following is the breakdown of the work required:
1.) Create a puppet recipe to install maven, mysql, java, and git on the blank VM. Progress: 100%.
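For reference, a manifest covering these packages might look roughly like the sketch below. Package names assume a CentOS-style base image and will vary with the distro; maven is pulled from a tarball since it is often absent from the base repos, and $maven_url is a placeholder for a local mirror:

```puppet
# Sketch only: package names, paths, and the class name are assumptions.
class simulator_prereqs {
  package { ['java-1.7.0-openjdk-devel', 'git', 'mysql-server']:
    ensure => installed,
  }

  # $maven_url should point at a maven binary tarball on a local mirror.
  exec { 'install-maven':
    command => "/bin/sh -c '/usr/bin/curl -o /tmp/maven.tar.gz ${maven_url} && /bin/tar -xzf /tmp/maven.tar.gz -C /opt'",
    creates => '/opt/apache-maven',
  }

  service { 'mysqld':
    ensure  => running,
    enable  => true,
    require => Package['mysql-server'],
  }
}
```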
2.) Write a python script to pull the code from the git repo, build it, and run the simulator. Progress: 100%.
The script takes the repo URL, build number, and commit hash as arguments
and runs inside the simulator VM.
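A minimal sketch of such a script is shown below. The working directory layout and the maven profile flags are assumptions, not the exact commands used in the infra:

```python
#!/usr/bin/env python
import argparse
import subprocess


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description='Pull the code, build it, and run the simulator.')
    parser.add_argument('--repo-url', required=True)
    parser.add_argument('--build-number', required=True)
    parser.add_argument('--commit-hash', required=True)
    return parser.parse_args(argv)


def main():
    args = parse_args()
    # Keep each build in its own directory, keyed by build number.
    workdir = '/opt/builds/%s' % args.build_number
    subprocess.check_call(['git', 'clone', args.repo_url, workdir])
    subprocess.check_call(['git', 'checkout', args.commit_hash],
                          cwd=workdir)
    # Build flags are assumed; the simulator profile name may differ.
    subprocess.check_call(
        ['mvn', '-Pdeveloper', '-Dsimulator', '-DskipTests',
         'clean', 'install'], cwd=workdir)


if __name__ == '__main__':
    main()
```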
3.) Write a script, run from Jenkins, that takes the type of OS, the repo URL, etc. Progress: 30%.
This script will:
create a blank VM on one of the hypervisors with a given MAC and IP address;
configure cobbler to enable PXE boot and puppet agent install;
insert code into puppet's nodes.pp to associate a puppet config with this VM;
trigger the script mentioned in step 2 once the VM is created;
run tests and report the results.
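The cobbler and puppet steps above could be wired together roughly as follows. The cobbler flags are from its standard CLI, but the profile name, file paths, puppet class, and the ssh-based trigger are all assumptions:

```python
import subprocess


def cobbler_add_cmd(hostname, os_profile, mac, ip):
    """Build the cobbler command that registers the VM for PXE boot."""
    return ['cobbler', 'system', 'add',
            '--name', hostname,
            '--profile', os_profile,
            '--mac', mac,
            '--ip-address', ip,
            '--hostname', hostname]


def nodes_pp_entry(hostname):
    """Puppet node definition tying the VM to its config
    (class name is illustrative)."""
    return "node '%s' { include simulator_prereqs }\n" % hostname


def provision_simulator(hostname, os_profile, mac, ip, build_cmd):
    # Register the machine with cobbler and regenerate its config.
    subprocess.check_call(cobbler_add_cmd(hostname, os_profile, mac, ip))
    subprocess.check_call(['cobbler', 'sync'])
    # Associate a puppet config with the new node.
    with open('/etc/puppet/manifests/nodes.pp', 'a') as f:
        f.write(nodes_pp_entry(hostname))
    # Once the VM has booted and puppet has run, trigger the step 2
    # build script (build_cmd); remote invocation over ssh is assumed.
    subprocess.check_call(['ssh', 'root@%s' % ip, build_cmd])
```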
4.) Create a Jenkins job to schedule the tests.
As a part of phase 2 we will take up scheduling of tests through a single interface and allow dynamic allocation
of the available hardware. Currently we need to schedule one job to create the testbed and another to execute the tests.
The purpose of having one interface is to maintain only one job queue, which will allow better control over job scheduling;
we can enable parallel execution by using more than one executor on this job queue.
We plan to use an XML file to store information about all the hardware available in the infrastructure.
Information like MAC addresses, IP addresses, IPMI interface addresses, etc. can be stored in a single file. This
will be the only place that needs to change when adding or removing hardware, and it will also allow us to keep
track of which resources are currently in use and which are free.
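A sketch of what such an inventory file and a simple allocation helper could look like is below; the element and attribute names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory layout: one <host> element per machine, with
# its addresses as children and its allocation state as an attribute.
INVENTORY = """
<hardware>
  <host name="xen-host-01" hypervisor="xenserver" state="free">
    <mac>00:16:3e:aa:bb:01</mac>
    <ip>10.1.1.11</ip>
    <ipmi>10.1.2.11</ipmi>
  </host>
  <host name="kvm-host-01" hypervisor="kvm" state="in-use">
    <mac>00:16:3e:aa:bb:02</mac>
    <ip>10.1.1.12</ip>
    <ipmi>10.1.2.12</ipmi>
  </host>
</hardware>
"""


def find_free_host(xml_text, hypervisor):
    """Return the name of a free host of the given hypervisor type,
    or None if everything is in use."""
    root = ET.fromstring(xml_text)
    for host in root.findall('host'):
        if (host.get('hypervisor') == hypervisor
                and host.get('state') == 'free'):
            return host.get('name')
    return None
```

Marking a host as allocated would just flip its state attribute and write the tree back to the file, which keeps the in-use/free bookkeeping in the same single place as the hardware details.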