Bigtop's tests are based on iTest, which cleanly separates test artifacts from test execution. Test artifacts are ordinary Maven artifacts whose classes contain @Test methods. Test execution is driven by Maven pom.xml files, in which you declare dependencies on the test artifacts you want to execute and bind everything to the maven-failsafe-plugin's verify goal.
These tests can also be run together with a new feature (pending BIGTOP-1388), cluster failure tests, which is explained later on this page.
Bigtop supports three levels of testing:
- smoke-tests: Basic Hadoop ecosystem interoperability tests. These are Gradle based, can be run from the bigtop-tests/smoke-tests directory, and are easy to modify at run time (no jar file is built; they are compiled on the fly from Groovy source).
- integration tests: More advanced, Maven based tests that run from jar files. They are run from the bigtop-tests/test-execution directory.
- integration tests with cluster failures: The same as the integration tests, but with cluster failures injected during the tests to check resiliency.
Since many of the integration tests are simply smoke tests, we hope to see much of the integration tests converge into the smoke-tests over time.
If you simply want to check that your basic ecosystem components are working, the smoke-tests will most likely suit your needs.
Also note that the smoke tests can quite easily call the tests from the integration tests directory and run them from source, without needing an intermediate jar file.
To see examples of how to do this, check out the mapreduce/ and mahout/ tests, which both reference groovy files in bigtop-tests/test-artifacts.
To avoid redundant documentation, you can read about how to run these tests in the README file under bigtop-tests/smoke-tests/.
These tests are particularly easy to modify: they do not require precompiled jars and run directly from Groovy scripts.
Running bigtop's integration tests
To test binary compatibility with particular versions of Hadoop ecosystem components, or to run other integration tests, you can use the original Bigtop tests, which live in the bigtop-tests/test-artifacts directory.
This Maven project contains a battery of JUnit based tests that are built as jar files, compiled against a specific Hadoop version, and executed using the bigtop-tests/test-execution project.
They can be a good way to do fine-grained integration testing of your Bigtop based Hadoop installation.
Make sure that you have a recent version of Maven installed (3.3.3 or later).
Make sure that you have the following defined in your environment:
Given the on-going issues with Apache Jenkins builds you might need to deploy everything locally:
Start test execution:
[OPTIONAL] If you want to run a specific class of test:
[OPTIONAL] If you want to run a specific test in a class:
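As a hedged sketch of the two optional steps, the helper below only prints a Maven command line and does not run anything. It assumes the standard maven-failsafe-plugin filter property `it.test`, which accepts `Class` or `Class#method`; whether a particular Bigtop pom honors it is an assumption. The function name `it_test_command`, the pom path, and the method name `testDFSIO` are hypothetical placeholders.

```shell
#!/bin/sh
# Print (do not run) a Maven command that selects one test class and,
# optionally, one method within it.
# -Dit.test is the standard maven-failsafe-plugin filter property; that the
# Bigtop poms honor it is an assumption of this sketch.
it_test_command() {
    pom=$1
    class=$2
    method=${3:-}
    filter=$class
    if [ -n "$method" ]; then
        filter="$class#$method"
    fi
    printf 'mvn -f %s verify -Dit.test=%s\n' "$pom" "$filter"
}

# Hypothetical pom path and test names, for illustration only.
it_test_command path/to/pom.xml TestDFSIO
it_test_command path/to/pom.xml TestDFSIO testDFSIO
```

Copy the printed line into your shell once the pom path and test names match your checkout.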
Cluster Failure Tests
The purpose of these tests is to check whether MapReduce jobs still complete when nodes of the cluster running the job are made to fail. While the cluster failures are applied, the MapReduce job should complete with no issues. If MapReduce jobs fail as a result of any of the cluster failure tests, the user may not have a functional cluster or MapReduce implementation.
Cluster failures are handled by three classes - ServiceKilledFailure.groovy, ServiceRestartFailure.groovy, and NetworkShutdownFailure.groovy.
We will call the functionality of these classes "cluster failures." The cluster failures extend an abstract class, AbstractFailure.groovy, which is a Runnable. Each of these runnable classes executes specific shell commands that purposely fail part of a cluster. When a cluster failure is executed, it calls the function populateCommandsList(), which fills the data structures failCommands and restoreCommands with values pertaining to that cluster failure. The values include a shell string such as "sudo pkill -9 -f %s" and the host on which to run the command. From these, shell commands are generated and executed. Note: the host can be specified when instantiating the cluster failure or configured in /resources/vars.properties.
ServiceKilledFailure will execute commands that will kill a specified service.
ServiceRestartFailure will execute commands that will stop and start a service.
NetworkShutdownFailure will execute a series of commands that restart the network.
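To illustrate how a template such as "sudo pkill -9 -f %s" and a host could be turned into a concrete command, here is a minimal shell sketch. It is not the Groovy implementation: the helper names `build_fail_command` and `build_remote_command`, the host name, and the use of ssh to reach the host are all assumptions made for illustration. The script only prints the command and never executes it.

```shell
#!/bin/sh
# Template taken from the text above; %s is replaced by the service name.
FAIL_TEMPLATE='sudo pkill -9 -f %s'

# Fill the template with a service name (hypothetical helper).
build_fail_command() {
    # The template is deliberately used as the printf format string.
    printf "$FAIL_TEMPLATE" "$1"
}

# Combine a host and a service into one command line.
# Reaching the host over ssh is an assumption of this sketch.
build_remote_command() {
    host=$1
    service=$2
    printf 'ssh %s "%s"' "$host" "$(build_fail_command "$service")"
}

# Dry run: print the command for a hypothetical host and service.
build_remote_command node1.example.com datanode
echo
```

A restore command would be built the same way from a second template, mirroring the split between failCommands and restoreCommands.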
Two other classes that must be mentioned are FailureVars.groovy and FailureExecutor.groovy. FailureVars, when instantiated, loads configuration from /resources/vars.properties to prepare for failing the cluster. The configuration dictates which cluster failures will be executed, along with a variety of timing options. See "How to run cluster failure tests" for more information. FailureExecutor is the main driver that creates and runs the cluster failure threads (these threads run in parallel to Hadoop and MapReduce jobs). The sequence of execution is as follows:
- FailureVars configures all variables that are necessary for the cluster failures.
- For the configuration of FailureVars, see the properties file associated with it (in FailureVars.groovy).
- FailureExecutor then spawns and executes the cluster failure threads.
- Each thread then runs its respective shell commands on the hosts specified by the user.
How to run cluster failure tests
Since the cluster failures are all runnable, the user only has to instantiate the objects and execute them in the tests being run. To run cluster failures in parallel to Hadoop and MapReduce jobs and test for job completion, the user must use FailureVars and FailureExecutor. Suppose we want to run cluster failures while a MapReduce test such as TestDFSIO is running:
The first step is to create a FailureVars object inside TestDFSIO.groovy, before the test is run.
The next step is to insert code that spawns and starts a FailureExecutor thread inside the test body of TestDFSIO.
- Now the user only has to execute the test. When the test is run, the cluster failures run in parallel to the MapReduce test.
- To configure the hosts and the various timing options, open /resources/vars.properties. There you can specify the hosts, which cluster failures to run, and when the cluster failures start. You can also specify the time between cluster failures and how long services stay killed before being brought back up. Refer to the /bigtop/bigtop-test-framework/README for more information on vars.properties.
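The way a cluster failure runs alongside a job can be pictured with a small shell sketch. It does not use FailureVars or FailureExecutor and touches no cluster: `run_job` and `inject_failure` are hypothetical stand-ins that only sleep and print, and the two arguments mimic the "start delay" and "how long a service stays killed" timing options described above.

```shell
#!/bin/sh
# Stand-in for a long-running MapReduce test such as TestDFSIO.
run_job() {
    sleep 2
    echo "job finished"
}

# Stand-in for one cluster failure: wait for the start delay, "fail" a
# service, keep it down for the given duration, then "restore" it.
inject_failure() {
    start_delay=$1
    kill_duration=$2
    sleep "$start_delay"
    echo "fail: a service would be stopped here"
    sleep "$kill_duration"
    echo "restore: the service would be started again here"
}

inject_failure 0 1 &     # the failure runs in the background
failure_pid=$!
run_job                  # the job runs in the foreground, in parallel
job_status=$?
wait "$failure_pid"      # make sure the restore step has completed
echo "job exit status: $job_status"
```

The point of the pattern is the final check: the job is expected to finish with a zero exit status even though a failure was injected while it was running.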
There is also a special kind of test designed to validate the packages and find bugs in them before they are deployed. The source code of these tests can be found in bigtop-tests/test-artifacts/package. Before you can run the tests, you have to specify the test suite that you want to use. You can pick from the following list:
(you can open the corresponding class implementations to see how they differ from each other).
As a first step, pick TestPackagesBasicsWithRM. With that in mind, your final command line is going to look something like:
$ mvn clean verify -f bigtop-tests/test-execution/package/pom.xml -Dorg.apache.bigtop.itest.log4j.level=TRACE -Dlog4j.debug=true -Dorg.apache.maven-failsafe-plugin.testInclude="**/TestPackagesBasicsWithRM.*" -Dbigtop.repo.file.url=http://xxxxxxx
The last two -D settings are for the name of the test suite and for the URL of the repo file describing the repo with your packages.
Things to keep in mind
- If you want to select a subset of tests, you can use -Dorg.apache.maven-failsafe-plugin.testInclude='**/Mask*'. For example: mvn verify -Dorg.apache.maven-failsafe-plugin.testInclude='**/TestHDFSBalancer*'
- It is helpful to add -Dorg.apache.bigtop.itest.log4j.level=TRACE to your mvn verify command.
- These tests are not currently executed via our smoke tests, which remain a separate testing package.