We are adding tools to support non-functional e2e testing on a real cluster. The goal is to run scale, stress, and longevity tests that cover aspects our current e2e tests do not.
Create Synthetic Data in a Real Cluster
Create the jar file:
mvn clean install -DskipTests=true -Ptest-tools
Set up the environment:
export HIVE_CONF_DIR=/etc/hive/conf
export HIVE_LIB=/usr/lib/hive
export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.YARN-1
export HADOOP_CLASSPATH=${HIVE_LIB}/lib/*:${HADOOP_CLASSPATH}
export HADOOP_CLASSPATH=${HIVE_CONF_DIR}/*:${HADOOP_CLASSPATH}
export HADOOP_CLASSPATH=${HADOOP_CONF_DIR}/*:${HADOOP_CLASSPATH}
export HADOOP_OPTS="$HADOOP_OPTS -Dhive.server2.thrift.bind.host=ay-s3a-1.vpc.cloudera.com -Dsentry.e2e.hive.keytabs.location=/cdep/keytabs -Dsentry.scale.test.config.path=/root/apache-sentry/sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools"
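Because the host names and directories above are specific to one cluster, it may help to confirm the variables are usable before running the tool. The following is a minimal sketch, not part of the Sentry test tools: the function name `check_env` is hypothetical, and it assumes that `HIVE_CONF_DIR`, `HIVE_LIB`, and `HADOOP_CONF_DIR` should each name an existing directory.

```shell
# Hypothetical helper, not part of the Sentry test tools.
# Prints one line per problem and returns 1 if any variable is unset
# or does not name an existing directory; returns 0 otherwise.
check_env() {
  missing=0
  for name in HIVE_CONF_DIR HIVE_LIB HADOOP_CONF_DIR; do
    # Look up the variable whose name is stored in $name (POSIX-safe indirection).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name"
      missing=1
    elif [ ! -d "$value" ]; then
      echo "not a directory: $name=$value"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_env && echo "environment looks usable"` after the exports gives a quick sanity check; it does not validate the classpath entries or the keytab location.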
Create a test config file similar to sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools/sentry_scale_test_config.xml.
Run the test tool to create synthetic data (${TEST_ROOT} is the directory that contains the sentry checkout):
hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -s
To clean up previously created data:
hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -c
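The two invocations differ only in the final flag, so they can be wrapped in one small function. This is a sketch under stated assumptions: the function name `scale_tool`, the subcommand names `create` and `cleanup`, and the `HADOOP_CMD` override are all hypothetical and exist only to make the wrapper easy to dry-run; the jar path and the `-s` / `-c` flags are the ones shown above.

```shell
# Hypothetical wrapper, not part of the Sentry test tools.
# "create" runs the tool with -s, "cleanup" runs it with -c.
scale_tool() {
  case "${1:-}" in
    create)  flag=-s ;;
    cleanup) flag=-c ;;
    *) echo "usage: scale_tool create|cleanup" >&2; return 2 ;;
  esac
  # Abort with a message if TEST_ROOT has not been set.
  jar="${TEST_ROOT:?TEST_ROOT must be set}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar"
  # HADOOP_CMD is an assumed override; set it to "echo" to print the
  # command line instead of running it.
  ${HADOOP_CMD:-hadoop} jar "$jar" "$flag"
}
```

For example, `HADOOP_CMD=echo scale_tool create` prints the arguments that would be passed to hadoop without touching the cluster, which is a cheap way to check the jar path before a long run.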