We are adding tools to support non-functional e2e tests on a real cluster. The goal is to continuously add scale, stress, and longevity tests that cover areas our current e2e tests do not.
Create Synthetic Data in a Real Cluster
Create the jar file:
mvn clean install -DskipTests=true -Ptest-tools
Set up the environment:
export HIVE_CONF_DIR=/etc/hive/conf
export HIVE_LIB=/usr/lib/hive
export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.YARN-1
export HADOOP_CLASSPATH=${HIVE_LIB}/lib/*:${HADOOP_CLASSPATH}
export HADOOP_CLASSPATH=${HIVE_CONF_DIR}/*:${HADOOP_CLASSPATH}
export HADOOP_CLASSPATH=${HADOOP_CONF_DIR}/*:${HADOOP_CLASSPATH}
export HADOOP_OPTS="$HADOOP_OPTS -Dhive.server2.thrift.bind.host=ay-s3a-1.vpc.cloudera.com -Dsentry.e2e.hive.keytabs.location=/cdep/keytabs -Dsentry.scale.test.config.path=/root/apache-sentry/sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools"
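Because the paths above are specific to one cluster, it may help to confirm they exist before running the tool. The following is a minimal sketch, not part of the Sentry test tools: check_env_dirs is a hypothetical helper name, and it only verifies that each named variable is set and points to an existing directory.

```shell
# Hypothetical helper (not part of Sentry): report any named variable that is
# unset or does not point to an existing directory.
# Returns non-zero if at least one variable fails the check.
check_env_dirs() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    return $status
}

# Example call: silent when all three variables name existing directories.
check_env_dirs HIVE_CONF_DIR HIVE_LIB HADOOP_CONF_DIR \
    || echo "fix the environment before running the test tool"
```

The check does not validate HADOOP_CLASSPATH or HADOOP_OPTS, since those depend on the cluster layout.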
Create a test config file similar to sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools/sentry_scale_test_config.xml.
Run the test tool to create synthetic data:
hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -s
To clean up previously created data:
hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -c
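The two invocations above differ only in the final flag, so they can be wrapped in one small function. This is a sketch under stated assumptions: run_tool, tool_flag, JAR_REL, and DRY_RUN are hypothetical names introduced here, and TEST_ROOT is assumed to be set to the directory that contains the sentry checkout, as the commands above imply. Only the -s and -c flags come from the tool itself.

```shell
# Hypothetical wrapper (not part of Sentry) around the two commands above.
# Jar location relative to TEST_ROOT, copied from the commands above.
JAR_REL="sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar"

# Map a mode name to the tool flag: -s creates data, -c cleans it up.
tool_flag() {
    case "$1" in
        create)  echo "-s" ;;
        cleanup) echo "-c" ;;
        *)       echo "usage: run_tool create|cleanup" >&2; return 1 ;;
    esac
}

run_tool() {
    flag=$(tool_flag "$1") || return 1
    # Abort with a message if TEST_ROOT is unset or empty.
    jar="${TEST_ROOT:?TEST_ROOT must be set}/${JAR_REL}"
    # DRY_RUN=1 prints the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hadoop jar $jar $flag"
    else
        hadoop jar "$jar" "$flag"
    fi
}
```

With DRY_RUN=1 exported, run_tool create prints the command it would run, which is a cheap way to check the jar path before touching the cluster.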