We are adding tools to support non-functional e2e tests on a real cluster. The goal is to continuously add scale, stress, and longevity tests that cover areas our current e2e tests do not.

Create Synthetic Data in a Real Cluster

  1. Create the jar file:

    mvn clean install -DskipTests=true -Ptest-tools
  2. Set up the environment:

    export HIVE_CONF_DIR=/etc/hive/conf
    export HIVE_LIB=/usr/lib/hive
    export HADOOP_CONF_DIR=/etc/hadoop/conf.cloudera.YARN-1
    export HADOOP_CLASSPATH=${HIVE_LIB}/lib/*:${HADOOP_CLASSPATH}
    export HADOOP_CLASSPATH=${HIVE_CONF_DIR}/*:${HADOOP_CLASSPATH}
    export HADOOP_CLASSPATH=${HADOOP_CONF_DIR}/*:${HADOOP_CLASSPATH}
    export HADOOP_OPTS="$HADOOP_OPTS -Dhive.server2.thrift.bind.host=ay-s3a-1.vpc.cloudera.com -Dsentry.e2e.hive.keytabs.location=/cdep/keytabs -Dsentry.scale.test.config.path=/root/apache-sentry/sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools"
  3. Create a test config file similar to sentry/sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/tools/sentry_scale_test_config.xml.

  4. Run the test tool to create synthetic data (${TEST_ROOT} is the directory that contains the sentry source tree):

    hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -s
  5. To clean up previously created data:

    hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -c
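
Because steps 4 and 5 depend on the variables exported in step 2, and on TEST_ROOT, which step 2 does not set, a small pre-flight check can catch a missing variable before the tool is launched. The following is only a sketch: check_test_env is a hypothetical helper, not part of the Sentry test tools, and it only checks that each variable is non-empty, not that the paths exist.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Sentry): verify that the
# variables used in steps 2, 4 and 5 are set before running the tool.
check_test_env() {
    missing=0
    for name in HIVE_CONF_DIR HIVE_LIB HADOOP_CONF_DIR HADOOP_CLASSPATH TEST_ROOT; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    # Returns 0 when every variable is set, 1 otherwise.
    return "$missing"
}
```

It could then be chained in front of either command, for example:

    check_test_env && hadoop jar ${TEST_ROOT}/sentry/sentry-tests/sentry-tests-hive/target/sentry-tests-hive-1.8.0-SNAPSHOT-tests.jar -s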