...

Option 1 - Building Impala (for developing Impala)

This option also loads all of the test data: into HDFS, Kudu, and HBase, for every file format (text/parquet/orc) and every compression codec (none/gzip/snappy, etc.). Loading usually takes hours. If you only need some simple test data (e.g. the TPCH data set), use Option 2 below.

Code Block
git clone https://gitbox.apache.org/repos/asf/impala.git ~/Impala
cd ~/Impala
export IMPALA_HOME="$(pwd)"
./bin/bootstrap_development.sh
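
The scripts in both options are run from the checkout directory that IMPALA_HOME points to. As a minimal sketch, a guard like the one below could be run first to catch a wrong directory early. The function name check_impala_home is hypothetical and not part of Impala; the only assumption about the checkout is that it contains bin/impala-config.sh, the file sourced in Option 2.

```shell
# Hypothetical helper (not part of the Impala tree): succeed only if the given
# directory, or $IMPALA_HOME when no argument is passed, looks like an Impala
# checkout. The check is simply that bin/impala-config.sh exists there.
check_impala_home() {
  dir="${1:-${IMPALA_HOME:-}}"
  if [ -z "$dir" ]; then
    echo "IMPALA_HOME is not set" >&2
    return 1
  fi
  if [ ! -f "$dir/bin/impala-config.sh" ]; then
    echo "$dir does not contain bin/impala-config.sh" >&2
    return 1
  fi
  return 0
}
```

For example, `check_impala_home && echo "checkout looks fine"` prints the message only when the check passes.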

Option 2 - Building Impala without Test Data (for manually testing Impala)

If you already have test data, or you only need a subset of the test data (e.g. the TPCH tables in parquet format), use this option.

Code Block
git clone https://gitbox.apache.org/repos/asf/impala.git ~/Impala
cd ~/Impala
export IMPALA_HOME="$(pwd)"
./bin/bootstrap_system.sh
source ./bin/impala-config.sh
# Format the test cluster and start Impala and dependent services
./buildall.sh -noclean -notests -format -start_minicluster -start_impala_cluster

# Load some test data (skip this step if you don't need it)
# Example 1: Load the functional test data in parquet format.
# The text tables must be loaded too, because the parquet tables are generated from them.
bin/create_testdata.sh
bin/load-data.py -e core -w functional-query --table_formats=text/none,parquet/none
# Example 2: Load TPCH test data in parquet format.
bin/load-data.py -e core -w tpch --table_formats=text/none,parquet/none
# Example 3: Load TPCDS test data in parquet format.
bin/load-data.py -e core -w tpcds --table_formats=text/none,parquet/none
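
The three examples above differ only in the workload name; the table formats are a comma-separated list of format/compression pairs. As a rough sketch, a small helper can assemble that command line so other combinations are easy to write down. The function build_load_cmd is hypothetical and not part of Impala; it only prints the command, using the same flags as the examples above, and does not run it.

```shell
# Hypothetical helper (not part of the Impala tree): print the load-data.py
# command for one workload and a list of table formats.
# Usage: build_load_cmd WORKLOAD FORMAT [FORMAT...]
build_load_cmd() {
  workload="$1"
  shift
  # Join the remaining arguments with commas, e.g. text/none,parquet/none.
  # The IFS change is confined to the command substitution's subshell.
  formats="$(IFS=,; printf '%s' "$*")"
  printf 'bin/load-data.py -e core -w %s --table_formats=%s\n' "$workload" "$formats"
}

# Print the commands for the TPCH and TPCDS workloads from Examples 2 and 3.
for w in tpch tpcds; do
  build_load_cmd "$w" text/none parquet/none
done
```

The printed lines match Examples 2 and 3, so they can be reviewed and then run from the checkout directory.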

Rebuilding after initial build

...