...
Option 1 - Building Impala (for developing Impala)
This option also loads all the test data into HDFS, Kudu and HBase, for all formats (text/parquet/orc) and compressions (none/gzip/snappy etc.), which usually takes hours. If you only need some simple test data (e.g. the TPCH data set), try Option 2 below.
| Code Block |
|---|
git clone https://gitbox.apache.org/repos/asf/impala.git ~/Impala
cd ~/Impala
export IMPALA_HOME=`pwd`
./bin/bootstrap_development.sh |
Option 2 - Building Impala without Test Data (for manually testing Impala)
If you already have test data on hand, or you only need a subset of the test data (e.g. TPCH tables in parquet format), use this option.
| Code Block |
|---|
git clone https://gitbox.apache.org/repos/asf/impala.git ~/Impala
cd ~/Impala
export IMPALA_HOME=`pwd`
./bin/bootstrap_system.sh
source ./bin/impala-config.sh
# Format the test cluster and start Impala and dependent services
./buildall.sh -noclean -notests -format -start_minicluster -start_impala_cluster
# Load some test data (skip this if you don't need them)
# Example 1: Load functional test data in parquet format. Note that you need the text tables since parquet tables are generated from them.
bin/create_testdata.sh
bin/load-data.py -e core -w functional-query --table_formats=text/none,parquet/none
# Example 2: Load TPCH test data in parquet format.
bin/load-data.py -e core -w tpch --table_formats=text/none,parquet/none
# Example 3: Load TPCDS test data in parquet format.
bin/load-data.py -e core -w tpcds --table_formats=text/none,parquet/none |
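The three load examples differ only in the workload name, so the pattern can be sketched as a small helper. This is an illustration, not part of the Impala scripts: `build_load_cmd` is a hypothetical function that only prints the `bin/load-data.py` command line, using the same flags as the examples above. It always includes `text/none`, following the note that parquet tables are generated from the text tables.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Impala): print the load-data.py
# command for one workload and one target table format.
#   $1 = workload, e.g. tpch, tpcds, functional-query
#   $2 = target format, e.g. parquet/none
build_load_cmd() {
  local workload="$1"
  local format="$2"
  # text/none is always listed first, mirroring the examples above.
  local formats="text/none"
  if [ "$format" != "text/none" ]; then
    formats="${formats},${format}"
  fi
  echo "bin/load-data.py -e core -w ${workload} --table_formats=${formats}"
}

# Dry run: show the commands for two workloads without executing them.
for workload in tpch tpcds; do
  build_load_cmd "$workload" "parquet/none"
done
```

Printing the commands first lets you review them before running them from `$IMPALA_HOME` in a shell where `bin/impala-config.sh` has been sourced.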
Rebuilding after initial build
...