Impala's tests depend on a significant number of test databases. This page aims to provide an introduction and some tips for working with this test data.

...

During development it is sometimes useful to load a new table, or reload a modified one, without redoing the whole data load. Incremental loads are often possible with bin/load-data.py. Note that the Impala minicluster must be running before load-data.py is executed, so run $IMPALA_HOME/bin/start-impala-cluster.py first.

Code Block
languagebash
# Reload a specific table for specific file formats.
# -f forces reloading the table even if it exists.
./bin/load-data.py -f -w functional-query --table_names=decimal_rtf_tiny_tbl --table_formats=text/none,kudu/none --exploration_strategy=exhaustive


# Load any missing tables from the functional data set (which is used by the functional-query workload)
# Omitting -f means that data is not reloaded if the script detects that it is present.
# We specify exhaustive because exhaustive tables are always used for the functional data set.
./bin/load-data.py -w functional-query --exploration_strategy=exhaustive


# Reload all versions of the TPC-H nation table for "core" file formats.
./bin/load-data.py -w tpch --table_names=nation -f
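
When the same reload is repeated often, it can help to generate the command line rather than retype it. The following is a minimal sketch, not part of the Impala tree: the function name reload_table_cmds and its argument order are hypothetical, and it uses only the flags that appear in the examples above. It prints the commands instead of running them, so the output can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not shipped with Impala): prints the
# commands for a forced reload of a single table. Nothing is executed.
# Usage: reload_table_cmds WORKLOAD TABLE [FORMATS] [STRATEGY]
reload_table_cmds() {
  local workload="$1" table="$2" formats="${3:-}" strategy="${4:-}"

  # The minicluster has to be started before load-data.py is run.
  printf '%s\n' '$IMPALA_HOME/bin/start-impala-cluster.py'

  # -f forces the reload even if the table already exists.
  local cmd="./bin/load-data.py -f -w ${workload} --table_names=${table}"
  if [ -n "${formats}" ]; then
    cmd="${cmd} --table_formats=${formats}"
  fi
  if [ -n "${strategy}" ]; then
    cmd="${cmd} --exploration_strategy=${strategy}"
  fi
  printf '%s\n' "${cmd}"
}

# Print the commands for two of the reloads shown above.
reload_table_cmds tpch nation
reload_table_cmds functional-query decimal_rtf_tiny_tbl text/none,kudu/none exhaustive
```

The printed load-data.py lines use a relative ./bin path, so they are meant to be run from $IMPALA_HOME, as in the examples above.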


...