See Bootstrapping an Impala Development Environment From Scratch for up-to-date, regularly tested steps to set up your development environment.
The information on this page is stale, but may be useful for adventurous people who want to set up a dev environment manually from scratch.
Java
Download and install a Java 7 or Java 8 JDK. Either the Oracle JDK or OpenJDK should work for development.
On Ubuntu 14.04:
sudo apt-get install openjdk-7-jdk
On Ubuntu 16.04:
sudo apt-get install openjdk-8-jdk
The OpenJDK website has tips for other distributions too: http://openjdk.java.net/install/
Required packages
On Ubuntu 14.04
sudo apt-get install git build-essential cmake bison flex pkg-config libsasl2-dev autoconf automake libtool maven subversion doxygen libbz2-dev zlib1g-dev python-pip python-setuptools python-dev libssl-dev libboost-all-dev postgresql liblzo2-dev lzop -y
sudo pip install allpairs pytest pytest-xdist paramiko texttable prettytable sqlparse psutil==0.7.1 pywebhdfs gitpython jenkinsapi boto3
On CentOS 6.5
sudo yum groupinstall "Development Tools"
sudo yum -y install git ant libevent-devel automake libtool flex bison gcc-c++ openssl-devel make cmake doxygen.x86_64 glib-devel python-devel bzip2-devel svn libevent-devel krb5-workstation openldap-devel db4-devel python-setuptools python-pip cyrus-sasl* postgresql postgresql-server ant-nodeps lzo-devel lzop
sudo pip-python install allpairs pytest pytest-xdist paramiko texttable prettytable sqlparse psutil==0.7.1 pywebhdfs gitpython jenkinsapi boto3
Configuring PostgreSQL
If you are installing Impala on a fresh machine, you'll need to initialize PostgreSQL. On CentOS 6.5 this can be done by running:
sudo service postgresql initdb
You need to make a configuration change to allow HBase and the Hive metastore to function correctly. Edit the following file as root.
On Ubuntu 14.04 and 16.04
/etc/postgresql/*/main/pg_hba.conf
On CentOS 6.5
/var/lib/pgsql/data/pg_hba.conf
In the following lines at the end of the file, change `peer` or `ident` to `trust`.
# Database administrative login by UNIX sockets
local   all   all   ident
# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
# "local" is for Unix domain socket connections only
local   all   all   ident
# IPv4 local connections:
host    all   all   127.0.0.1/32   md5
# IPv6 local connections:
host    all   all   ::1/128   md5
To make Postgres aware of these changes, either restart the service or run:
pg_ctl reload
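The manual edit described above could also be scripted. The following is a minimal sketch, not part of the Impala tooling: the function name `set_trust_auth` is hypothetical, and it assumes the authentication method is the last field on each `local` or `host` line. It keeps a `.bak` copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper: switch peer/ident to trust in a pg_hba.conf file.
# Assumes the method is the last field on each "local" or "host" line.
set_trust_auth() {
    conf_file=$1
    # -i.bak edits in place and keeps the original as <file>.bak.
    # Comment lines start with "#", so the ^(local|host) anchor skips them.
    sed -E -i.bak \
        's/^((local|host)[[:space:]].*[[:space:]])(peer|ident)[[:space:]]*$/\1trust/' \
        "$conf_file"
}

# Example (run as root; path as on CentOS 6.5):
#   set_trust_auth /var/lib/pgsql/data/pg_hba.conf
```

Lines that already use `md5` or `trust` are left unchanged, so running the function a second time has no further effect.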
If the script fails to start postgresql due to a missing snakeoil SSL cert, do:
sudo make-ssl-cert generate-default-snakeoil
Creating the Hive metastore user
sudo -u postgres psql postgres
Then, at the `postgres` command prompt:
CREATE ROLE hiveuser LOGIN PASSWORD 'password';
ALTER ROLE hiveuser WITH CREATEDB;
Maven 3
On some older systems you may need to download Maven 3 from https://maven.apache.org/ and install it manually:
tar xvf apache-maven-3.0.5-bin.tar.gz && sudo mv apache-maven-3.0.5 /usr/local
Environment variables
Put these in your `.bashrc` or elsewhere:
export JAVA_HOME=/usr/lib/jvm/<your java version>
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
export LC_ALL="en_US.UTF-8"
# If you installed maven manually:
export M2_HOME=/usr/local/apache-maven-3.0.5
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
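Because the `JAVA_HOME` value has to be filled in by hand, a quick sanity check can catch a wrong path early. This is a sketch only: `check_java_home` is a hypothetical helper, and it assumes that a usable JDK is one with an executable `bin/javac` under `JAVA_HOME`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if JAVA_HOME points at a JDK (has bin/javac).
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/javac" ]; then
        echo "No javac under $JAVA_HOME/bin; is this a JDK?" >&2
        return 1
    fi
    return 0
}

# Example: check_java_home && echo "JAVA_HOME looks usable"
```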
Add a path for HDFS domain sockets
sudo mkdir /var/lib/hadoop-hdfs/
sudo chown <user> /var/lib/hadoop-hdfs/
Start local ssh server
sudo service ssh start
Enable password-less SSH for HBase
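ssh-keygen -t dsa
# Do not type in any passphrase. Just press enter.
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
# Note: OpenSSH 7.0 and later disable DSA keys by default; there, use -t rsa and ~/.ssh/id_rsa.pub instead.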
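Running the `cat ... >>` line more than once appends a duplicate entry each time, and sshd (with its default `StrictModes` setting) ignores `authorized_keys` when the file or its directory is writable by other users. A minimal sketch of an idempotent variant follows; `authorize_key` is a hypothetical helper name, and it assumes the public key file holds a single line.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file only if
# it is not already present, and tighten permissions for sshd's StrictModes.
authorize_key() {
    pub_key=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    # -x matches whole lines, -F treats the key as a fixed string.
    if ! grep -qxF -- "$(cat "$pub_key")" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Example: authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
```

Afterwards, `ssh -o BatchMode=yes localhost true` should succeed without prompting if password-less login works.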
Set up FQDN to point to loopback
You may also want to modify your /etc/hosts file so that your fully qualified domain name points to the loopback device. Modify /etc/hosts so that it includes these lines (with your host name substituted as appropriate).
127.0.0.1 localhost
127.0.0.1 <Your-host-name> <Your-host-name>.ca.cloudera.com
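To confirm that the hosts file contains the mapping, something like the following sketch could be used. `hosts_maps_to_loopback` is a hypothetical helper; it only inspects the file it is given and does not query the system resolver.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME appears on a 127.0.0.1 line of HOSTS_FILE.
hosts_maps_to_loopback() {
    hosts_file=$1
    name=$2
    awk -v name="$name" '
        { sub(/#.*/, "") }                 # drop comments before matching
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit !found }                # exit status 0 only when found
    ' "$hosts_file"
}

# Example: hosts_maps_to_loopback /etc/hosts "$(hostname -f)" && echo "FQDN maps to loopback"
```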
Set up NTP for Kudu
On Ubuntu
sudo apt-get install ntp ntpdate
sudo service ntp start
On CentOS 7
sudo yum install ntp
sudo systemctl start ntpd
Make sure your clock is set correctly. If it is far off, NTP will only adjust it little by little; see `man ntpdate` for a possible fix.
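One way to check whether ntpd has synchronized is to look at the peer list from `ntpq -pn`, where a leading `*` marks the peer currently selected for synchronization. The function below is a sketch that reads that output on standard input; `ntp_has_sync_peer` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the ntpq peer list on stdin contains a
# line starting with "*", i.e. a peer selected for synchronization.
ntp_has_sync_peer() {
    grep -q '^\*'
}

# Example: ntpq -pn | ntp_has_sync_peer && echo "clock is synchronized"
```

It can take several minutes after ntpd starts before a peer is selected, so a failing check right after startup is not necessarily a problem.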