  • Step 1: Configure the operating system following the instructions in Appendix 1.

  • Step 2: Install the runtime dependency libraries following the instructions in Appendix 2.
  • Step 3: Install the Apache Hadoop RPM packages using Bigtop.

    • Download the Bigtop repo file to /etc/yum.repos.d/bigtop.repo from a mirror listed at https://www.apache.org/dyn/closer.lua/bigtop/bigtop-1.2.1/repos/

      No Format
      sudo wget -O /etc/yum.repos.d/bigtop.repo http://ftp.jaist.ac.jp/pub/apache/bigtop/bigtop-1.2.1/repos/centos7/bigtop.repo


    • Install the Hadoop components, including Hadoop, Hive, and HBase. Note that the bigtop-tomcat version needs to be >= 6.0.

      No Format
      sudo yum install hadoop\* hive\* hbase\*


    • Install bigtop-tomcat if the Ranger plugin service is enabled. The version needs to be 6.x.

      No Format
      sudo yum install bigtop-tomcat


    • Initialize Hadoop and start it.

      No Format
      export JAVA_HOME=/path/to/java/home   # replace with your Java home directory
      export PATH=${JAVA_HOME}/bin:$PATH
       
      sudo /etc/init.d/hadoop-hdfs-namenode init
      sudo service hadoop-hdfs-namenode start
      sudo service hadoop-hdfs-datanode start
       
      sudo -u hdfs hdfs dfs -mkdir /hawq_default
      sudo -u hdfs hdfs dfs -chown gpadmin /hawq_default


    • Install Ranger 0.6 from source code, following the instructions in "Apache Ranger 0.5.0 Installation".

  • Step 4: Get the HAWQ/PXF/Ranger RPM packages.

    • Download the RPM binary release tarball from the Apache HAWQ binary repository

      No Format
      wget -O apache-hawq-rpm-%{hawq_version}-incubating.tar.gz https://dist.apache.org/repos/dist/release/hawq/%{hawq_version}-incubating/apache-hawq-rpm-%{hawq_version}-incubating.tar.gz


    • Or copy it from your source tree if you built it yourself

      No Format
      cp $HAWQ_SOURCE_DIR/contrib/hawq-package/apache-hawq-rpm-%{hawq_version}-incubating.tar.gz .
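
The %{hawq_version} placeholder in the Step 4 commands has to be replaced by the release you actually want. A minimal sketch of one way to do that in the shell is shown here. The variable HAWQ_VERSION and the two helper functions are hypothetical names introduced only for illustration; they are not part of HAWQ, and the URL pattern is simply the one used in the wget command of Step 4.

```shell
# Hypothetical helpers (not part of HAWQ) that fill in %{hawq_version}.

# Print the tarball file name for the version given as the first argument.
hawq_tarball_name() {
    printf 'apache-hawq-rpm-%s-incubating.tar.gz\n' "$1"
}

# Print the download URL, following the pattern used in Step 4.
hawq_tarball_url() {
    printf 'https://dist.apache.org/repos/dist/release/hawq/%s-incubating/%s\n' \
        "$1" "$(hawq_tarball_name "$1")"
}

# Example (uncomment after setting HAWQ_VERSION to the release you need):
# wget -O "$(hawq_tarball_name "$HAWQ_VERSION")" "$(hawq_tarball_url "$HAWQ_VERSION")"
```

Keeping the version in a single variable means the file name and the URL cannot drift apart when you switch releases.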


  • Step 5: Install HAWQ/PXF/Ranger from the RPM packages.

    • Extract apache-hawq-rpm-%{hawq_version}-incubating.tar.gz and change to the directory 'hawq_rpm_packages'

      No Format
      tar -xzvf apache-hawq-rpm-%{hawq_version}-incubating.tar.gz; cd hawq_rpm_packages


    • Install the RPM packages in the following order

      No Format
      sudo rpm -ivh apache-hawq-2.24.0.0-el7.x86_64.rpm
      sudo rpm -ivh hawq-ranger-plugin-2.24.0.0-1.el7.centos.noarch.rpm
      sudo rpm -ivh apache-tomcat-7.0.62-el6.noarch.rpm
      sudo rpm -ivh pxf-service-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-hdfs-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-hive-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-hbase-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-jdbc-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-json-3.23.10.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-ignite-3.23.0.0-1.el6.noarch.rpm
      sudo rpm -ivh pxf-3.3.0.0-1.el6.noarch.rpm


    • Initialize the HAWQ cluster as the gpadmin user

      No Format
      su gpadmin
       
      source /usr/local/apache-hawq/greenplum_path.sh
       
      # Create password-less SSH connection between nodes in the cluster
      hawq ssh-exkeys -h localhost
       
      # Initialize HAWQ cluster
      hawq init cluster
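
Because the RPM packages in Step 5 have to be installed in a fixed order, it can help to preview the sequence before running it. The sketch below assumes a hypothetical helper named install_in_order and a hypothetical DRY_RUN variable; neither is part of HAWQ. It only wraps the same "sudo rpm -ivh" command used in Step 5.

```shell
# install_in_order is a hypothetical helper, not part of HAWQ.
# It installs each RPM given as an argument, in order, and stops at the
# first failure. With DRY_RUN=1 it only prints the commands it would run.
install_in_order() {
    for pkg in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo rpm -ivh $pkg"
        else
            sudo rpm -ivh "$pkg" || { echo "failed on $pkg" >&2; return 1; }
        fi
    done
}

# Example: preview the first two packages from the Step 5 list.
DRY_RUN=1 install_in_order \
    apache-hawq-2.24.0.0-el7.x86_64.rpm \
    hawq-ranger-plugin-2.24.0.0-1.el7.centos.noarch.rpm
```

Stopping at the first failure keeps later packages from being installed on top of a missing dependency.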


  • Appendix 1: Configure operating system.
    • Configure system control parameters: edit /etc/sysctl.conf to set the parameters below, then run "sudo sysctl -p" to apply the changes.

      No Format
      kernel.shmmax = 1000000000
      kernel.shmmni = 4096
      kernel.shmall = 4000000000
      kernel.sem = 250 512000 100 2048
      kernel.sysrq = 1
      kernel.core_uses_pid = 1
      kernel.msgmnb = 65536
      kernel.msgmax = 65536
      kernel.msgmni = 2048
      net.ipv4.tcp_syncookies = 0
      net.ipv4.conf.default.accept_source_route = 0
      net.ipv4.tcp_tw_recycle = 1
      net.ipv4.tcp_max_syn_backlog = 200000
      net.ipv4.conf.all.arp_filter = 1
      net.ipv4.ip_local_port_range = 1281 65535
      net.core.netdev_max_backlog = 200000
      vm.overcommit_memory = 2
      fs.nr_open = 3000000
      kernel.threads-max = 798720
      kernel.pid_max = 798720
      # increase network
      net.core.rmem_max=2097152
      net.core.wmem_max=2097152
      fs.file-max=3000000


    • Configure system limits: edit /etc/security/limits.conf to add the parameters below in exactly the same order, then log in again for them to take effect.

      No Format
      * soft nofile 2900000
      * hard nofile 2900000
      * soft nproc 131072
      * hard nproc 131072


  • Appendix 2: Install runtime dependency libraries.

    No Format
    # Install epel
    # For CentOS 7:
    sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
    # For RHEL 7:
    sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
     
    # For RHEL 7, enable rhel-7-server-extras-rpms and rhel-7-server-optional-rpms channel in /etc/yum.repos.d/redhat.repo
    sudo yum install -y libgsasl thrift boost protobuf java-1.7.0-openjdk-devel python-pip krb5-server krb5-libs json-c net-snmp-libs python-devel gcc snappy
    
    # Install pycrypto (requires gcc and python-devel, both included in the yum command above)
    sudo pip --retries=50 --timeout=300 install pycrypto
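
To confirm that the runtime dependencies from Appendix 2 are in place, you could query the RPM database for each package. This is only a sketch: missing_packages is a hypothetical helper name, and it assumes that "rpm -q" exits with a non-zero status for packages that are not installed.

```shell
# missing_packages is a hypothetical helper, not part of HAWQ.
# It prints every package name from its arguments that "rpm -q"
# does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example: check the packages from the yum command in Appendix 2.
missing_packages libgsasl thrift boost protobuf java-1.7.0-openjdk-devel \
    python-pip krb5-server krb5-libs json-c net-snmp-libs python-devel gcc snappy
```

An empty result means every listed package was found; any names printed still need to be installed.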