HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.
- The Thrift interface definition language (IDL) for HiveServer2 is available at https://github.com/apache/hive/blob/trunk/service/if/TCLIService.thrift.
- Thrift documentation is available at http://thrift.apache.org/docs/.
This document describes how to set up the server. How to use a client with this server is described in the HiveServer2 Clients document.
Introduced in Hive version 0.11. See HIVE-2935.
How to Configure
Configuration Properties in the
Optional Environment Settings
Running in HTTP mode
Starting with Hive 0.13, HiveServer2 supports sending Thrift RPC messages over HTTP transport (HIVE-4752). This is particularly useful for supporting a proxying intermediary between the client and the server (for example, for load balancing or security reasons). Currently, you can run HiveServer2 in either TCP mode or HTTP mode, but not both. For the corresponding JDBC URL, see the JDBC URL section of the HiveServer2 Clients document. Use the following settings to enable HTTP mode:
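As a minimal sketch, the Python snippet below renders a hive-site.xml fragment that switches the server to HTTP mode. The property names (hive.server2.transport.mode, hive.server2.thrift.http.port, hive.server2.thrift.http.path) and the values 10001 and cliservice are assumptions based on the Hive 0.13 configuration; check them against your release. The helper render_properties is hypothetical and exists only for illustration.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render name/value pairs as a Hadoop-style <configuration> fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed HTTP-mode settings; the port and path are illustrative values.
http_mode = {
    "hive.server2.transport.mode": "http",
    "hive.server2.thrift.http.port": 10001,
    "hive.server2.thrift.http.path": "cliservice",
}

print(render_properties(http_mode))
```

The printed fragment can be merged by hand into an existing hive-site.xml. Because the server runs in only one transport mode at a time, clients must switch to the matching HTTP JDBC URL once this is applied.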
Optional Environment Settings
How to Start
The --help option displays a usage message, for example:
HiveServer2 supports Anonymous (no authentication), Kerberos, pass-through LDAP, Pluggable Custom Authentication and Pluggable Authentication Modules (supported from Hive 0.13 onwards).
By default, HiveServer2 performs the query processing as the user who submitted the query. But if the following parameter is set to false, the query will run as the user that the hiveserver2 process runs as.
To prevent memory leaks in unsecure mode, disable file system caches by setting the following parameters to true:
Changes in HIVE-4911, which is available in Hive 0.12, enable integrity protection and confidentiality protection (beyond just the default of authentication) for communication between the Hive JDBC driver and HiveServer2. You can use the SASL QOP property to configure this.
- This applies only when Kerberos is used for HS2 client (JDBC/ODBC application) authentication with HiveServer2.
- hive.server2.thrift.sasl.qop in hive-site.xml has to be set to one of the valid QOP values ('auth', 'auth-int' or 'auth-conf').
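The sketch below shows one way to check a QOP value before writing it to hive-site.xml. The helper name qop_property is hypothetical. The descriptions of each level follow the standard SASL meaning of these tokens and are not taken from the Hive source.

```python
# Valid values for hive.server2.thrift.sasl.qop, with their usual SASL meaning.
VALID_QOP = {
    "auth": "authentication only",
    "auth-int": "authentication plus integrity protection",
    "auth-conf": "authentication plus integrity and confidentiality protection",
}


def qop_property(qop):
    """Return the (name, value) pair for hive-site.xml, rejecting invalid values."""
    if qop not in VALID_QOP:
        raise ValueError(
            "invalid SASL QOP %r; expected one of: %s"
            % (qop, ", ".join(sorted(VALID_QOP)))
        )
    return ("hive.server2.thrift.sasl.qop", qop)


name, value = qop_property("auth-conf")
print("%s = %s (%s)" % (name, value, VALID_QOP[value]))
```

Rejecting unknown values early avoids a server that starts with a silently misconfigured protection level.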
Changes in HIVE-5351, which will be available in Hive 0.13, provide support for SSL encryption. To enable it, set the following configurations in
Pluggable Authentication Modules (PAM)
HIVE-6466, which will be available in Hive 0.13, provides support for PAM. To configure PAM:
- Download the JPAM native library for the relevant architecture.
- Unzip and copy libjpam.so to a directory (<libjpam-directory>) on the system.
- Add the directory to the LD_LIBRARY_PATH environment variable like so: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<libjpam-directory>
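Before continuing, it can help to confirm that the library is actually visible on the library path. The following is a minimal sketch with a hypothetical helper, find_libjpam, that only inspects the file system. It does not load the library or verify that it matches the machine architecture.

```python
import os


def find_libjpam(ld_library_path=None):
    """Return the first LD_LIBRARY_PATH directory containing libjpam.so, or None."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    for directory in ld_library_path.split(os.pathsep):
        # Skip empty entries produced by leading or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, "libjpam.so")):
            return directory
    return None


location = find_libjpam()
if location is None:
    print("libjpam.so not found on LD_LIBRARY_PATH")
else:
    print("libjpam.so found in " + location)
```

Run it in the same shell environment that starts HiveServer2, since LD_LIBRARY_PATH is read from the current process environment.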
Finally, set the following configurations in
Python Client Driver
A Python client driver for HiveServer2 is available at https://github.com/BradRuderman/pyhs2 (thanks, Brad). It includes all the required packages such as SASL and Thrift wrappers.
To use the pyhs2 driver:
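The following sketch is modeled on the usage shown in the pyhs2 README. The host, port, user, password and database values are placeholders, the PLAIN mechanism assumes a server without Kerberos, and the wrapper functions list_databases and run_query are hypothetical names introduced for illustration. It needs the pyhs2 package and a running HiveServer2 instance to do anything useful.

```python
def list_databases(host="localhost", port=10000, user="hive",
                   password="", database="default"):
    """Connect with pyhs2 and return the databases visible to the user."""
    import pyhs2  # third-party driver, imported lazily

    with pyhs2.connect(host=host, port=port, authMechanism="PLAIN",
                       user=user, password=password,
                       database=database) as conn:
        with conn.cursor() as cur:
            return cur.getDatabases()


def run_query(hql, host="localhost", port=10000, user="hive",
              password="", database="default"):
    """Execute one HiveQL statement and return (schema, rows)."""
    import pyhs2  # third-party driver, imported lazily

    with pyhs2.connect(host=host, port=port, authMechanism="PLAIN",
                       user=user, password=password,
                       database=database) as conn:
        with conn.cursor() as cur:
            cur.execute(hql)
            schema = cur.getSchema()  # column information for the result
            rows = cur.fetch()        # result rows
            return schema, rows
```

For example, run_query("select * from some_table") returns the column schema and the fetched rows, where some_table stands for any table in the chosen database.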
You can discuss this driver on the email@example.com mailing list.