This page describes the different clients supported by HiveServer2.

Version

Introduced in Hive version 0.11. See HIVE-2935.

Beeline – New Command Line Shell

HiveServer2 supports a new command shell, Beeline. It is a JDBC client based on the SQLLine CLI (http://sqlline.sourceforge.net/). The detailed SQLLine documentation applies to Beeline as well.

The Beeline shell works in both embedded mode and remote mode. In embedded mode, it runs an embedded Hive (similar to Hive CLI), whereas remote mode is for connecting to a separate HiveServer2 process over Thrift. Starting in Hive 0.14, when Beeline is used with HiveServer2, it also prints the HiveServer2 log messages for the queries it executes to STDERR.

In remote mode HiveServer2 only accepts valid Thrift calls – even in HTTP mode, the message body contains Thrift payloads.

Beeline Example

% bin/beeline
Hive version 0.11.0-SNAPSHOT by Apache
beeline> !connect jdbc:hive2://localhost:10000 scott tiger org.apache.hive.jdbc.HiveDriver
!connect jdbc:hive2://localhost:10000 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:10000
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000> show tables;
show tables;
+-------------------+
|     tab_name      |
+-------------------+
| primitives        |
| src               |
| src1              |
| src_json          |
| src_sequencefile  |
| src_thrift        |
| srcbucket         |
| srcbucket2        |
| srcpart           |
+-------------------+
9 rows selected (1.079 seconds)

Beeline with NoSASL connection

If you'd like to connect via NOSASL mode, you must specify the authentication mode explicitly:

% bin/beeline
beeline> !connect jdbc:hive2://<host>:<port>/<db>;auth=noSasl hiveuser pass org.apache.hive.jdbc.HiveDriver

Beeline Command Options

The Beeline CLI supports these command line options:

Option	Description
-u <database URL>	The JDBC URL to connect to. Usage: `beeline -u db_URL`
-n <username>	The username to connect as. Usage: `beeline -n valid_user`
-p <password>	The password to connect with. Usage: `beeline -p valid_password`
-d <driver class>	The driver class to use. Usage: `beeline -d driver_class`
-e <query>	Query that should be executed. This option can be specified multiple times. Usage: `beeline -e "query_string"` Only a single command per `-e` option is supported; you can't provide multiple semicolon-separated commands in one option. To run several commands, use the `-e` option multiple times. Bug fix (null pointer exception): 0.13.0 (HIVE-5765). Bug to be fixed (running `-e` in background): workaround available (HIVE-6758). Bug fix (--headerInterval not honored): 0.14.0 (HIVE-7647).
-f <file>	Script file that should be executed. Usage: `beeline -f filepath` Version: 0.12.0 (HIVE-4268). Note: If the script contains tabs, query compilation fails in version 0.12.0. This bug is fixed in version 0.13.0 (HIVE-6359). Bug to be fixed (running `-f` in background): workaround available (HIVE-6758).
--hiveconf property=value	Use value for the given configuration property. Properties that are listed in hive.conf.restricted.list cannot be reset with hiveconf (see Restricted List and Whitelist). Usage: `beeline --hiveconf prop1=value1` Version: 0.13.0 (HIVE-6173)
--hivevar name=value	Hive variable name and value. This is a Hive-specific setting in which variables can be set at the session level and referenced in Hive commands or queries. Usage: `beeline --hivevar var1=value1`
--color=[true/false]	Control whether color is used for display. Default is false. Usage: `beeline --color=true`
--showHeader=[true/false]	Show column names in query results (true) or not (false). Default is true. Usage: `beeline --showHeader=false`
--headerInterval=ROWS	The interval for redisplaying column headers, in number of rows, when outputformat is table. Default is 100. Usage: `beeline --headerInterval=50`
--fastConnect=[true/false]	When connecting, skip building a list of all tables and columns for tab-completion of HiveQL statements (true) or build the list (false). Default is true. Usage: `beeline --fastConnect=false`
--autoCommit=[true/false]	Enable/disable automatic transaction commit. Default is false. Usage: `beeline --autoCommit=true`
--verbose=[true/false]	Show verbose error messages and debug information (true) or do not show (false). Default is false. Usage: `beeline --verbose=true`
--showWarnings=[true/false]	Display warnings that are reported on the connection after issuing any HiveQL commands. Default is false. Usage: `beeline --showWarnings=true`
--showNestedErrs=[true/false]	Display nested errors. Default is false. Usage: `beeline --showNestedErrs=true`
--numberFormat=[pattern]	Format numbers using a DecimalFormat pattern. Usage: `beeline --numberFormat="#,###,##0.00"`
--force=[true/false]	Continue running script even after errors (true) or do not continue (false). Default is false. Usage: `beeline --force=true`
--maxWidth=MAXWIDTH	The maximum width to display before truncating data, in characters, when outputformat is table. Default is to query the terminal for current width, then fall back to 80. Usage: `beeline --maxWidth=150`
--maxColumnWidth=MAXCOLWIDTH	The maximum column width, in characters, when outputformat is table. Default is 15. Usage: `beeline --maxColumnWidth=25`
--silent=[true/false]	Reduce the amount of informational messages displayed (true) or not (false). It also stops displaying the log messages for the query from HiveServer2 (Hive 0.14 and later). Default is false. Usage: `beeline --silent=true`
--autosave=[true/false]	Automatically save preferences (true) or do not autosave (false). Default is false. Usage: `beeline --autosave=true`
--outputformat=[table/vertical/csv/tsv/dsv/csv2/tsv2]	Format mode for result display. Default is table. See Separated-Value Output Formats below for a description of the recommended SV options. Usage: `beeline --outputformat=tsv`
--delimiterForDSV=DELIMITER	The delimiter for the delimiter-separated values (DSV) output format. Default is the '|' character.
--isolation=LEVEL	Set the transaction isolation level to TRANSACTION_READ_COMMITTED or TRANSACTION_SERIALIZABLE. See the "Field Detail" section in the Java Connection documentation. Usage: `beeline --isolation=TRANSACTION_SERIALIZABLE`
--nullemptystring=[true/false]	Use historic behavior of printing null as empty string (true) or use current behavior of printing null as NULL (false). Default is false. Usage: `beeline --nullemptystring=false` Version: 0.13.0 (HIVE-4485)
--incremental=[true/false]	Print output incrementally.
--help	Display a usage message. Usage: `beeline --help`

Separated-Value Output Formats

Starting with Hive 0.14, improved SV formats are available, namely DSV, CSV2 and TSV2. These follow the standard CSV convention, which adds quotes only when a cell contains special characters, such as the delimiter character or a quote character, or when it spans multiple lines. The three formats differ only in the delimiter between cells, which is a comma for CSV2, a tab for TSV2, and configurable for DSV (delimiterForDSV property).

CSV and TSV are maintained for backward compatibility. Contrary to this convention, they add single-quote characters around all values.
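
The difference between the two quoting conventions can be sketched in a few lines of Java. This is only an illustration of the behavior described above, not Beeline's implementation: the class `SvQuoting` and its methods are hypothetical, and the use of a double quote as the quote character (doubled when it appears inside a cell) is an assumption taken from common CSV practice.

```java
// Hypothetical illustration of the two quoting conventions; not Beeline code.
final class SvQuoting {

  // Standard convention (assumed here to use double quotes): quote a cell only
  // if it contains the delimiter, a quote character, or a line break.
  static String quoteIfNeeded(String cell, char delimiter) {
    boolean special = cell.indexOf(delimiter) >= 0
        || cell.indexOf('"') >= 0
        || cell.indexOf('\n') >= 0
        || cell.indexOf('\r') >= 0;
    if (!special) {
      return cell;
    }
    // Embedded quotes are doubled, then the whole cell is wrapped in quotes.
    return "\"" + cell.replace("\"", "\"\"") + "\"";
  }

  // Legacy convention: wrap every value in single quotes.
  static String quoteAlways(String cell) {
    return "'" + cell + "'";
  }

  // Joins one row of cells using either convention.
  static String joinRow(String[] cells, char delimiter, boolean legacy) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < cells.length; i++) {
      if (i > 0) {
        sb.append(delimiter);
      }
      sb.append(legacy ? quoteAlways(cells[i]) : quoteIfNeeded(cells[i], delimiter));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String[] row = {"1", "foo,bar", "plain"};
    System.out.println(joinRow(row, ',', false)); // prints 1,"foo,bar",plain
    System.out.println(joinRow(row, ',', true));  // prints '1','foo,bar','plain'
  }
}
```

With the standard convention only the cell containing the delimiter is quoted, which keeps the output readable by most CSV parsers. The legacy branch does not model how embedded single quotes are handled.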

JDBC

HiveServer2 has a new JDBC driver. It supports both embedded and remote access to HiveServer2.

Connection URLs

Connection URL for Remote or Embedded Mode

The JDBC connection URL format has the prefix jdbc:hive2:// and the Driver class is org.apache.hive.jdbc.HiveDriver. Note that this is different from the old HiveServer.

For a remote server, the URL format is jdbc:hive2://<host>:<port>/<db> (default port for HiveServer2 is 10000).
For an embedded server, the URL format is jdbc:hive2:// (no host or port).

Connection URL When HiveServer2 Is Running in HTTP Mode

JDBC connection URL: jdbc:hive2://<host>:<port>/<db>?hive.server2.transport.mode=http;hive.server2.thrift.http.path=<http_endpoint>, where:

<http_endpoint> is the corresponding HTTP endpoint configured in hive-site.xml. Default value is cliservice.
Default port for HTTP transport mode is 10001
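
The URL formats above can be assembled with plain string concatenation. The following is a minimal sketch; the class `HiveUrls` and its method names are hypothetical helpers introduced for illustration and are not part of the Hive JDBC driver. It does no escaping or validation of its arguments.

```java
// Hypothetical helper for building HiveServer2 JDBC URLs; not part of Hive.
final class HiveUrls {
  static final int DEFAULT_BINARY_PORT = 10000; // default HiveServer2 port
  static final int DEFAULT_HTTP_PORT = 10001;   // default port for HTTP transport mode

  // Embedded mode: no host or port.
  static String embedded() {
    return "jdbc:hive2://";
  }

  // Remote mode: jdbc:hive2://<host>:<port>/<db>
  static String remote(String host, int port, String db) {
    return "jdbc:hive2://" + host + ":" + port + "/" + db;
  }

  // HTTP mode: the transport settings follow the '?' separator.
  static String http(String host, int port, String db, String httpEndpoint) {
    return remote(host, port, db)
        + "?hive.server2.transport.mode=http"
        + ";hive.server2.thrift.http.path=" + httpEndpoint;
  }

  public static void main(String[] args) {
    System.out.println(embedded());
    System.out.println(remote("localhost", DEFAULT_BINARY_PORT, "default"));
    // prints jdbc:hive2://localhost:10000/default
    System.out.println(http("localhost", DEFAULT_HTTP_PORT, "default", "cliservice"));
  }
}
```

The resulting string is what would be passed to `DriverManager.getConnection` or to Beeline's `!connect` command.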

Connection URL When SSL Is Enabled in HiveServer2

JDBC connection URL: jdbc:hive2://<host>:<port>/<db>;ssl=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>, where:

<trust_store_path> is the path where client's truststore file lives.
<trust_store_password> is the password to access the truststore.

In HTTP mode: jdbc:hive2://<host>:<port>/<db>;ssl=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>?hive.server2.transport.mode=http;hive.server2.thrift.http.path=<http_endpoint>.
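
In both SSL formats the `;key=value` SSL settings come directly after the database name, and the `?`-prefixed transport settings come last. A small sketch makes that ordering explicit; `HiveSslUrls` and its methods are hypothetical names used only for illustration, and the example paths and password are placeholders.

```java
// Hypothetical helper showing the order of URL parts when SSL is enabled; not part of Hive.
final class HiveSslUrls {

  // Binary (default) transport with SSL: SSL settings follow the database name.
  static String ssl(String host, int port, String db,
                    String trustStorePath, String trustStorePassword) {
    return "jdbc:hive2://" + host + ":" + port + "/" + db
        + ";ssl=true"
        + ";sslTrustStore=" + trustStorePath
        + ";trustStorePassword=" + trustStorePassword;
  }

  // HTTP transport with SSL: the '?' section is appended after the SSL settings.
  static String sslOverHttp(String host, int port, String db,
                            String trustStorePath, String trustStorePassword,
                            String httpEndpoint) {
    return ssl(host, port, db, trustStorePath, trustStorePassword)
        + "?hive.server2.transport.mode=http"
        + ";hive.server2.thrift.http.path=" + httpEndpoint;
  }

  public static void main(String[] args) {
    // Placeholder truststore path and password.
    System.out.println(ssl("localhost", 10000, "default", "/path/to/truststore.jks", "secret"));
    System.out.println(sslOverHttp("localhost", 10001, "default",
        "/path/to/truststore.jks", "secret", "cliservice"));
  }
}
```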

Using JDBC

You can use JDBC to access data stored in a relational database or other tabular format.

Load the HiveServer2 JDBC driver.

For example:

Class.forName("org.apache.hive.jdbc.HiveDriver");

Connect to the database by creating a Connection object with the JDBC driver.

For example:
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "<password>");
```
The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode.
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
```
In Kerberos secure mode, the user information is based on the Kerberos credentials.
Submit SQL to the database by creating a Statement object and using its executeQuery() method.

For example:
```
Statement stmt = cnct.createStatement();
ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");
```
Process the result set, if necessary.

These steps are illustrated in the sample code below.

JDBC Client Sample Code

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveJdbcClient {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // the Hive JDBC driver is not on the classpath
      e.printStackTrace();
      System.exit(1);
    }
    //replace "hive" here with the name of the user the queries should run as
    Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
    Statement stmt = con.createStatement();
    String tableName = "testHiveDriverTable";
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");
    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    if (res.next()) {
      System.out.println(res.getString(1));
    }
    // describe table
    sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1) + "\t" + res.getString(2));
    }

    // load data into table
    // NOTE: filepath has to be local to the hive server
    // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
    String filepath = "/tmp/a.txt";
    sql = "load data local inpath '" + filepath + "' into table " + tableName;
    System.out.println("Running: " + sql);
    stmt.execute(sql);

    // select * query
    sql = "select * from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }

    // regular hive query
    sql = "select count(1) from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }
  }
}
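
The sample above never closes its Connection, Statement, or ResultSet. In longer-lived applications these would normally be released explicitly. One way to do that, sketched here under the assumption of Java 7 or later, is a try-with-resources block. The class `HiveTableLister` is a hypothetical helper, not part of Hive, and it assumes the driver has already been loaded as in the sample.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: lists table names and always releases JDBC resources.
final class HiveTableLister {

  // Returns the table names reported by "show tables" for the given connection URL.
  static List<String> listTables(String url, String user) throws SQLException {
    List<String> names = new ArrayList<>();
    // Resources are closed in reverse order (ResultSet, Statement, Connection),
    // even if executeQuery() or next() throws.
    try (Connection con = DriverManager.getConnection(url, user, "");
         Statement stmt = con.createStatement();
         ResultSet res = stmt.executeQuery("show tables")) {
      while (res.next()) {
        names.add(res.getString(1));
      }
    }
    return names;
  }
}
```

A caller would load the driver with `Class.forName("org.apache.hive.jdbc.HiveDriver")` and then call, for example, `HiveTableLister.listTables("jdbc:hive2://localhost:10000/default", "hive")`. If no registered driver accepts the URL, `DriverManager.getConnection` throws an `SQLException`.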

Running the JDBC Sample Code

# Then on the command-line
$ javac HiveJdbcClient.java

# To run the program using remote hiveserver in non-kerberos mode, we need the following jars in the classpath
# from hive/build/dist/lib
#     hive-jdbc*.jar
#     hive-service*.jar
#     libfb303-0.9.0.jar
#     libthrift-0.9.0.jar
#     log4j-1.2.16.jar
#     slf4j-api-1.6.1.jar
#     slf4j-log4j12-1.6.1.jar
#     commons-logging-1.0.4.jar
#
#
# To run the program using kerberos secure mode, we need the following jars in the classpath 
#     hive-exec*.jar
#     commons-configuration-1.6.jar
#  and from hadoop
#     hadoop-core*.jar
#
# To run the program in embedded mode, we need the following additional jars in the classpath
# from hive/build/dist/lib
#     hive-exec*.jar
#     hive-metastore*.jar
#     antlr-runtime-3.0.1.jar
#     derby.jar
#     jdo2-api-2.1.jar
#     jpox-core-1.2.2.jar
#     jpox-rdbms-1.2.2.jar
# and from hadoop/build
#     hadoop-core*.jar
# as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set, 
# and hadoop jars necessary to run MR jobs (eg lzo codec)

$ java -cp $CLASSPATH HiveJdbcClient

Alternatively, you can run the following bash script, which will seed the data file and build your classpath before invoking the client. The script adds all the additional jars needed for using HiveServer2 in embedded mode as well.

#!/bin/bash
HADOOP_HOME=/your/path/to/hadoop
HIVE_HOME=/your/path/to/hive

echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt

HADOOP_CORE=$(ls $HADOOP_HOME/hadoop-core*.jar)
CLASSPATH=.:$HIVE_HOME/conf:$(hadoop classpath)

for i in ${HIVE_HOME}/lib/*.jar ; do
    CLASSPATH=$CLASSPATH:$i
done

java -cp $CLASSPATH HiveJdbcClient

JDBC Data Types

The following table lists the data types implemented for HiveServer2 JDBC.

Hive Type	Java Type	Specification
TINYINT	byte	signed or unsigned 1-byte integer
SMALLINT	short	signed 2-byte integer
INT	int	signed 4-byte integer
BIGINT	long	signed 8-byte integer
FLOAT	double	single-precision number (approximately 7 digits)
DOUBLE	double	double-precision number (approximately 15 digits)
DECIMAL	java.math.BigDecimal	fixed-precision decimal value
BOOLEAN	boolean	a single bit (0 or 1)
STRING	String	character string or variable-length character string
TIMESTAMP	java.sql.Timestamp	date and time value
BINARY	String	binary data
Complex Types
ARRAY	String – json encoded	values of one data type
MAP	String – json encoded	key-value pairs
STRUCT	String – json encoded	structured values

JDBC Client Setup for a Secure Cluster

When connecting to HiveServer2 with Kerberos authentication, the URL format is:

jdbc:hive2://<host>:<port>/<db>;principal=<Server_Principal_of_HiveServer2>

The client needs to have a valid Kerberos ticket in the ticket cache before connecting.

NOTE: If you don't have a "/" after the port number, the JDBC driver does not parse the hostname and ends up running HS2 in embedded mode. So if you are specifying a hostname, make sure you have a "/" or "/<dbname>" after the port number.
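
One way to avoid the missing-slash pitfall is to build the URL in a single place that always emits the "/" after the port. The sketch below illustrates the idea; `HiveKerberosUrls` is a hypothetical helper, not part of the Hive JDBC driver, and the host and principal in `main` are placeholder values.

```java
// Hypothetical helper, not part of Hive: builds a Kerberos JDBC URL that
// always contains the "/" after the port, with or without a database name.
final class HiveKerberosUrls {

  static String kerberos(String host, int port, String db, String serverPrincipal) {
    // A null database name still produces the trailing "/" required by the driver.
    String dbPart = (db == null) ? "" : db;
    return "jdbc:hive2://" + host + ":" + port + "/" + dbPart
        + ";principal=" + serverPrincipal;
  }

  public static void main(String[] args) {
    // Placeholder host and principal.
    System.out.println(kerberos("hs2.example.com", 10000, "default",
        "hive/hs2.example.com@EXAMPLE.COM"));
    System.out.println(kerberos("hs2.example.com", 10000, null,
        "hive/hs2.example.com@EXAMPLE.COM"));
    // second line prints jdbc:hive2://hs2.example.com:10000/;principal=hive/hs2.example.com@EXAMPLE.COM
  }
}
```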

In the case of LDAP, CUSTOM or PAM authentication, the client needs to pass a valid user name and password to the JDBC connection API.

To use sasl.qop, add it to the sessionconf part of your Hive JDBC connection string, for example:

jdbc:hive2://hostname/dbname;sasl.qop=auth-int

For more information, see Setting Up HiveServer2.

Multi-User Scenarios and Programmatic Login to Kerberos KDC

In the current approach of using Kerberos, you need to have a valid Kerberos ticket in the ticket cache before connecting. This entails a static login (using kinit, a keytab, or the ticket cache) and a restriction of one Kerberos user per client. These restrictions limit the usage in middleware systems and other multi-user scenarios, and in scenarios where the client wants to log in programmatically to the Kerberos KDC.

One way to mitigate the problem of multi-user scenarios is with secure proxy users (see HIVE-5155). Starting in Hive 0.13.0, support for secure proxy users has two components:

Delegation token based connection for Oozie (OOZIE-1457). This is the common mechanism for Hadoop ecosystem components.
Direct proxy access for privileged Hadoop users (HIVE-5155). This enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has Hadoop level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user.

The other way is to use a pre-authenticated Kerberos Subject (see HIVE-6486). In this method, starting with Hive 0.13.0 the Hive JDBC client can use a pre-authenticated subject to authenticate to HiveServer2. This enables a middleware system to run queries as the user running the client.

Using Kerberos with a Pre-Authenticated Subject

To use a pre-authenticated subject you will need the following changes.

Add hive-exec*.jar to the classpath in addition to the regular Hive JDBC jars (commons-configuration-1.6.jar and hadoop-core*.jar are not required).
Add the auth=kerberos and kerberosAuthType=fromSubject JDBC URL properties in addition to the "principal" URL property.
Open the connection in Subject.doAs().

The following code snippet illustrates the usage (refer to HIVE-6486 for a complete test case):

static Connection getConnection(Subject signedOnUserSubject) throws Exception {
    // Requires javax.security.auth.Subject, java.security.PrivilegedExceptionAction,
    // java.sql.Connection, java.sql.DriverManager and java.sql.SQLException.
    return Subject.doAs(signedOnUserSubject, new PrivilegedExceptionAction<Connection>() {
        public Connection run() {
            Connection con = null;
            String JDBC_DB_URL = "jdbc:hive2://HiveHost:10000/default;"
                + "principal=hive/localhost.localdomain@EXAMPLE.COM;"
                + "auth=kerberos;kerberosAuthType=fromSubject";
            try {
                // load the HiveServer2 JDBC driver
                Class.forName("org.apache.hive.jdbc.HiveDriver");
                // open the connection as the pre-authenticated subject
                con = DriverManager.getConnection(JDBC_DB_URL);
            } catch (SQLException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
            // null if the driver could not be loaded or the connection failed
            return con;
        }
    });
}

Python Client

A Python client driver is available on GitHub. For installation instructions, see Setting Up HiveServer2: Python Client Driver.

Ruby Client

A Ruby client driver is available on GitHub at https://github.com/forward3d/rbhive.

Integration with SQuirrel SQL Client

Download, install and start the SQuirrel SQL Client from the SQuirrel SQL website.
Select 'Drivers -> New Driver...' to register Hive's JDBC driver that works with HiveServer2.
1. Enter the driver name and example URL:
```
   Name: Hive
   Example URL: jdbc:hive2://localhost:10000/default
```
Select 'Extra Class Path -> Add' to add the following jars from your local Hive and Hadoop distribution.
```
   HIVE_HOME/build/dist/lib/*.jar
   HADOOP_HOME/hadoop-*-core.jar 
```
Select 'List Drivers'. This will cause SQuirrel to parse your jars for JDBC drivers and might take a few seconds. From the 'Class Name' input box select the Hive driver for working with HiveServer2:
```
   org.apache.hive.jdbc.HiveDriver
```
Click 'OK' to complete the driver registration.
Select 'Aliases -> Add Alias...' to create a connection alias to your HiveServer2 instance.
1. Give the connection alias a name in the 'Name' input box.
2. Select the Hive driver from the 'Driver' drop-down.
3. Modify the example URL as needed to point to your HiveServer2 instance.
4. Enter 'User Name' and 'Password' and click 'OK' to save the connection alias.
5. To connect to HiveServer2, double-click the Hive alias and click 'Connect'.

When the connection is established you will see errors in the log console and might get a warning that the driver is not JDBC 3.0 compatible. These alerts are due to yet-to-be-implemented parts of the JDBC metadata API and can safely be ignored. To test the connection enter SHOW TABLES in the console and click the run icon.

Also note that when a query is running, support for the 'Cancel' button is not yet available.
