This page describes the different clients supported by HiveServer2.
Introduced in Hive version 0.11. See HIVE-2935.
HiveServer2 supports a new command shell, Beeline. It is a JDBC client based on the SQLLine CLI (http://sqlline.sourceforge.net/). The detailed SQLLine documentation applies to Beeline as well. Beeline works in both embedded and remote mode. In embedded mode it runs an embedded Hive (similar to the Hive CLI), whereas in remote mode it connects to a separate HiveServer2 process over Thrift.
Example:

    % bin/beeline
    Hive version 0.11.0-SNAPSHOT by Apache
    beeline> !connect jdbc:hive2://localhost:10000 scott tiger org.apache.hive.jdbc.HiveDriver
    !connect jdbc:hive2://localhost:10000 scott tiger org.apache.hive.jdbc.HiveDriver
    Connecting to jdbc:hive2://localhost:10000
    Connected to: Hive (version 0.10.0)
    Driver: Hive (version 0.10.0-SNAPSHOT)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    0: jdbc:hive2://localhost:10000> show tables;
    show tables;
    +-------------------+
    |     tab_name      |
    +-------------------+
    | primitives        |
    | src               |
    | src1              |
    | src_json          |
    | src_sequencefile  |
    | src_thrift        |
    | srcbucket         |
    | srcbucket2        |
    | srcpart           |
    +-------------------+
    9 rows selected (1.079 seconds)
HiveServer2 has a new JDBC driver that supports both embedded and remote access to HiveServer2.
The JDBC connection URL has the prefix jdbc:hive2:// and the driver class is org.apache.hive.jdbc.HiveDriver. Note that both differ from the old HiveServer, which used the prefix jdbc:hive:// and the driver class org.apache.hadoop.hive.jdbc.HiveDriver.
For a remote server, the URL format is jdbc:hive2://<host>:<port>/<db> (the default port for HiveServer2 is 10000).
For an embedded server, the URL format is jdbc:hive2:// (no host or port).
You can use JDBC to access data stored in a relational database or other tabular format.
Load the HiveServer2 JDBC driver:

    Class.forName("org.apache.hive.jdbc.HiveDriver");
Connect to the database by creating a Connection object with the JDBC driver:

    Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "<password>");
The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode:

    Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
Submit SQL to the database by creating a Statement object and using its executeQuery() method:

    Statement stmt = cnct.createStatement();
    ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");
These steps are illustrated in the sample code below.

    import java.sql.SQLException;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.sql.DriverManager;

    public class HiveJdbcClient {
      private static String driverName = "org.apache.hive.jdbc.HiveDriver";

      /**
       * @param args
       * @throws SQLException
       */
      public static void main(String[] args) throws SQLException {
        try {
          Class.forName(driverName);
        } catch (ClassNotFoundException e) {
          // TODO Auto-generated catch block
          e.printStackTrace();
          System.exit(1);
        }
        //replace "hive" here with the name of the user the queries should run as
        Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");

        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
          System.out.println(res.getString(1));
        }

        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // load data into table
        // NOTE: filepath has to be local to the hive server
        // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
        String filepath = "/tmp/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);

        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }

        // regular hive query
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1));
        }
      }
    }
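The sample client never closes its Connection, Statement, or ResultSet. As a minimal sketch of the same connect, create-statement, execute-query sequence with try-with-resources, something like the following could be used. The class name HiveQuerySketch and the method firstColumn are illustrative names, not part of Hive, and the sketch assumes the driver jar registers itself with DriverManager; if it does not, call Class.forName on the driver class first.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; the class and method names are hypothetical.
public class HiveQuerySketch {

    /**
     * Runs one query and returns the first column of every row as a String.
     * The connection, statement, and result set are closed automatically,
     * even if the query fails part-way through.
     */
    public static List<String> firstColumn(String url, String user, String sql)
            throws SQLException {
        List<String> values = new ArrayList<>();
        // The password is ignored in non-secure mode, so an empty string is passed.
        try (Connection con = DriverManager.getConnection(url, user, "");
             Statement stmt = con.createStatement();
             ResultSet res = stmt.executeQuery(sql)) {
            while (res.next()) {
                values.add(res.getString(1));
            }
        }
        return values;
    }
}
```

Given a running HiveServer2 on localhost, a call such as HiveQuerySketch.firstColumn("jdbc:hive2://localhost:10000/default", "hive", "show tables") would return the table names. Any connection or query failure surfaces as a SQLException.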

    # Then on the command-line
    $ javac HiveJdbcClient.java

    # To run the program using remote hiveserver in non-kerberos mode, we need the following jars in the classpath
    # from hive/build/dist/lib
    #     hive-jdbc*.jar
    #     hive-service*.jar
    #     libfb303-0.9.0.jar
    #     libthrift-0.9.0.jar
    #     log4j-1.2.16.jar
    #     slf4j-api-1.6.1.jar
    #     slf4j-log4j12-1.6.1.jar
    #     commons-logging-1.0.4.jar
    #
    #
    # Following additional jars are needed for the kerberos secure mode -
    #     hive-exec*.jar
    #     commons-configuration-1.6.jar
    #  and from hadoop - hadoop-*core.jar

    # To run the program in embedded mode, we need the following additional jars in the classpath
    # from hive/build/dist/lib
    #     hive-exec*.jar
    #     hive-metastore*.jar
    #     antlr-runtime-3.0.1.jar
    #     derby.jar
    #     jdo2-api-2.1.jar
    #     jpox-core-1.2.2.jar
    #     jpox-rdbms-1.2.2.jar
    #
    # from hadoop/build
    #     hadoop-*-core.jar
    # as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set, and hadoop jars necessary to run MR jobs (eg lzo codec)

    $ java -cp $CLASSPATH HiveJdbcClient

    # Alternatively, you can run the following bash script, which will seed the data file
    # and build your classpath before invoking the client. The script adds all the
    # additional jars needed for using HiveServer2 in embedded mode as well.

    #!/bin/bash
    HADOOP_HOME=/your/path/to/hadoop
    HIVE_HOME=/your/path/to/hive

    echo -e '1\x01foo' > /tmp/a.txt
    echo -e '2\x01bar' >> /tmp/a.txt

    HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`
    CLASSPATH=.:$HIVE_HOME/conf:`hadoop classpath`

    for i in ${HIVE_HOME}/lib/*.jar ; do
        CLASSPATH=$CLASSPATH:$i
    done

    java -cp $CLASSPATH HiveJdbcClient
The following table lists the data types implemented for HiveServer2 JDBC.
Hive Type | Java Type | Specification |
---|---|---|
TINYINT | byte | signed 1-byte integer |
SMALLINT | short | signed 2-byte integer |
INT | int | signed 4-byte integer |
BIGINT | long | signed 8-byte integer |
FLOAT | double | single-precision number (approximately 7 digits) |
DOUBLE | double | double-precision number (approximately 15 digits) |
DECIMAL | java.math.BigDecimal | fixed-precision decimal value |
BOOLEAN | boolean | a single bit (0 or 1) |
STRING | String | character string or variable-length character string |
TIMESTAMP | java.sql.Timestamp | date and time value |
BINARY | String | binary data |
Complex Types | | |
ARRAY | String (JSON encoded) | values of one data type |
MAP | String (JSON encoded) | key-value pairs |
STRUCT | String (JSON encoded) | structured values |
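To make the mapping easier to use from client code, the table can be expressed as a simple lookup. This is only a sketch that mirrors the table row by row; HiveTypeSketch and javaTypeFor are hypothetical names and are not part of the Hive JDBC driver.

```java
import java.math.BigDecimal;
import java.sql.Timestamp;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative lookup mirroring the table; not a Hive class.
public class HiveTypeSketch {

    private static final Map<String, Class<?>> JAVA_TYPES = new LinkedHashMap<>();

    static {
        JAVA_TYPES.put("TINYINT", byte.class);
        JAVA_TYPES.put("SMALLINT", short.class);
        JAVA_TYPES.put("INT", int.class);
        JAVA_TYPES.put("BIGINT", long.class);
        JAVA_TYPES.put("FLOAT", double.class);   // the table lists double for FLOAT
        JAVA_TYPES.put("DOUBLE", double.class);
        JAVA_TYPES.put("DECIMAL", BigDecimal.class);
        JAVA_TYPES.put("BOOLEAN", boolean.class);
        JAVA_TYPES.put("STRING", String.class);
        JAVA_TYPES.put("TIMESTAMP", Timestamp.class);
        JAVA_TYPES.put("BINARY", String.class);
        // Complex types arrive as JSON-encoded strings.
        JAVA_TYPES.put("ARRAY", String.class);
        JAVA_TYPES.put("MAP", String.class);
        JAVA_TYPES.put("STRUCT", String.class);
    }

    /** Returns the Java type listed for the given Hive type name (case-insensitive). */
    public static Class<?> javaTypeFor(String hiveType) {
        Class<?> type = JAVA_TYPES.get(hiveType.toUpperCase(Locale.ROOT));
        if (type == null) {
            throw new IllegalArgumentException("Unknown Hive type: " + hiveType);
        }
        return type;
    }

    public static void main(String[] args) {
        // Print the whole mapping in table order.
        for (Map.Entry<String, Class<?>> e : JAVA_TYPES.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().getSimpleName());
        }
    }
}
```

Because ARRAY, MAP, and STRUCT values come back as JSON-encoded strings, a client that needs the nested values would have to parse them with a JSON library of its choice.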
When connecting to HiveServer2 with Kerberos authentication, the URL format is:
jdbc:hive2://<host>:<port>/<db>;principal=<Server_Principal_of_HiveServer2>
The client needs to have a valid Kerberos ticket in the ticket cache before connecting.
NOTE: If there is no "/" after the port number, the JDBC driver does not parse the hostname and ends up running HS2 in embedded mode. So if you are specifying a hostname, make sure you have a "/" or "/<dbname>" after the port number.
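One way to avoid the missing "/" is to build the URL in a single place. The following is a minimal sketch, assuming a small helper of your own; HiveUrlSketch, remoteUrl, and embeddedUrl are hypothetical names, and the host and principal values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; HiveUrlSketch and its methods are hypothetical names.
public class HiveUrlSketch {

    /**
     * Builds a remote URL and always emits "/" after the port, followed by
     * the optional database name and any ";key=value" session variables.
     */
    public static String remoteUrl(String host, int port, String db,
                                   Map<String, String> sessionVars) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/'); // needed even when no database is named
        if (db != null) {
            url.append(db);
        }
        for (Map.Entry<String, String> e : sessionVars.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    /** Embedded mode uses the bare prefix with no host or port. */
    public static String embeddedUrl() {
        return "jdbc:hive2://";
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        // Placeholder Kerberos principal of the HiveServer2 service.
        vars.put("principal", "hive/hs2.example.com@EXAMPLE.COM");
        System.out.println(remoteUrl("hs2.example.com", 10000, "default", vars));
        System.out.println(remoteUrl("localhost", 10000, null, new LinkedHashMap<>()));
        System.out.println(embeddedUrl());
    }
}
```

The helper does no escaping or validation, so it is only suitable for values you control.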
In the case of LDAP or custom pass-through authentication, the client needs to pass a valid user name and password to the JDBC connection API.
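As a sketch of passing credentials through the standard JDBC API, the user name and password can be supplied as the "user" and "password" connection properties. HiveLdapSketch and its method names are hypothetical; only DriverManager.getConnection(String, Properties) is standard JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Illustrative helper; the class and method names are hypothetical.
public class HiveLdapSketch {

    /** Packs the user name and password into the standard JDBC properties. */
    public static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    /** Opens a connection using the supplied credentials. */
    public static Connection connect(String url, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(url, credentials(user, password));
    }
}
```

In practice the password would be read from a secure source at run time rather than hard-coded in the client.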
To use sasl.qop, add it to the sessionconf part of your Hive JDBC connection string, for example:
jdbc:hive2://hostname/dbname;sasl.qop=auth-int
For more information, see Setting up HiveServer2.