...

Command

Description

!<SQLLine command>

List of SQLLine commands available at http://sqlline.sourceforge.net/.

Example: !quit exits the Beeline client.

!delimiter

Set the delimiter for queries written in Beeline. Multi-character delimiters are allowed, but quotation marks, slashes, and -- are not allowed. Defaults to ;

Usage: !delimiter $$

Version: 3.0.0 (HIVE-10865)

Beeline

...

Properties

Property Description

fetchsize

Standard JDBC enables you to specify the number of rows fetched with each database round-trip for a query, and this number is referred to as the fetch size.
Setting the fetch size in Beeline overrides the JDBC driver's default fetch size and affects subsequent statements executed in the current session.

A value of -1 instructs Beeline to use the JDBC driver's default fetch size (default)
A value of zero or more is passed to the JDBC driver for each statement
Any other negative value will throw an Exception

Usage: !set fetchsize 200

Version: 4.0.0 (HIVE-22853)

Beeline Hive Commands

Hive specific commands (same as Hive CLI commands) can be run from Beeline, when the Hive JDBC driver is used.

Use ";" (semicolon) to terminate commands. Comments in scripts can be specified using the "--" prefix.

Command	Description
reset	Resets the configuration to the default values.
reset <key>	Resets the value of a particular configuration variable (key) to the default value. Note: If you misspell the variable name, Beeline will not show an error.
set <key>=<value>	Sets the value of a particular configuration variable (key). Note: If you misspell the variable name, Beeline will not show an error.
set	Prints a list of configuration variables that are overridden by the user or Hive.
set -v	Prints all Hadoop and Hive configuration variables.
add FILE[S] <filepath> <filepath>* add JAR[S] <filepath> <filepath>* add ARCHIVE[S] <filepath> <filepath>*	Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information.
add FILE[S] <ivyurl> <ivyurl>* add JAR[S] <ivyurl> <ivyurl>* add ARCHIVE[S] <ivyurl> <ivyurl>*	As of Hive 1.2.0, adds one or more files, jars, or archives to the list of resources in the distributed cache using an Ivy URL of the form ivy://group:module:version?query_string. See Hive Resources for more information.
list FILE[S] list JAR[S] list ARCHIVE[S]	Lists the resources already added to the distributed cache. See Hive Resources for more information. (As of Hive 0.14.0: HIVE-7592).
list FILE[S] <filepath>* list JAR[S] <filepath>* list ARCHIVE[S] <filepath>*	Checks whether the given resources are already added to the distributed cache or not. See Hive Resources for more information.
delete FILE[S] <filepath>* delete JAR[S] <filepath>* delete ARCHIVE[S] <filepath>*	Removes the resource(s) from the distributed cache.
delete FILE[S] <ivyurl> <ivyurl>* delete JAR[S] <ivyurl> <ivyurl>* delete ARCHIVE[S] <ivyurl> <ivyurl>*	As of Hive 1.2.0, removes the resource(s) which were added using the <ivyurl> from the distributed cache. See Hive Resources for more information.
reload	As of Hive 0.14.0, makes HiveServer2 aware of any jar changes in the path specified by the configuration parameter hive.reloadable.aux.jars.path (without needing to restart HiveServer2). The changes can be adding, removing, or updating jar files.
dfs <dfs command>	Executes a dfs command.
<query string>	Executes a Hive query and prints results to standard output.

...

Option	Description
-u <database URL>	The JDBC URL to connect to. Special characters in parameter values should be encoded with URL encoding if needed. Usage: `beeline -u` db_URL
-r	Reconnect to last used URL (if a user has previously used `!connect` to a URL and used `!save` to a beeline.properties file). Usage: `beeline -r` Version: 2.1.0 (HIVE-13670)
-n <username>	The username to connect as. Usage: `beeline -n` valid_user
-p <password>	The password to connect as. Usage: `beeline -p` valid_password Optional password mode: Starting with Hive 2.2.0 (HIVE-13589), the argument for the -p option is optional. Usage: `beeline -p [valid_password]` If the password is not provided after -p, Beeline prompts for the password while initiating the connection. When the password is provided, Beeline uses it to initiate the connection without prompting.
-d <driver class>	The driver class to use. Usage: `beeline -d` driver_class
-e <query>	Query that should be executed. Double or single quotes enclose the query string. This option can be specified multiple times. Usage: `beeline -e "query_string"` Support to run multiple SQL statements separated by semicolons in a single query_string: 1.2.0 (HIVE-9877) Bug fix (null pointer exception): 0.13.0 (HIVE-5765) Bug fix (--headerInterval not honored): 0.14.0 (HIVE-7647) Bug fix (running `-e` in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions
-f <file>	Script file that should be executed. Usage: `beeline -f` filepath Version: 0.12.0 (HIVE-4268) Note: If the script contains tabs, query compilation fails in version 0.12.0. This bug is fixed in version 0.13.0 (HIVE-6359). Bug fix (running `-f` in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions
-i (or) --init <file or files>	The init files for initialization Usage: `beeline -i /tmp/initfile` Single file: Version: 0.14.0 (HIVE-6561) Multiple files: Version: 2.1.0 (HIVE-11336)
-w (or) --password-file <password file>	The password file to read password from. Version: 1.2.0 (HIVE-7175)
-a (or) --authType <auth type>	The authentication type passed to the jdbc as an auth property Version: 0.13.0 (HIVE-5155)
--property-file <file>	File to read configuration properties from Usage: `beeline --property-file /tmp/a` Version: 2.2.0 (HIVE-13964)
--hiveconf property=value	Use value for the given configuration property. Properties that are listed in hive.conf.restricted.list cannot be reset with hiveconf (see Restricted List and Whitelist). Usage: `beeline --hiveconf` prop1`=`value1 Version: 0.13.0 (HIVE-6173)
--hivevar name=value	Hive variable name and value. This is a Hive-specific setting in which variables can be set at the session level and referenced in Hive commands or queries. Usage: `beeline --hivevar` var1`=`value1
--color=[true/false]	Control whether color is used for display. Default is false. Usage: `beeline --color=true` (Not supported for Separated-Value Output formats. See HIVE-9770)
--showHeader=[true/false]	Show column names in query results (true) or not (false). Default is true. Usage: `beeline --showHeader=false`
--headerInterval=ROWS	The interval for redisplaying column headers, in number of rows, when outputformat is table. Default is 100. Usage: `beeline --headerInterval=50` (Not supported for Separated-Value Output formats. See HIVE-9770)
--fastConnect=[true/false]	When connecting, skip building a list of all tables and columns for tab-completion of HiveQL statements (true) or build the list (false). Default is true. Usage: `beeline --fastConnect=false`
--autoCommit=[true/false]	Enable/disable automatic transaction commit. Default is false. Usage: `beeline --autoCommit=true`
--verbose=[true/false]	Show verbose error messages and debug information (true) or do not show (false). Default is false. Usage: `beeline --verbose=true`
--showWarnings=[true/false]	Display warnings that are reported on the connection after issuing any HiveQL commands. Default is false. Usage: `beeline --showWarnings=true`
--showDbInPrompt=[true/false]	Display the current database name in prompt. Default is false. Usage: `beeline --showDbInPrompt=true` Version: 2.2.0 (HIVE-14123)
--showNestedErrs=[true/false]	Display nested errors. Default is false. Usage: `beeline --showNestedErrs=true`
--numberFormat=[pattern]	Format numbers using a DecimalFormat pattern. Usage: `beeline --numberFormat="#,###,##0.00"`
--force=[true/false]	Continue running script even after errors (true) or do not continue (false). Default is false. Usage: `beeline --force=true`
--maxWidth=MAXWIDTH	The maximum width to display before truncating data, in characters, when outputformat is table. Default is to query the terminal for current width, then fall back to 80. Usage: `beeline --maxWidth=150`
--maxColumnWidth=MAXCOLWIDTH	The maximum column width, in characters, when outputformat is table. Default is 50 in Hive version 2.2.0+ (see HIVE-14135) or 15 in earlier versions. Usage: `beeline --maxColumnWidth=25`
--silent=[true/false]	Reduce the amount of informational messages displayed (true) or not (false). It also stops displaying the log messages for the query from HiveServer2 (Hive 0.14 and later) and the HiveQL commands (Hive 1.2.0 and later). Default is false. Usage: `beeline --silent=true`
--autosave=[true/false]	Automatically save preferences (true) or do not autosave (false). Default is false. Usage: `beeline --autosave=true`
--outputformat=[table/vertical/csv/tsv/dsv/csv2/tsv2]	Format mode for result display. Default is table. See Separated-Value Output Formats below for a description of the recommended sv options. Usage: `beeline --outputformat=tsv` Version: dsv/csv2/tsv2 added in 0.14.0 (HIVE-8615)
--truncateTable=[true/false]	If true, truncates table column in the console when it exceeds console length. Version: 0.14.0 (HIVE-6928)
--delimiterForDSV=DELIMITER	The delimiter for the delimiter-separated values output format. Default is the '|' character. Version: 0.14.0 (HIVE-7390)
--isolation=LEVEL	Set the transaction isolation level to TRANSACTION_READ_COMMITTED or TRANSACTION_SERIALIZABLE. See the "Field Detail" section in the Java Connection documentation. Usage: `beeline --isolation=TRANSACTION_SERIALIZABLE`
--nullemptystring=[true/false]	Use historic behavior of printing null as empty string (true) or use current behavior of printing null as NULL (false). Default is false. Usage: `beeline --nullemptystring=false` Version: 0.13.0 (HIVE-4485)
--incremental=[true/false]	Defaults to `true` from Hive 2.3 onwards; before that it defaulted to `false`. When set to `false`, the entire result set is fetched and buffered before being displayed, yielding optimal display column sizing. When set to `true`, result rows are displayed immediately as they are fetched, yielding lower latency and memory usage at the price of extra display column padding. Setting `--incremental=true` is recommended if you encounter an OutOfMemory error on the client side (due to the fetched result set being large).
--incrementalBufferRows=NUMROWS	The number of rows to buffer when printing rows on stdout, defaults to 1000; only applicable if `--incremental=true` and `--outputformat=table` Usage: `beeline --incrementalBufferRows=1000` Version: 2.3.0 (HIVE-14170)
--maxHistoryRows=NUMROWS	The maximum number of rows to store Beeline history. Version: 2.3.0 (HIVE-15166)
--delimiter=;	Set the delimiter for queries written in Beeline. Multi-char delimiters are allowed, but quotation marks, slashes, and -- are not allowed. Defaults to ; Usage: `beeline --delimiter=$$` Version: 3.0.0 (HIVE-10865)
--convertBinaryArrayToString=[true/false]	Display binary column data as a string or as a byte array. Version 3.0.0 (HIVE-14786): when true, binary data is displayed as a string using the platform's default character set; the default behavior (false) is to display binary data using `Arrays.toString(byte[] columnValue)`. Version 4.0.0 (HIVE-23856): when true, binary data is displayed as a string using the UTF-8 character set; the default behavior (false) is to display binary data using Base64 encoding without padding. Usage: `beeline --convertBinaryArrayToString=true`
--help	Display a usage message. Usage: `beeline --help`

...

The following output formats are supported:

table
vertical
xmlattr
xmlelements
json
jsonfile
separated-value formats (csv, tsv, csv2, tsv2, dsv)

...

Example

Result of the query select id, value, comment from test_table

No Format

<resultset>
  <result>
    <id>1</id>
    <value>Value1</value>
    <comment>Test comment 1</comment>
  </result>
  <result>
    <id>2</id>
    <value>Value2</value>
    <comment>Test comment 2</comment>
  </result>
  <result>
    <id>3</id>
    <value>Value3</value>
    <comment>Test comment 3</comment>
  </result>
</resultset>

json

(Hive 4.0) The result is displayed in JSON format where each row is a "result" element in the JSON array "resultset".

Example

Result of the query select `String`, `Int`, `Decimal`, `Bool`, `Null`, `Binary` from test_table

No Format

{"resultset":[{"String":"aaa","Int":1,"Decimal":3.14,"Bool":true,"Null":null,"Binary":"SGVsbG8sIFdvcmxkIQ"},{"String":"bbb","Int":2,"Decimal":2.718,"Bool":false,"Null":null,"Binary":"RWFzdGVyCgllZ2cu"}]}
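In the example above, the Binary column appears as Base64 text without trailing padding. The following is a minimal Java sketch of decoding such a value on the client side, assuming the underlying bytes are UTF-8 text. The class and method names are illustrative and are not part of Hive.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative helper (not a Hive API): decodes a Base64 value taken from
// Beeline's json output and interprets the bytes as UTF-8 text.
final class BinaryColumnDecoder {

    static String decodeUtf8(String encoded) {
        // java.util.Base64's basic decoder accepts input without '=' padding.
        byte[] raw = Base64.getDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Value copied from the "Binary" field of the first row in the example.
        System.out.println(decodeUtf8("SGVsbG8sIFdvcmxkIQ"));
    }
}
```

If the column does not hold text, work with the decoded byte array directly instead of converting it to a String.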

jsonfile

(Hive 4.0) The result is displayed in JSON format where each row is a distinct JSON object. This matches the expected format for a table created with the JSONFILE format.

Example

Result of the query select `String`, `Int`, `Decimal`, `Bool`, `Null`, `Binary` from test_table

No Format

{"String":"aaa","Int":1,"Decimal":3.14,"Bool":true,"Null":null,"Binary":"SGVsbG8sIFdvcmxkIQ"}
{"String":"bbb","Int":2,"Decimal":2.718,"Bool":false,"Null":null,"Binary":"RWFzdGVyCgllZ2cu"}

Separated-Value Output Formats

The values of a row are separated by different delimiters.
There are five separated-value output formats available: csv, tsv, csv2, tsv2 and dsv.

csv2, tsv2, dsv

Starting with Hive 0.14 there are improved SV output formats available, namely dsv, csv2 and tsv2.
These three formats differ only in the delimiter between cells, which is a comma for csv2, a tab for tsv2, and configurable for dsv.

For the dsv format, the delimiter can be set with the delimiterForDSV option. The default delimiter is '|'.
Please be aware that only single-character delimiters are supported.

Example

Result of the query select id, value, comment from test_table

csv2

No Format

id,value,comment
1,Value1,Test comment 1
2,Value2,Test comment 2
3,Value3,Test comment 3

tsv2

No Format

id	value	comment
1	Value1	Test comment 1
2	Value2	Test comment 2
3	Value3	Test comment 3

dsv (the delimiter is |)

No Format

id|value|comment
1|Value1|Test comment 1
2|Value2|Test comment 2
3|Value3|Test comment 3

Quoting in csv2, tsv2 and dsv Formats

If quoting is not disabled, double quotes are added around a value if it contains special characters (such as the delimiter or double quote character) or spans multiple lines.
Embedded double quotes are escaped with a preceding double quote.

The quoting can be disabled by setting the disable.quoting.for.sv system variable to true.
If the quoting is disabled, no double quotes are added around the values (even if they contain special characters) and the embedded double quotes are not escaped.
By default, the quoting is disabled.
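The quoting rule for csv2, tsv2 and dsv when quoting is enabled can be sketched in Java as follows. This is an illustration of the rule as described in this section, not Beeline's actual implementation; the class and method names are hypothetical.

```java
// Illustrative sketch (not Beeline source code) of the quoting rule used by
// the csv2, tsv2 and dsv formats when quoting is enabled.
final class SvQuoting {

    // Quotes a single value if it contains the delimiter, a double quote,
    // or a line break; embedded double quotes are doubled.
    static String quote(String value, char delimiter) {
        boolean needsQuoting = value.indexOf(delimiter) >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted values of one row with the delimiter.
    static String row(char delimiter, String... values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            line.append(quote(values[i], delimiter));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(row(',', "1", "Value,1", "Value contains comma"));
        System.out.println(row(',', "2", "Value\"2", "Value contains double quote"));
        System.out.println(row(',', "3", "Value'3", "Value contains single quote"));
    }
}
```

Single quotes are not special in these formats, so a value such as Value'3 is emitted unchanged.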

Example

Result of the query select id, value, comment from test_table

csv2, quoting is enabled

No Format

id,value,comment
1,"Value,1",Value contains comma
2,"Value""2",Value contains double quote
3,Value'3,Value contains single quote

csv2, quoting is disabled

No Format

id,value,comment
1,Value,1,Value contains comma
2,Value"2,Value contains double quote
3,Value'3,Value contains single quote

csv, tsv

These two formats differ only in the delimiter between values, which is a comma for csv and a tab for tsv.
The values are always surrounded with single quote characters, even if quoting is disabled by the disable.quoting.for.sv system variable.
These output formats don't escape embedded single quotes.
Please be aware that these output formats are deprecated and only maintained for backward compatibility.

Example

Result of the query select id, value, comment from test_table

csv

No Format

'id','value','comment'
'1','Value1','Test comment 1'
'2','Value2','Test comment 2'
'3','Value3','Test comment 3'

tsv

No Format

'id'	'value'	'comment'
'1'	'Value1'	'Test comment 1'
'2'	'Value2'	'Test comment 2'
'3'	'Value3'	'Test comment 3'

HiveServer2 Logging

Starting with Hive 0.14.0, HiveServer2 operation logs are available for Beeline clients. These parameters configure logging:

hive.server2.logging.operation.enabled
hive.server2.logging.operation.log.location
hive.server2.logging.operation.verbose (Hive 0.14 to 1.1)
hive.server2.logging.operation.level (Hive 1.2 onward)

HIVE-11488 (Hive 2.0.0) adds the support of logging queryId and sessionId to HiveServer2 log file. To enable that, edit/add %X{queryId} and %X{sessionId} to the pattern format string of the logging configuration file.

Cancelling the Query

When a user enters CTRL+C on the Beeline shell, if there is a query which is running at the same time then Beeline attempts to cancel the query while closing the socket connection to HiveServer2. This behavior is enabled only when hive.server2.close.session.on.disconnect is set to true. Starting from Hive 2.2.0 (HIVE-15626) Beeline does not exit the command line shell when the running query is being cancelled as a user enters CTRL+C. If the user wishes to exit the shell they can enter CTRL+C for the second time while the query is being cancelled. However, if there is no query currently running, the first CTRL+C will exit the Beeline shell. This behavior is similar to how the Hive CLI handles CTRL+C.

!quit is the recommended command to exit the Beeline shell.

Background Query in Terminal Script

Beeline can be run disconnected from a terminal for batch processing and automation scripts using commands such as nohup and disown.

Some versions of the Beeline client may require a workaround to allow the nohup command to correctly put the Beeline process in the background without stopping it. See HIVE-11717 and HIVE-6758.

The following environment variable can be updated:

Code Block

language	text

export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Djline.terminal=jline.UnsupportedTerminal"

Running with nohangup (nohup) and ampersand (&) will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.

Code Block

language	text

nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log &

Fetch size

The fetch size used by Beeline is determined as follows; each later item takes precedence over the earlier ones:

If the Beeline user does nothing, each query uses the fetch size received from HiveServer2 (HS2).
If the Beeline user sets fetchSize in the JDBC connection string, each query uses the fetch size specified there.
If the user wants to set the fetch size in the session, they can do so with the syntax !set fetchSize xxx, where the value is interpreted as follows:
1. A fetchSize of 0 directs the driver to use the fetch size provided by HS2.
2. A fetchSize greater than 0 sets the driver fetch size to the specified value.
3. A fetchSize of -1 directs Beeline to use the default JDBC behavior: use the connection string fetchSize and, if none is specified, fall back to the fetch size specified by HS2 (this is the default Beeline fetchSize value).
4. A fetchSize of any other negative integer value is an error.

Keep in mind that whatever fetch size the client requests can be overruled by HiveServer2 on every FetchResults request, depending on the configured value of hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the maximum, a WARN message is written to the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).
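The precedence rules and the server-side cap described above can be summarized in a small Java sketch. It models the decision only and is not Beeline or driver code; the class, method, and parameter names are assumptions made for illustration.

```java
import java.util.OptionalInt;

// Illustrative model (not Beeline source code) of how the effective fetch
// size is chosen from the session setting, the connection string, and HS2.
final class BeelineFetchSize {

    /**
     * @param sessionFetchSize value of "!set fetchSize" (-1 is the Beeline default)
     * @param urlFetchSize     fetchSize from the JDBC connection string, if present
     * @param serverDefault    fetch size provided by HS2
     * @param serverMax        hive.server2.thrift.resultset.max.fetch.size
     */
    static int resolve(int sessionFetchSize, OptionalInt urlFetchSize,
                       int serverDefault, int serverMax) {
        if (sessionFetchSize < -1) {
            throw new IllegalArgumentException("Invalid fetchSize: " + sessionFetchSize);
        }
        int requested;
        if (sessionFetchSize > 0) {
            requested = sessionFetchSize;                    // explicit session value
        } else if (sessionFetchSize == 0) {
            requested = serverDefault;                       // use the HS2-provided value
        } else {
            requested = urlFetchSize.orElse(serverDefault);  // -1: JDBC default behavior
        }
        // HS2 caps every FetchResults request at its configured maximum.
        return Math.min(requested, serverMax);
    }

    public static void main(String[] args) {
        System.out.println(resolve(-1, OptionalInt.empty(), 1000, 10000));
        System.out.println(resolve(-1, OptionalInt.of(500), 1000, 10000));
        System.out.println(resolve(200, OptionalInt.of(500), 1000, 10000));
        System.out.println(resolve(50000, OptionalInt.empty(), 1000, 10000));
    }
}
```

The numeric values in main are arbitrary sample inputs, not Hive defaults.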

[1] https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setFetchSize-int-

JDBC

HiveServer2 has a JDBC driver. It supports both embedded and remote access to HiveServer2. Remote HiveServer2 mode is recommended for production use, as it is more secure and doesn't require direct HDFS/metastore access to be granted for users.

...

For versions earlier than 0.14, see the version note above.

Connection URL When ZooKeeper Service Discovery Is Enabled

ZooKeeper-based service discovery introduced in Hive 0.14.0 (HIVE-7935) enables high availability and rolling upgrade for HiveServer2. A JDBC URL that specifies <zookeeper quorum> needs to be used to make use of these features.

...

That is, at a minimum, in `hive-site.xml` or another configuration file for HiveServer2, `hive.server2.support.dynamic.service.discovery` should be set to `true`, and `hive.zookeeper.quorum` should point to the running ZooKeeper servers. See Configuration Properties for reference.

The minimal configuration example is as follows.

Code Block

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>hive.server2.support.dynamic.service.discovery</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.zookeeper.quorum</name>
        <value>127.0.0.1:2181</value>
    </property>
</configuration>

With further changes in Hive 2.0.0 and 1.3.0 (unreleased, HIVE-11581), none of the additional configuration parameters such as authentication mode, transport mode, or SSL parameters need to be specified, as they are retrieved from the ZooKeeper entries along with the hostname.

The JDBC connection URL: jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

The <zookeeper quorum> is the same as the value of the hive.zookeeper.quorum configuration parameter in the hive-site.xml/hiveserver2-site.xml used by HiveServer2.

Additional runtime parameters needed for querying can be provided within the URL by appending them as ?<option>, as before.

The JDBC connection URL: jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=hive1&hive.server2.thrift.resultset.serialize.in.tasks=true
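As a rough illustration of how such a URL is assembled, the following Java sketch builds the service discovery URL from its parts. The helper class is hypothetical, and the quorum hosts in main are placeholders; only the URL layout is taken from the text above.

```java
// Illustrative helper (not part of the Hive JDBC driver): assembles a
// HiveServer2 JDBC URL that uses ZooKeeper service discovery.
final class ZooKeeperJdbcUrl {

    /**
     * @param quorum    value of hive.zookeeper.quorum, e.g. "host1:2181,host2:2181"
     * @param namespace ZooKeeper namespace used by HiveServer2
     * @param options   optional runtime parameters appended after '?', or null
     */
    static String build(String quorum, String namespace, String options) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(quorum)
                .append("/;serviceDiscoveryMode=zooKeeper")
                .append(";zooKeeperNamespace=").append(namespace);
        if (options != null && !options.isEmpty()) {
            url.append('?').append(options);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder quorum; substitute the value configured for your cluster.
        String quorum = "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181";
        System.out.println(build(quorum, "hiveserver2", null));
        System.out.println(build(quorum, "hiveserver2", "tez.queue.name=hive1"));
    }
}
```

The resulting string can be passed to DriverManager.getConnection or to beeline -u once the Hive JDBC driver is on the classpath.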

Named Connection URLs

As of Hive 2.1.0 (HIVE-13670), Beeline now also supports named URL connect strings via usage of environment variables. If you try to do a !connect to a name that does not look like a URL, then Beeline will attempt to see if there is an environment variable called BEELINE_URL_<name>. For instance, if you specify !connect blue, it will look for BEELINE_URL_BLUE, and use that to connect. This should make it easier for system administrators to specify environment variables for users, and users need not type in the full URL each time to connect.
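The lookup described above can be sketched in Java as follows. This is not Beeline's code: the test for whether a name "looks like a URL" is an assumption (here, a jdbc: prefix), and the upper-casing of the name is inferred from the blue / BEELINE_URL_BLUE example.

```java
import java.util.Locale;
import java.util.Map;

// Illustrative sketch (not Beeline source code) of resolving a named
// connection through BEELINE_URL_<NAME> environment variables.
final class NamedUrlResolver {

    // Returns the URL to connect to, or null if the name cannot be resolved.
    static String resolve(String target, Map<String, String> env) {
        // Assumption: anything starting with "jdbc:" is treated as a URL.
        if (target.startsWith("jdbc:")) {
            return target;
        }
        return env.get("BEELINE_URL_" + target.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Resolves "!connect blue" against the real process environment.
        System.out.println(resolve("blue", System.getenv()));
    }
}
```

An administrator could, for example, export BEELINE_URL_BLUE in a shared shell profile so that users only need to type !connect blue.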

Reconnecting

Traditionally, !reconnect has worked to refresh a connection that has already been established. It is not able to do a fresh connect after !close has been run. As of Hive 2.1.0 (HIVE-13670), Beeline remembers the last URL successfully connected to in a session, and is able to reconnect even after a !close has been run. In addition, if a user does a !save, then this is saved in the beeline.properties file, which then allows !reconnect to connect to this saved last-connected-to URL across multiple Beeline sessions. This also allows the use of beeline -r from the command line to do a reconnect on startup.

Using hive-site.xml to automatically connect to HiveServer2

As of Hive 2.2.0 (HIVE-14063), Beeline adds support to use the hive-site.xml present in the classpath to automatically generate a connection URL based on the configuration properties in hive-site.xml and an additional user configuration file. Not all the URL properties can be derived from hive-site.xml and hence in order to use this feature user must create a configuration file called “beeline-hs2-connection.xml” which is a Hadoop XML format file. This file is used to provide user-specific connection properties for the connection URL. Beeline looks for this configuration file in ${user.home}/.beeline/ (Unix based OS) or ${user.home}\beeline\ directory (in case of Windows). If the file is not found in the above locations Beeline looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf (check HIVE-16335 which fixes this location from /etc/conf/hive in Hive 2.2.0) in that order. Once the file is found, Beeline uses beeline-hs2-connection.xml in conjunction with the hive-site.xml in the class path to determine the connection URL.
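The search order for beeline-hs2-connection.xml described above can be written down as a short Java sketch. It only lists the candidate paths in order and is not Beeline's implementation; the class name and the parameters are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not Beeline source code): lists, in search order,
// the locations where beeline-hs2-connection.xml is looked up.
final class Hs2ConnectionFileLookup {

    static final String FILE_NAME = "beeline-hs2-connection.xml";

    /**
     * @param userHome    value of the user.home system property
     * @param hiveConfDir value of HIVE_CONF_DIR, or null if it is not set
     * @param windows     true to use the Windows per-user location
     */
    static List<String> candidates(String userHome, String hiveConfDir, boolean windows) {
        List<String> paths = new ArrayList<>();
        // 1. Per-user directory.
        paths.add(windows
                ? userHome + "\\beeline\\" + FILE_NAME
                : userHome + "/.beeline/" + FILE_NAME);
        // 2. ${HIVE_CONF_DIR}, when defined.
        if (hiveConfDir != null && !hiveConfDir.isEmpty()) {
            paths.add(hiveConfDir + "/" + FILE_NAME);
        }
        // 3. System-wide configuration directory.
        paths.add("/etc/hive/conf/" + FILE_NAME);
        return paths;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().contains("win");
        candidates(System.getProperty("user.home"), System.getenv("HIVE_CONF_DIR"), windows)
                .forEach(System.out::println);
    }
}
```

The first existing file in this list is the one that would be combined with hive-site.xml to derive the connection URL.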

...

Code Block

language	java

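import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import javax.security.auth.Subject;

// Hive JDBC driver class; assumed here because the constant is not defined in this snippet.
static final String JDBC_DRIVER = "org.apache.hive.jdbc.HiveDriver";

// Opens a HiveServer2 connection with the Kerberos credentials of an already signed-on Subject.
static Connection getConnection( Subject signedOnUserSubject ) throws Exception{
       Connection conn = (Connection) Subject.doAs(signedOnUserSubject, new PrivilegedExceptionAction<Object>()
           {
               public Object run()
               {
                       Connection con = null;
                       // kerberosAuthType=fromSubject tells the driver to take the credentials from the Subject
                       String JDBC_DB_URL = "jdbc:hive2://HiveHost:10000/default;" +
                                              "principal=hive/localhost.localdomain@EXAMPLE.COM;" +
                                              "kerberosAuthType=fromSubject";
                       try {
                               Class.forName(JDBC_DRIVER);
                               con =  DriverManager.getConnection(JDBC_DB_URL);
                       } catch (SQLException e) {
                               e.printStackTrace();
                       } catch (ClassNotFoundException e) {
                               e.printStackTrace();
                       }
                       // con is null if the driver could not be loaded or the connection failed
                       return con;
               }
           });
       return conn;
}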

JDBC Fetch Size

The JDBC fetch size gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed by the client. The default value, used for every statement, can be specified through the JDBC connection string. This default value may subsequently be overwritten, per statement, with the JDBC API. If no value is specified within the JDBC connection string, then the default fetch size is retrieved from the HiveServer2 instance as part of the session initiation operation.

jdbc:hive2://<host>:<port>/<db>;fetchsize=<value>

Info

title	Hive Version 4.0

The Hive JDBC driver will receive a preferred fetch size from the instance of HiveServer2 it has connected to. This value is specified on the server by the hive.server2.thrift.resultset.default.fetch.size configuration.

The JDBC fetch size is only a hint and the server will attempt to respect the client's requested fetch size though with some limits. HiveServer2 will cap all requests at a maximum value specified by the hive.server2.thrift.resultset.max.fetch.size configuration value regardless of the client's requested fetch size.

While a larger fetch size may limit the number of round-trips between the client and server, it does so at the expense of additional memory requirements on the client and server.

The default JDBC fetch size value may be overwritten, per statement, with the JDBC API:

Setting a value of 0 instructs the driver to use the fetch size value preferred by the server
Setting a value greater than zero will instruct the driver to fetch that many rows, though the actual number of rows returned may be capped by the server
If no fetch size value is explicitly set on the JDBC driver's statement then the driver's default value is used
- If the fetch size value is specified within the JDBC connection string, this is the default value
- If the fetch size value is absent from the JDBC connection string, the server's preferred fetch size is used as the default value
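The rules above can be summarized in a small sketch. This is not the Hive JDBC driver's code, only a model of the behavior as described here; the class FetchSizeSketch, its parameters, and the numbers used in main are hypothetical and are not Hive's actual defaults.

```java
import java.util.OptionalInt;

final class FetchSizeSketch {
    /**
     * statementValue  - value passed to Statement.setFetchSize, empty if never set
     * urlValue        - fetch size from the connection string, empty if absent
     * serverPreferred - the server's preferred (default) fetch size
     * serverMax       - the server's maximum fetch size
     */
    static int effectiveFetchSize(OptionalInt statementValue, OptionalInt urlValue,
                                  int serverPreferred, int serverMax) {
        // Driver default: the connection string value if present, else the server's preference.
        int driverDefault = urlValue.orElse(serverPreferred);
        int requested;
        if (!statementValue.isPresent()) {
            requested = driverDefault;
        } else if (statementValue.getAsInt() < 0) {
            // JDBC requires the fetch size to be zero or more.
            throw new IllegalArgumentException("fetch size must be >= 0");
        } else if (statementValue.getAsInt() == 0) {
            // Zero means "use the value preferred by the server".
            requested = serverPreferred;
        } else {
            requested = statementValue.getAsInt();
        }
        // The server caps every request at its configured maximum.
        return Math.min(requested, serverMax);
    }

    public static void main(String[] args) {
        // Hypothetical numbers, chosen only for illustration.
        System.out.println(effectiveFetchSize(OptionalInt.empty(), OptionalInt.of(200), 1000, 10000));
        System.out.println(effectiveFetchSize(OptionalInt.of(0), OptionalInt.of(200), 1000, 10000));
        System.out.println(effectiveFetchSize(OptionalInt.of(50000), OptionalInt.empty(), 1000, 10000));
    }
}
```

In application code the per-statement value is the one set through the standard java.sql.Statement.setFetchSize(int) call.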

FetchSize for ResultSets

The Hive JDBC driver uses the following sources, listed from lowest to highest precedence, to determine the fetchSize for a ResultSet:

Fetch size received from HS2 (hive.server2.thrift.resultset.default.fetch.size) during the client session open sequence
Fetch size set in the JDBC connection string (not well documented; for example jdbc:hive2://localhost:10000;fetchSize=100)
Fetch size set by the application code via JDBC setFetchSize [1]

Keep in mind that whatever fetch size the client requests can be overruled by HiveServer2 for every FetchResults request, depending on the configured value of hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the maximum, a WARN message is emitted into the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).

[1] https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setFetchSize-int-

Python Client

A Python client driver is available on github. For installation instructions, see Setting Up HiveServer2: Python Client Driver.

Ruby Client

A Ruby client driver is available on github at https://github.com/forward3d/rbhive.

Integration with SQuirrel SQL Client

Download, install and start the SQuirrel SQL Client from the SQuirrel SQL website.
Select 'Drivers -> New Driver...' to register Hive's JDBC driver that works with HiveServer2.
Enter the driver name and example URL:

Code Block

language	text

   Name: Hive
   Example URL: jdbc:hive2://localhost:10000/default

Select 'Extra Class Path -> Add' to add the following jars from your local Hive and Hadoop distribution.

Code Block
HIVE_HOME/lib/hive-jdbc-*-standalone.jar
HADOOP_HOME/share/hadoop/common/hadoop-common-*.jar

Info

title	Version information

Hive JDBC standalone jars are used in Hive 0.14.0 onward (HIVE-538); for previous versions of Hive, use HIVE_HOME/build/dist/lib/*.jar instead.

The hadoop-common jars are for Hadoop 2.0; for previous versions of Hadoop, use HADOOP_HOME/hadoop-*-core.jar instead.

Select 'List Drivers'. This will cause SQuirrel to parse your jars for JDBC drivers and might take a few seconds. From the 'Class Name' input box select the Hive driver for working with HiveServer2:

Code Block
org.apache.hive.jdbc.HiveDriver

Click 'OK' to complete the driver registration.
Select 'Aliases -> Add Alias...' to create a connection alias to your HiveServer2 instance.
1. Give the connection alias a name in the 'Name' input box.
2. Select the Hive driver from the 'Driver' drop-down.
3. Modify the example URL as needed to point to your HiveServer2 instance.
4. Enter 'User Name' and 'Password' and click 'OK' to save the connection alias.
5. To connect to HiveServer2, double-click the Hive alias and click 'Connect'.

When the connection is established you will see errors in the log console and might get a warning that the driver is not JDBC 3.0 compatible. These alerts are due to yet-to-be-implemented parts of the JDBC metadata API and can safely be ignored. To test the connection enter SHOW TABLES in the console and click the run icon.

Also note that when a query is running, support for the 'Cancel' button is not yet available.

Integration with SQL Developer

Integration with Oracle SQLDeveloper is available using JDBC connection.

https://community.hortonworks.com/articles/1887/connect-oracle-sql-developer-to-hive.html

Integration with DbVisSoftware's DbVisualizer

Download, install and start DbVisualizer free or purchase DbVisualizer Pro from https://www.dbvis.com/.
Follow instructions on github.

Advanced Features for Integration with Other Tools

Supporting Cookie Replay in HTTP Mode

Info

title	Version 1.2.0 and later

This option is available starting in Hive 1.2.0.

HIVE-9709 introduced support for the JDBC driver to enable cookie replay. This is turned on by default so that incoming cookies can be sent back to the server for authentication.

The JDBC connection URL when enabled should look like this:

jdbc:hive2://<host>:<port>/<db>?transportMode=http;httpPath=<http_endpoint>;cookieAuth=true;cookieName=<cookie_name>

cookieAuth is set to true by default.
cookieName: If any of the incoming cookies' keys match the value of cookieName, the JDBC driver will not send any login credentials/Kerberos ticket to the server. The client will just send the cookie alone back to the server for authentication. The default value of cookieName is hive.server2.auth (this is the HiveServer2 cookie name).
To turn off cookie replay, cookieAuth=false must be used in the JDBC URL.
Important Note: As part of HIVE-9709, the Apache http-client and http-core components of Hive were upgraded to 4.4. To avoid any collision between this upgraded version of HttpComponents and any other versions that might be present in your system (such as the one provided by Apache Hadoop 2.6, which uses http-client and http-core version 4.2.5), the client is expected to set CLASSPATH in such a way that Beeline-related jars appear before HADOOP lib jars. This is achieved by setting HADOOP_USER_CLASSPATH_FIRST=true before using hive-jdbc. In fact, bin/beeline.sh does exactly this.
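As a rough illustration of the URL layout shown above, the following sketch assembles an HTTP-mode connection string with the cookie replay options. The class CookieReplayUrlSketch and its method are hypothetical helpers written for this example, and the host, port and httpPath values used in main are placeholders.

```java
final class CookieReplayUrlSketch {
    // Builds a URL in the layout shown above; cookieName is only meaningful when cookieAuth is true.
    static String httpModeUrl(String host, int port, String db, String httpPath,
                              boolean cookieAuth, String cookieName) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port).append('/').append(db)
                .append("?transportMode=http;httpPath=").append(httpPath)
                .append(";cookieAuth=").append(cookieAuth);
        if (cookieAuth && cookieName != null) {
            url.append(";cookieName=").append(cookieName);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder values; hive.server2.auth is the default cookie name mentioned in the text.
        System.out.println(httpModeUrl("localhost", 10001, "default", "cliservice", true, "hive.server2.auth"));
        // Cookie replay switched off explicitly.
        System.out.println(httpModeUrl("localhost", 10001, "default", "cliservice", false, null));
    }
}
```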

Using 2-way SSL in HTTP Mode

Info

title	Version 1.2.0 and later

This option is available starting in Hive 1.2.0.

HIVE-10447 enabled the JDBC driver to support 2-way SSL in HTTP mode. Please note that HiveServer2 currently does not support 2-way SSL. So this feature is handy when there is an intermediate server such as Knox which requires client to support 2-way SSL.

JDBC connection URL:

jdbc:hive2://<host>:<port>/<db>;ssl=true;twoWay=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>;sslKeyStore=<key_store_path>;keyStorePassword=<key_store_password>?transportMode=http;httpPath=<http_endpoint>

<trust_store_path> is the path where the client's truststore file lives. This is a mandatory non-empty field.
<trust_store_password> is the password to access the truststore.
<key_store_path> is the path where the client's keystore file lives. This is a mandatory non-empty field.
<key_store_password> is the password to access the keystore.

In environments where exposing trustStorePassword and keyStorePassword in the connection URL is a security concern, the storePasswordPath option introduced with HIVE-27308 can be used in the URL instead of trustStorePassword and keyStorePassword. The storePasswordPath value holds the path to the local keystore file that stores the trustStorePassword and keyStorePassword aliases. When trustStorePassword or keyStorePassword is present in the URL along with storePasswordPath, the respective password is taken directly from that password option; otherwise, the driver fetches the corresponding alias from the local keystore file (that is, the existing password options are preferred over storePasswordPath).

JDBC connection URL with storePasswordPath:

...<key_store_path>;storePasswordPath=<store_password_path>;transportMode=http;httpPath=<http_endpoint>

A local keystore file can be created with the hadoop credential command, using the trustStorePassword and keyStorePassword aliases as shown below. This file can then be passed with the storePasswordPath option in the connection URL.

hadoop credential create trustStorePassword -value mytruststorepassword -provider localjceks://file/tmp/client_creds.jceks

hadoop credential create keyStorePassword -value mykeystorepassword -provider localjceks://file/tmp/client_creds.jceks

For versions earlier than 0.14, see the version note above.
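The precedence between the inline password options and storePasswordPath can be sketched as follows. This is only a model of the rule stated above, not the driver's implementation: the class StorePasswordSketch is hypothetical, and the storeLookup function stands in for whatever reads an alias from the local keystore file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

final class StorePasswordSketch {
    /**
     * urlOptions     - options parsed from the connection URL
     * passwordOption - "trustStorePassword" or "keyStorePassword"
     * storeLookup    - hypothetical reader: (keystore file path, alias) -> password
     */
    static String resolvePassword(Map<String, String> urlOptions, String passwordOption,
                                  BiFunction<String, String, String> storeLookup) {
        // An inline password option in the URL is preferred over storePasswordPath.
        String inline = urlOptions.get(passwordOption);
        if (inline != null) {
            return inline;
        }
        // Otherwise fall back to the alias of the same name in the local keystore file.
        String storePath = urlOptions.get("storePasswordPath");
        if (storePath != null) {
            return storeLookup.apply(storePath, passwordOption);
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("storePasswordPath", "/tmp/client_creds.jceks"); // placeholder path
        options.put("keyStorePassword", "inline-secret");            // placeholder value
        BiFunction<String, String, String> fakeStore = (path, alias) -> "from-store:" + alias;
        System.out.println(resolvePassword(options, "keyStorePassword", fakeStore));
        System.out.println(resolvePassword(options, "trustStorePassword", fakeStore));
    }
}
```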

Passing HTTP Header Key/Value Pairs via JDBC Driver

...

For versions earlier than 0.14, see the version note above.

Passing Custom HTTP Cookie Key/Value Pairs via JDBC Driver

...
