...
Command | Description |
---|---|
!<SQLLine command> | List of SQLLine commands available at http://sqlline.sourceforge.net/. Example: |
!delimiter | Set the delimiter for queries written in Beeline. Multi-character delimiters are allowed, but quotation marks, slashes, and -- are not allowed. Defaults to ; Usage: Version: 3.0.0 (HIVE-10865) |
Beeline
...
Properties
Property |
---|
Hive specific commands (same as Hive CLI commands) can be run from Beeline, when the Hive JDBC driver is used.
Use ";
" (semicolon) to terminate commands. Comments in scripts can be specified using the "--
" prefix.
Description | |
---|---|
fetchsize | Standard JDBC enables you to specify the number of rows fetched with each database round-trip for a query, and this number is referred to as the fetch size.
Version: 4.0.0 (HIVE-22853) |
Beeline Hive Commands
Hive specific commands (same as Hive CLI commands) can be run from Beeline, when the Hive JDBC driver is used.
Use ";
" (semicolon) to terminate commands. Comments in scripts can be specified using the "--
" prefix.
Command | Description | |||||
---|---|---|---|---|---|---|
reset | Resets the configuration to the default values. | |||||
reset <key> | Resets the value of a particular configuration variable (key) to the default value. | |||||
set <key>=<value> | Sets the value of a particular configuration variable (key). | |||||
set | Prints a list of configuration variables that are overridden by the user or Hive. | |||||
set -v | Prints all Hadoop and Hive configuration variables | |||||
reset | Resets the configuration to the default values. | |||||
reset <key> | Resets the value of a particular configuration variable (key) to the default value. | |||||
set <key>=<value> | Sets the value of a particular configuration variable (key). | |||||
set | Prints a list of configuration variables that are overridden by the user or Hive. | |||||
set -v | Prints all Hadoop and Hive configuration variables. | |||||
add FILE[S] <filepath> <filepath>* | Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information. | |||||
add FILE[S] | <ivyurl> <ivyurl><filepath> <filepath>* |
| <ivyurl> <ivyurl><filepath> <filepath>* |
| <ivyurl> <ivyurl><filepath> <filepath>* | As of Hive 1.2.0, adds Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information. |
add FILE[S] <ivyurl> <ivyurl>* add JAR[S] <ivyurl> <ivyurl>* add ARCHIVE[S] <ivyurl> <ivyurl>* | As of Hive 1.2.0, adds one or more files, jars or archives to the list of resources in the distributed cache using an Ivy URL using an Ivy URL of the form ivy://group:module:version?query_string. See Hive Resources for more information. | |||||
list FILE[S] | Lists the resources already added to the distributed cache. See Hive Resources for more information. (As of Hive 0.14.0: HIVE-7592). | |||||
list FILE[S] <filepath>* | Checks whether the given resources are already added to the distributed cache or not. See Hive Resources for more information. | |||||
delete FILE[S] <filepath>* | Removes the resource(s) from the distributed cache. | |||||
delete FILE[S] <ivyurl> <ivyurl>* delete JAR[S] <ivyurl> <ivyurl>* delete ARCHIVE[S] <ivyurl> <ivyurl>* | As of Hive 1.2.0, removes the resource(s) which were added using the <ivyurl> from the distributed cache. See Hive Resources for more information. | |||||
reload | As of Hive 0.14.0, makes HiveServer2 aware of any jar changes in the path specified by the configuration parameter hive.reloadable.aux.jars.path (without needing to restart HiveServer2). The changes can be adding, removing, or updating jar files. | |||||
dfs <dfs command> | Executes a dfs command. | |||||
<query string> | Executes a Hive query and prints results to standard output. |
...
Running with nohangup (nohup) and ampersand (&) will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.
Code Block | ||
---|---|---|
| ||
nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log & |
Fetch size
In order of precedence, the fetch size for beeline is determined using :
- If the beeline user does nothing, each query will use the fetch size received from HS2
- If the beeline user sets the fetchSize in the JDBC connection string, each query will use the fetch size specified there
- If the user wants to set the fetchSize in the session, they can with the syntax: !set fetchSize xxx
- Setting a fetchSize of 0 will direct the driver to use the fetch size provided from HS2
- Setting a fetchSize greater than 0 will set the driver fetch size to the specified value
- Setting a fetchSize of -1 directs beeline to use the default JDBC default behavior: use the connection string fetchSize and, if none is specified, fallback to the fetch size specified by HS2 (this is the default beeline fetchSize value)
- Setting a fetchSize of any other negative integer value is an error
Keeping in mind that whatever the client requests for a fetch size will be overruled on the HiveServer, for every FetchResults request, depending on the configured value for hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the max, a WARN message is emitted into the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).
...
) will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.
Code Block | ||
---|---|---|
| ||
nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log & |
JDBC
HiveServer2 has a JDBC driver. It supports both embedded and remote access to HiveServer2. Remote HiveServer2 mode is recommended for production use, as it is more secure and doesn't require direct HDFS/metastore access to be granted for users.
...
In the order of precedence, Hive JDBC driver uses the following criteria to determine fetchSize for ResultSet.
- Fetch size is set based on what is received from HS2 (hive.server2.thrift.resultset.default.fetch.size) during the client session open sequence
- Fetch size is set in the JDBC connection string (not well documented: jdbc:hive2://localhost:10000;fetchSize=100)
- Fetch size is set by the application code via JDBC setFetchSize [1]
Keeping in mind that whatever the client requests for a fetch size will be overruled on the HiveServer, for every FetchResults request, depending on the configured value for hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the max, a WARN message is emitted into the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).
[1] https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setFetchSize-int-
Python Client
A Python client driver is available on github. For installation instructions, see Setting Up HiveServer2: Python Client Driver.
...