Comment: Dynamic service discovery of metastore using ZooKeeper.

...

If you execute Java directly, then JAVA_HOME, HIVE_HOME, HADOOP_HOME must be correctly set; CLASSPATH should contain Hadoop, Hive (lib and auxlib), and Java jars.

Server Configuration Parameters

The following example uses a Remote Metastore Database.

Config Param

Config Value

Comment

javax.jdo.option.ConnectionURL

jdbc:mysql://<host name>/<database name>?createDatabaseIfNotExist=true

metadata is stored in a MySQL server

javax.jdo.option.ConnectionDriverName

com.mysql.jdbc.Driver

MySQL JDBC driver class

javax.jdo.option.ConnectionUserName

<user name>

user name for connecting to MySQL server

javax.jdo.option.ConnectionPassword

<password>

password for connecting to MySQL server

hive.metastore.warehouse.dir

<base hdfs path>

default location for Hive tables.

hive.metastore.thrift.bind.host

<host_name>

Host name to bind the metastore service to. When empty, "localhost" is used. This configuration is available from Hive 4.0.0 onwards.
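As a rough sketch, the server parameters above could be written in hive-site.xml as follows. The host names, database name, user name, password, and warehouse path are hypothetical placeholders chosen for illustration; substitute the values for your own environment.

```xml
<!-- hive-site.xml on the metastore server (illustrative values only) -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- db.example.com and metastore_db are hypothetical -->
    <value>jdbc:mysql://db.example.com/metastore_db?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <!-- hypothetical MySQL user -->
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <!-- placeholder; never commit a real password -->
    <value>changeme</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- assumed base HDFS path for Hive tables -->
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.thrift.bind.host</name>
    <!-- Hive 4.0.0 onwards; hypothetical host name -->
    <value>metastore.example.com</value>
  </property>
</configuration>
```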

From Hive 3.0.0 (HIVE-16452) onwards the metastore database stores a GUID which can be queried using the Thrift API get_metastore_db_uuid by metastore clients in order to identify the backend database instance. This API can be accessed by the HiveMetaStoreClient using the method getMetastoreDbUuid().

Client Configuration Parameters

Config Param

Config Value

Comment

hive.metastore.uris

thrift://<host_name>:<port>

Host and port of the Thrift metastore server. If hive.metastore.thrift.bind.host is specified, the host should match that setting. Read more about this in the Dynamic Service Discovery Configuration Parameters section.

hive.metastore.local

false

Metastore is remote. Note: This is no longer needed as of Hive 0.10. Setting hive.metastore.uris is sufficient.

hive.metastore.warehouse.dir

<base hdfs path>

Points to default location of non-external Hive tables in HDFS.
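A minimal client-side sketch of the parameters above, assuming a hypothetical metastore host metastore.example.com listening on port 9083 and an assumed warehouse path. hive.metastore.local is omitted because it is not needed as of Hive 0.10.

```xml
<!-- hive-site.xml on the client (illustrative values only) -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- hypothetical host; the port is an assumption for this sketch -->
    <value>thrift://metastore.example.com:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- assumed base HDFS path for non-external Hive tables -->
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```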


Dynamic Service Discovery Configuration Parameters

From Hive 4.0.0 (HIVE-20974) onwards, similar to HiveServer2, a ZooKeeper service can be used for dynamic service discovery of a remote metastore server. The following parameters are used by both the metastore server and the client.

Config Param

Config Value

Comment

hive.metastore.service.discovery.mode

service discovery mode

When set to "zookeeper", ZooKeeper is used for dynamic service discovery of a remote metastore. In that case, a metastore adds itself to ZooKeeper when it starts and removes itself when it shuts down. By default it is empty. Both the client and server should have the same value for this parameter.

hive.metastore.uris

<host_name>:<port>, <host_name>:<port>, ...

One or more host and port pairs of the ZooKeeper servers forming a ZooKeeper ensemble. Used when hive.metastore.service.discovery.mode is set to "zookeeper"; otherwise the server does not use this configuration. If all the servers use the same port, you may specify it once with hive.metastore.zookeeper.client.port instead of repeating it for every server. Both the client and server should have the same value for this parameter.

hive.metastore.zookeeper.client.port

<port>

Port number to use when all the ZooKeeper servers in the ensemble share the same port. Both the client and server should have the same value for this parameter.

hive.metastore.zookeeper.namespace

<namespace name>

The parent node under which all ZooKeeper nodes for metastores are created.

hive.metastore.zookeeper.session.timeout

<time in milliseconds>

ZooKeeper client's session timeout (in milliseconds). The client is disconnected if a heartbeat is not sent within the timeout.

hive.metastore.zookeeper.connection.timeout

<time in seconds>

ZooKeeper client's connection timeout (in seconds). The Curator client deems the connection to ZooKeeper lost after connection timeout * hive.metastore.zookeeper.connection.max.retries, with exponential backoff.

hive.metastore.zookeeper.connection.max.retries

<number>

Max number of times to retry when connecting to the ZooKeeper server.

hive.metastore.zookeeper.connection.basesleeptime

<time in milliseconds>

Initial amount of time (in milliseconds) to wait between retries when connecting to the ZooKeeper server when using ExponentialBackoffRetry policy.
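The discovery parameters above might be combined as in the following sketch, which would be placed in hive-site.xml on both the metastore servers and the clients. The ZooKeeper host names, the port, the namespace, and the retry count are hypothetical values chosen for illustration. The port is given once through hive.metastore.zookeeper.client.port because every server in this assumed ensemble uses the same one.

```xml
<!-- hive-site.xml, same values on metastore servers and clients (illustrative only) -->
<configuration>
  <property>
    <name>hive.metastore.service.discovery.mode</name>
    <value>zookeeper</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <!-- in zookeeper mode this lists the ZooKeeper ensemble; hosts are hypothetical -->
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <property>
    <name>hive.metastore.zookeeper.client.port</name>
    <!-- assumed port shared by all ZooKeeper servers -->
    <value>2181</value>
  </property>
  <property>
    <name>hive.metastore.zookeeper.namespace</name>
    <!-- hypothetical parent node name -->
    <value>hive_metastore</value>
  </property>
  <property>
    <name>hive.metastore.zookeeper.connection.max.retries</name>
    <!-- illustrative retry count -->
    <value>3</value>
  </property>
</configuration>
```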


If you are using MySQL as the datastore for metadata, put the MySQL JDBC libraries in HIVE_HOME/lib before starting the Hive Client or HiveMetastore Server.

To change the metastore port, use this hive command:

...

Starting in release 0.12, Hive also includes an off-line schema tool to initialize and upgrade the metastore schema. Please refer to the details here.
