...
All the metadata for Hive tables and partitions is accessed through the Hive Metastore. Metadata is persisted using the JPOX ORM solution (DataNucleus), so any database that it supports can be used by Hive. Most commercial relational databases and many open source databases are supported. See the list of supported databases in the section below.
You can find an E/R diagram for the metastore here.
...
Configuration options for metastore database where metadata is persisted:
Configuration options for metastore server:
...
Configuration Parameter | Description |
---|---|
javax.jdo.option.ConnectionURL | JDBC connection string for the data store which contains metadata |
javax.jdo.option.ConnectionDriverName | JDBC Driver class name for the data store which contains metadata |
hive.metastore.uris | Hive connects to one of these URIs to make metadata requests to a remote Metastore (comma separated list of URIs) |
hive.metastore.local | local or remote metastore (removed as of Hive 0.10: if hive.metastore.uris is empty, local mode is assumed; otherwise remote) |
hive.metastore.warehouse.dir | URI of the default location for native tables |
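As a rough illustration of the `hive.metastore.uris` format, the Python sketch below splits a comma-separated value into host and port pairs. The function name `parse_metastore_uris` and the host names are hypothetical, and port 9083 is an assumption based on common metastore setups rather than a value taken from this page. Hive's own client handles this parsing internally; the sketch only shows the shape of the value.

```python
from urllib.parse import urlsplit


def parse_metastore_uris(value):
    """Split a comma-separated hive.metastore.uris value into (host, port) pairs."""
    endpoints = []
    for raw in value.split(","):
        uri = raw.strip()
        if not uri:
            continue  # tolerate stray commas or surrounding whitespace
        parts = urlsplit(uri)
        if parts.scheme != "thrift" or parts.hostname is None or parts.port is None:
            raise ValueError(f"expected thrift://host:port, got {uri!r}")
        endpoints.append((parts.hostname, parts.port))
    return endpoints


# Hypothetical hosts; 9083 is assumed here as the metastore port.
example = "thrift://ms1.example.com:9083, thrift://ms2.example.com:9083"
print(parse_metastore_uris(example))
```

A client could then try the returned endpoints in turn, which matches the description of Hive connecting to one of the listed URIs.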
...
The default configuration sets up an embedded metastore which is used in unit tests and is described in the next section. More practical options are described in the subsequent sections.
Local/Embedded Metastore Database (Derby)
An embedded metastore database is mainly used for unit tests. Only one process can connect to the metastore database at a time, so it is not really a practical solution but works well for unit tests.
...
Config Param | Config Value | Comment |
---|---|---|
javax.jdo.option.ConnectionURL | | Derby database located at hive/trunk/build... |
javax.jdo.option.ConnectionDriverName | | Derby embedded JDBC driver class. |
hive.metastore.warehouse.dir | | Unit test data goes in here on your local filesystem. |
...
If you want to run Derby as a network server so the metastore can be accessed from multiple nodes, see Hive Using Derby in Server Mode.
Remote Metastore Database
...
In a local/embedded metastore setup, the metastore server component is used like a library within the Hive Client. Each Hive Client opens a connection to the database and makes SQL queries against it. Make sure that the database is accessible from the machines where Hive queries are executed, since this is a local store. Also make sure the JDBC client library is in the classpath of the Hive Client. This configuration is often used with HiveServer2 (to use an embedded metastore only with HiveServer2, add "-hiveconf hive.metastore.uris=' '" to the command line parameters of the hiveserver2 start command, or use hiveserver2-site.xml (available in Hive 0.14)).
Config Param | Config Value | Comment |
---|---|---|
hive.metastore.uris | not needed because this is local store | |
hive.metastore.local | | this is local store (removed in Hive 0.10, see configuration description section). |
hive.metastore.warehouse.dir | | default location for Hive tables. |
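The way `hive.metastore.uris` selects the deployment mode can be sketched in a few lines of Python. This illustrates the rule only and is not Hive's actual code: the function name `metastore_mode` is hypothetical, and treating a whitespace-only value as empty is an assumption drawn from the `hive.metastore.uris=' '` usage in the HiveServer2 note above.

```python
def metastore_mode(conf):
    """Return 'local' when no metastore URIs are configured, else 'remote'."""
    uris = conf.get("hive.metastore.uris")
    # Assumption: an unset or blank value means the metastore runs in-process.
    if uris is None or uris.strip() == "":
        return "local"
    return "remote"


# Hypothetical configurations for illustration.
print(metastore_mode({"hive.metastore.uris": " "}))  # local
print(metastore_mode({"hive.metastore.uris": "thrift://ms1.example.com:9083"}))  # remote
```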
Remote Metastore Server
In a remote metastore setup, all Hive Clients connect to a metastore server, which in turn queries the datastore (MySQL in this example) for metadata. The metastore server and client communicate using the Thrift protocol. Starting with Hive 0.5.0, you can start a Thrift server by executing the following command:
...
The following example uses a remote metastore database (MySQL).
Config Param | Config Value | Comment |
---|---|---|
javax.jdo.option.ConnectionURL | | metadata is stored in a MySQL server |
javax.jdo.option.ConnectionDriverName | | MySQL JDBC driver class |
javax.jdo.option.ConnectionUserName | | user name for connecting to MySQL server |
javax.jdo.option.ConnectionPassword | | password for connecting to MySQL server |
hive.metastore.warehouse.dir | | default location for Hive tables. |
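To show how the parameters in the table fit together, the Python sketch below renders them as Hadoop-style `<property>` entries of the kind found in a `hive-site.xml` file. Every value is a placeholder assumption, not taken from this page: the host `db.example.com`, the database name `metastore`, the credentials, the warehouse path, and the driver class `com.mysql.jdbc.Driver` (the correct class depends on the MySQL Connector/J version in use). The helper name `render_hive_site` is likewise hypothetical.

```python
from xml.sax.saxutils import escape


def render_hive_site(props):
    """Render a dict of Hive settings as <property> entries in a <configuration> element."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",  # escape &, <, > in URLs
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


# Placeholder values (assumptions); substitute your own.
remote_db = {
    "javax.jdo.option.ConnectionURL": "jdbc:mysql://db.example.com/metastore",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hiveuser",
    "javax.jdo.option.ConnectionPassword": "change-me",
    "hive.metastore.warehouse.dir": "/user/hive/warehouse",
}
print(render_hive_site(remote_db))
```

Generating the file from a dictionary keeps the five settings in one place. In practice the password would normally come from a secret store rather than a literal in source.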
...