
Hive ODBC Driver

Note

These instructions are for the Hive ODBC driver available in Hive for HiveServer1.
There is no ODBC driver for HiveServer2 as part of Apache Hive, but third-party ODBC drivers are available from various vendors, and most of them appear to be free.

HiveServer1 is scheduled to be removed from Hive releases starting with Hive 0.15 (see HIVE-6977). Please switch over to HiveServer2.

Introduction

The Hive ODBC Driver is a software library that implements the Open Database Connectivity (ODBC) API standard for the Hive database management system, enabling ODBC-compliant applications to interact with Hive through a standard interface. This driver is NOT built as part of the typical Hive build process and must be compiled and installed separately according to the instructions below.

...

In order to build and install the Hive client:

  1. Check out and set up the latest version of Apache Hive from the Subversion or Git source code repository. For more details, see Getting Started with Hive. From this point onwards, the path to the Hive root directory will be referred to as HIVE_SRC_ROOT.

    Note: Using a tarball source release

    If you are compiling against source code contained in the tarball release package, then HIVE_SRC_ROOT refers to the 'src' subdirectory.

    Warning: The ODBC driver is broken on trunk!

    Currently the C++ Thrift client library that the ODBC driver depends on does not build on trunk. This issue is tracked in HIVE-4433. If you are using trunk prior to release 0.12, check the status of that ticket before proceeding. Also see HIVE-4492.

  2. Build the Hive client by running the following command from HIVE_SRC_ROOT. This compiles the libraries and header files and copies them to HIVE_SRC_ROOT/build/odbc/. Keep in mind that all paths must be fully specified (no relative paths). If you encounter an "undefined reference to vtables" error, make sure that you have specified the absolute path for thrift.home.

    Code Block
    
     $ ant compile-cpp -Dthrift.home=<THRIFT_HOME>
     

    MVN:

    Code Block
    
    $ cd odbc
    $ mvn compile -Podbc,hadoop-1 -Dthrift.home=/usr/local -Dboost.home=/usr/local
     

You can optionally force the Hive client to compile into a non-native bit architecture by specifying an additional parameter (assuming you have the proper compilation libraries installed):

Code Block

 $ ant compile-cpp -Dthrift.home=<THRIFT_HOME> -Dword.size=<32 or 64>
 
  3. You can verify the entire Hive build by running the Hive test suite from HIVE_SRC_ROOT. Specifying the argument '-Dthrift.home=<THRIFT_HOME>' enables the tests for the Hive client. If you do NOT specify thrift.home, the Hive client tests will not run and will simply report success.

    Code Block
    
     $ ant test -Dthrift.home=<THRIFT_HOME>
     

    MVN:

    Code Block
    
    $ cd odbc
    $ mvn test -Podbc,hadoop-1 -Dthrift.home=/usr/local -Dboost.home=/usr/local
     

    You can run the Hive client tests on their own by executing the above command from HIVE_SRC_ROOT/odbc/. NOTE: The Hive client tests require a local Hive Server listening on port 10000.

  4. To install the Hive client libraries onto your machine, run the following command from HIVE_SRC_ROOT/odbc/. NOTE: The install path defaults to /usr/local. There is currently no way to change this default from the ant build process, but you can install manually by skipping the command below and copying the contents of HIVE_SRC_ROOT/build/odbc/lib and HIVE_SRC_ROOT/build/odbc/include into the corresponding system directories.

    Code Block
    
     $ sudo ant install -Dthrift.home=<THRIFT_HOME>
     

    NOTE: The compiled static library, libhiveclient.a, must be linked against libstdc++ as well as the Thrift libraries to function properly.
    NOTE: Currently there is no way to specify non-system library and header directories to the unixODBC build process, so the Hive client libraries and headers MUST be installed to a default system location for the unixODBC build to detect them. This may be remedied in the future.
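    As a sketch of the first NOTE above, a program built against the static client library might use a link line like the following. The program name is hypothetical, and the /usr/local prefix assumes the default install path mentioned earlier:

    ```shell
    # Hypothetical link line for an application using libhiveclient.a.
    # hive_app.cpp is illustrative; -lthrift and -lstdc++ satisfy the
    # static library's dependencies noted above.
    g++ -o hive_app hive_app.cpp \
        -I/usr/local/include -L/usr/local/lib \
        -lhiveclient -lthrift -lstdc++
    ```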

unixODBC API Wrapper Build/Setup

After you have built and installed the Hive client, you can now install the unixODBC API wrapper:

  1. In the unixODBC root directory, run the following command:

    Code Block
    
     $ ./configure --enable-gui=no --prefix=<unixODBC_INSTALL_DIR>
     

    If you encounter the errors "redefinition of 'struct _hist_entry'" or "previous declaration of 'add_history' was here", rerun configure with the following command:

    Code Block
    
     $ ./configure --enable-gui=no --enable-readline=no --prefix=<unixODBC_INSTALL_DIR>
     

    To force the compilation of the unixODBC API wrapper into a non-native bit architecture, modify the CC and CXX environment variables to include the appropriate flags. For example:

    Code Block
    
     $ CC="gcc -m32" CXX="g++ -m32" ./configure --enable-gui=no --enable-readline=no --prefix=<unixODBC_INSTALL_DIR>
     
  2. Compile the unixODBC API wrapper with the following:

    Code Block
    
     $ make
     

    If you want to completely install unixODBC and all related drivers, run the following from the unixODBC root directory:

    Code Block
    
      $ sudo make install
      

    If your system complains about undefined symbols during unixODBC testing (such as with isql or odbcinst) after installation, try running ldconfig to refresh your dynamic linker's cache.
    If you only want to obtain the Hive ODBC driver shared object library:

  3. After compilation, the driver will be located at <unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0.
    It may be copied to any other location as desired. Keep in mind that the Hive ODBC driver depends on the Hive client and Thrift shared libraries, libhiveclient.so and libthrift.so.0.
    You can install the driver manually as follows:

    Code Block
    
      $ cp <unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0 <SYSTEM_INSTALL_DIR>
      $ cd <SYSTEM_INSTALL_DIR>
      $ ln -s libodbchive.so.1.0.0 libodbchive.so
      $ ldconfig
      

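The versioned-library symlink pattern above can be rehearsed in a throwaway directory without root access; everything here except the driver's file name is illustrative:

```shell
# Rehearse the symlink convention in a scratch directory. touch(1)
# creates an empty stand-in for the built driver; a real install would
# copy the actual library into place and then run ldconfig as root.
set -e
tmp=$(mktemp -d)
touch "$tmp/libodbchive.so.1.0.0"
cd "$tmp"
ln -s libodbchive.so.1.0.0 libodbchive.so
ls -l libodbchive.so
cd - >/dev/null
rm -rf "$tmp"
```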
...

  1. Locate the odbc.ini file associated with the Driver Manager (DM).
    1. If you are installing the driver on the system DM, then you can run the following command to print the locations of DM configuration files.

      Code Block
      
        $ odbcinst -j
        unixODBC 2.2.14
        DRIVERS............: /usr/local/etc/odbcinst.ini
        SYSTEM DATA SOURCES: /usr/local/etc/odbc.ini
        FILE DATA SOURCES..: /usr/local/etc/ODBCDataSources
        USER DATA SOURCES..: /home/ehwang/.odbc.ini
        SQLULEN Size.......: 8
        SQLLEN Size........: 8
        SQLSETPOSIROW Size.: 8
        
    2. If you are installing the driver on an application DM, you are largely on your own here. Hint: try looking in the installation directory of your application.
      • Keep in mind that an application's DM can exist simultaneously with the system DM and will likely use its own configuration files, such as odbc.ini.
      • Also, note that some applications do not have their own DMs and simply use the system DM.
  2. Add the following section to the DM's corresponding odbc.ini:

    Code Block
    
     [Hive]
     Driver = <path_to_libodbchive.so>
     Description = Hive Driver v1
     DATABASE = default
     HOST = <Hive_server_address>
     PORT = <Hive_server_port>
     FRAMED = 0
     

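For illustration, a filled-in entry might look like the following. The driver path and host are hypothetical values for a local setup; port 10000 matches the local Hive Server port used by the client tests above:

```
[Hive]
Driver      = /usr/local/lib/libodbchive.so
Description = Hive Driver v1
DATABASE    = default
HOST        = localhost
PORT        = 10000
FRAMED      = 0
```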
...

Once you have installed the necessary Hive ODBC libraries and added a Hive entry in your system's default odbc.ini, you will be able to interactively test the driver with isql:

Code Block

$ isql -v Hive

If your system does not have isql, you can obtain it by installing unixODBC in its entirety. If isql reports an error saying that shared libraries cannot be opened, use the ldd tool to ensure that all dynamic library dependencies are resolved, and use the file tool to ensure that isql and all necessary libraries were compiled for the same architecture (32- or 64-bit).
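The two checks just mentioned might look like this; the driver path is hypothetical and depends on where you installed libodbchive.so:

```shell
# List the driver's dynamic dependencies; any line reading "not found"
# names a missing library (e.g. libhiveclient.so or libthrift.so.0).
ldd /usr/local/lib/libodbchive.so

# Both outputs should report the same word size (32-bit or 64-bit).
file "$(command -v isql)" /usr/local/lib/libodbchive.so
```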

...

  • Comments: Please keep in mind that this is still an initial version and is very rough around the edges. However, it provides basic ODBC 3.51 API support for connecting, executing queries, fetching, and so on. The driver has been successfully tested on 32-bit and 64-bit Linux machines with isql, and with partial success on enterprise applications such as MicroStrategy. For licensing reasons, the unixODBC API wrapper files are uploaded as a separate JIRA attachment and are not part of this code repository.
  • Limitations:
    • Supported only on Linux operating systems
    • No support for Unicode
    • No support for asynchronous execution of queries
    • No pattern matching for functions such as SQLColumns and SQLTables; exact matches are required
    • Hive Server is currently not thread safe (see HIVE-80: https://issues.apache.org/jira/browse/HIVE-80). This prevents the driver from safely making multiple connections to the same Hive Server; that issue must be resolved before the driver can operate fully.
    • Hive Server's getSchema() function has trouble with certain types of queries (such as "SELECT * ..." or "EXPLAIN"), so the Hive ODBC driver sometimes has difficulty with these queries as well.
  • ODBC API Function Support:

    SQLAllocConnect        supported
    SQLAllocEnv            supported
    SQLAllocHandle         supported
    SQLAllocStmt           supported
    SQLBindCol             supported
    SQLBindParameter       NOT supported
    SQLCancel              NOT supported
    SQLColAttribute        supported
    SQLColumns             supported
    SQLConnect             supported
    SQLDescribeCol         supported
    SQLDescribeParam       NOT supported
    SQLDisconnect          supported
    SQLDriverConnect       supported
    SQLError               supported
    SQLExecDirect          supported
    SQLExecute             supported
    SQLExtendedFetch       NOT supported
    SQLFetch               supported
    SQLFetchScroll         NOT supported
    SQLFreeConnect         supported
    SQLFreeEnv             supported
    SQLFreeHandle          supported
    SQLFreeStmt            supported
    SQLGetConnectAttr      NOT supported
    SQLGetData             supported (however, SQLSTATE does not return values)
    SQLGetDiagField        NOT supported
    SQLGetDiagRec          supported
    SQLGetInfo             partially supported (enough to run MicroStrategy v9)
    SQLMoreResults         NOT supported
    SQLNumParams           NOT supported
    SQLNumResultCols       supported
    SQLParamOptions        NOT supported
    SQLPrepare             supported, but does not permit parameter markers
    SQLRowCount            NOT supported
    SQLSetConnectAttr      NOT supported
    SQLSetConnectOption    NOT supported
    SQLSetEnvAttr          limited support
    SQLSetStmtAttr         NOT supported
    SQLSetStmtOption       NOT supported
    SQLTables              supported
    SQLTransact            NOT supported