-
Install ODBC and Azure HD Insight drivers for Windows. Drivers can be downloaded here.
-
Create the DSN (Data Source Name) by running the ODBC DataSource Adminstrator tool. A System DSN sample for Hive is provided for you to modify.
Description: vorac (or any other name) Host: vorac.azurehdinsight.net Port: 443 Database: sm_tst611 (or your database name) Mechanism: Azure HDInsight Service User/Password: admin/<your password here>
-
Create the Hadoop external account, as detailed in this section.
-
Download and install ODBC drivers for Apache Hive from HortonWorks. Drivers can be downloaded here.
dpkg -i hive-odbc-native_2.1.10.1014-2_amd64.deb
-
Create the DSN (Data Source Name) and edit the odbc.ini file. Then, create a DSN for your Hive connection.
The UseNativeQuery parameter here is mandatory for both Windows and Linux. Campaign is Hive-aware and will not work correctly unless UseNativeQuery is set. Typically, the driver or Hive SQL Connector will rewrite queries and tamper the column ordering.
Here is an example for HDInsight to setup a connection called "vorac":
[ODBC Data Sources] vorac [vorac] Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so HOST=vorac.azurehdinsight.net PORT=443 Schema=sm_tst611 HiveServerType=2 AuthMech=6 UID=admin PWD=<your password here> HTTPPath= UseNativeQuery=1
Note:
The authentication setup depends on the Hive/Hadoop configuration. For instance, for HDInsight, use AuthMech=6 for user/password authentication, as described here.
-
Setup Hortonworks drivers via /usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini.
You have to use UTF-16 to be able to connect with Campaign and unix-odbc (libodbcinst).
[Driver] DriverManagerEncoding=UTF-16 ErrorMessagesPath=/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/ LogLevel=0 LogPath=/tmp/hive SwapFilePath=/tmp ODBCInstLib=libodbcinst.so
-
Create the Hadoop external account, as detailed in this section.
The Hive connector is setup using the OOB ODBC connection with the following parameters:
- Account type: external database
- Connection type: odbc
- Server: name of the DSN. In the example above, that would be "vorac"
- Account and password: user and password for authentication
- Database: name of the database (if not specified in DSN). It can be left empty if specified in the DSN
- Time zone: server time zone
Option name |
Option values |
Comment |
---|---|---|
Hive options | ||
bucketsNumber | 20 |
Number of buckets when creating a clustered table |
fileFormat |
PARQUET |
Default file format for work tables |
<postConnectSQL> <![CDATA[ create table if not exists a1 ( i int ); insert into a1(i) values ('42'); ]]> </postConnectSQL>