GitHub user abbaspour opened a pull request: https://github.com/apache/hive/pull/133
Enable webhcat to run JDBC connection for Hive DDL queries

This is a change in `HcatDelegator` to run Hive DDL queries over a **JDBC** connection instead of the slow `hcat` command.

The motivation is speed. `HcatDelegator` launches a new `hcat` script for every call, which makes it unsuitable for interactive REST calls. This change speeds up /ddl queries from the usual 10-20 seconds down to a few milliseconds. No connection pooling is in place, to keep the PR small, but it can be added at any time.

Because this is a JDBC connection, it is also pretty secure and compatible with all access policies defined in HiveServer2. A user does not have visibility over other databases (which used to be the case in `hcat` mode).

To switch to JDBC mode, add this configuration to **webhcat-site.xml**:

```xml
<property>
  <name>templeton.ddl.mode</name>
  <value>jdbc</value>
</property>
<property>
  <name>hive.jdbc.url</name>
  <value>jdbc:hive2://server:port</value>
</property>
```

For secure environments, these properties are also needed in webhcat-site.xml:

```xml
<property>
  <name>hive.server2.kerberos.keytab</name>
  <value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
  <name>hive.server2.kerberos.principal</name>
  <value>hive/_HOST@{REALM}</value>
</property>
```

This change relies on Hive's JSON DDL output, so `hive.ddl.output.format` must be whitelisted in **hiveserver2-site.xml**:

```xml
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>hive.ddl.output.format</value>
</property>
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/datarepublic/hive master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #133

----
commit 39cfb08044131b5422fd6e406bae8221a6b75011
Author: Amin Abbaspour <a...@consultants.datareplic.io>
Date:   2017-01-16T02:57:20Z

    Enable webhcat to run JDBC connection for Hive DDL queries

----
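
For illustration only, the JDBC mode described in this pull request could look roughly like the sketch below. It is not the code from the patch: the class and method names (`JdbcDdlSketch`, `statementsFor`, `runDdl`) are hypothetical, it assumes the Hive JDBC driver is on the classpath, and it assumes the JSON text comes back as the first column of each result row. Like the PR, it opens a fresh connection per call with no pooling.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class JdbcDdlSketch {

    /** Statements sent for one DDL call: switch DDL output to JSON, then run the query. */
    static List<String> statementsFor(String ddl) {
        return Arrays.asList("set hive.ddl.output.format=json", ddl);
    }

    /**
     * Runs one DDL query over a fresh connection (no pooling) and returns its output.
     * Assumption: the JSON text arrives as the first column of each result row.
     */
    static String runDdl(String jdbcUrl, String user, String ddl) throws SQLException {
        List<String> statements = statementsFor(ddl);
        StringBuilder out = new StringBuilder();
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, "");
             Statement stmt = conn.createStatement()) {
            // Requires hive.ddl.output.format to be whitelisted on the server.
            stmt.execute(statements.get(0));
            // execute() returns true only when the statement produced a result set.
            if (stmt.execute(statements.get(1))) {
                try (ResultSet rs = stmt.getResultSet()) {
                    while (rs.next()) {
                        out.append(rs.getString(1));
                    }
                }
            }
        }
        return out.toString();
    }
}
```

A call such as `JdbcDdlSketch.runDdl("jdbc:hive2://server:port", "someuser", "show tables")` would then stand in for one `hcat` invocation; the user name here is a placeholder.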