[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751777#comment-15751777 ]
Jesus Camacho Rodriguez commented on HIVE-15277: ------------------------------------------------ [~bslim], some of the test failures (age=1) seem related to this patch. Could you take a look? Change in _LineageLogger_ causes changes in lineage golden files. It is better to tackle that in a follow-up and remove it from this patch, as it is not part of this issue. > Teach Hive how to create/delete Druid segments > ----------------------------------------------- > > Key: HIVE-15277 > URL: https://issues.apache.org/jira/browse/HIVE-15277 > Project: Hive > Issue Type: Sub-task > Components: Druid integration > Affects Versions: 2.2.0 > Reporter: slim bouguerra > Assignee: slim bouguerra > Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, > HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, > file.patch > > > We want to extend the DruidStorageHandler to support CTAS queries. > In this implementation Hive will generate druid segment files and insert the > metadata to signal the handoff to druid. > The syntax will be as follows: > {code:sql} > CREATE TABLE druid_table_1 > STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' > TBLPROPERTIES ("druid.datasource" = "datasourcename") > AS <select `timecolumn` as `___time`, `dimension1`,`dimension2`, `metric1`, > `metric2`....>; > {code} > This statement stores the results of query <input_query> in a Druid > datasource named 'datasourcename'. One of the columns of the query needs to > be the time dimension, which is mandatory in Druid. In particular, we use the > same convention that it is used for Druid: there needs to be a the column > named '__time' in the result of the executed query, which will act as the > time dimension column in Druid. Currently, the time column dimension needs to > be a 'timestamp' type column. > metrics can be of type long, double and float while dimensions are strings. > Keep in mind that druid has a clear separation between dimensions and > metrics, therefore if you have a column in hive that contains number and need > to be presented as dimension use the cast operator to cast as string. > This initial implementation interacts with Druid Meta data storage to > add/remove the table in druid, user need to supply the meta data config as > --hiveconf hive.druid.metadata.password=XXX --hiveconf > hive.druid.metadata.username=druid --hiveconf > hive.druid.metadata.uri=jdbc:mysql://host/druid -- This message was sent by Atlassian JIRA (v6.3.4#6332)