[jira] [Comment Edited] (HIVE-21901) Join queries across different datasources (Druid and JDBC StorageHandler)

slim bouguerra (JIRA) Tue, 09 Jul 2019 10:55:41 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-21901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881402#comment-16881402
 ]


slim bouguerra edited comment on HIVE-21901 at 7/9/19 5:54 PM:
---------------------------------------------------------------

In theory this class 
{code}
DruidSelectQueryRecordReader.java 
{code}
should no longer be used, but I see that in practice it still is. I will look into this. 
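
To check which Druid query type Hive plans for the failing join, one option is to run EXPLAIN on it. This is a minimal sketch that assumes the table names from the issue description below. As far as I know, the TableScan properties in the plan include druid.query.type and druid.query.json. A type of select would be consistent with DruidSelectQueryRecordReader being picked, but treat that as an assumption to verify against your build.
{code:sql}
-- Show the plan for the failing join without running it.
-- In the Druid TableScan operator, look at the druid.query.type property.
EXPLAIN
SELECT *
FROM druid_table_1 o
JOIN sample_table_1 c
ON (c.city_name = o.channel) limit 10;
{code}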


was (Author: bslim):
in theory this class 
{code }DruidSelectQueryRecordReader.java {code}
should not be used but i see in practice still used, will look at this. 

> Join queries across different datasources (Druid and JDBC StorageHandler)
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21901
>                 URL: https://issues.apache.org/jira/browse/HIVE-21901
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration, StorageHandler
>    Affects Versions: 3.1.1
>            Reporter: Subramani Raju V
>            Priority: Major
>
> We have a Druid datasource and have created an external table in Hive for 
> it.
> For example: 
>  
> {code:sql}
> CREATE EXTERNAL TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "wikipedia");
> {code}
>  
>  
> We have another table in a MySQL database, which is also exposed in Hive as 
> an external table, created like this: 
>  
> {code:sql}
> CREATE EXTERNAL TABLE sample_table_1
> (
> old_id int,
> city_name string,
> new_id int
> )
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "MYSQL",
> "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
> "hive.sql.jdbc.url" = "jdbc:mysql://172.16.0.15:3307/test",
> "hive.sql.dbcp.username" = "hive_user",
> "hive.sql.dbcp.password" = "hive_pass",
> "hive.sql.table" = "city_mapping"
> );
> {code}
> We can run normal queries on each table individually, but when we try to 
> join the two tables like this: 
>  
>  
> {code:sql}
> SELECT *
> FROM druid_table_1 o
> JOIN sample_table_1 c
> ON (c.city_name = o.channel) limit 10;
> {code}
> we get the following error: 
>  
>  
> {code:java}
> TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1560945328057_0022_2_01_000000_1:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
> ... 16 more
> Caused by: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> ... 18 more
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.nextKeyValue(DruidSelectQueryRecordReader.java:62)
> at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:85)
> at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:38)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
> ... 24 more
> ],
> {code}
> We are running the following versions:
> hive - v3.1.1
> tez - v0.9.2
> druid - v0.14.2
> hadoop - v2.8.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
