[ https://issues.apache.org/jira/browse/HIVE-21901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881402#comment-16881402 ]
slim bouguerra edited comment on HIVE-21901 at 7/9/19 5:54 PM:
---------------------------------------------------------------

In theory, the class
{code}
DruidSelectQueryRecordReader.java
{code}
should not be used, but I see that in practice it is still used. I will look into this.


was (Author: bslim):
    in theory this class {code }DruidSelectQueryRecordReader.java {code} should not be used but i see in practice still used, will look at this.

> Join queries across different datasources (Druid and JDBC StorageHandler)
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21901
>                 URL: https://issues.apache.org/jira/browse/HIVE-21901
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration, StorageHandler
>    Affects Versions: 3.1.1
>            Reporter: Subramani Raju V
>            Priority: Major
>
> We have a Druid datasource and an external table created in Hive for the
> same datasource.
> For example:
>
> {code:java}
> CREATE EXTERNAL TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "wikipedia");
> {code}
>
> We have another table in a MySQL database, which also has an external table
> created in Hive in this fashion:
>
> {code:java}
> CREATE EXTERNAL TABLE sample_table_1
> (
>   old_id int,
>   city_name string,
>   new_id int
> )
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
>   "hive.sql.database.type" = "MYSQL",
>   "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
>   "hive.sql.jdbc.url" = "jdbc:mysql://172.16.0.15:3307/test",
>   "hive.sql.dbcp.username" = "hive_user",
>   "hive.sql.dbcp.password" = "hive_pass",
>   "hive.sql.table" = "city_mapping"
> );
> {code}
> We are able to run normal queries on the individual tables, but when
> we try to join the two tables in this fashion:
>
> {code:java}
> SELECT *
> FROM druid_table_1 o
> JOIN sample_table_1 c
> ON (c.city_name = o.channel) limit 10;
> {code}
> Then we are getting the
> error as follows:
>
> {code:java}
> TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1560945328057_0022_2_01_000000_1:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>     at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>     ... 16 more
> Caused by: java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>     at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>     ... 18 more
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.nextKeyValue(DruidSelectQueryRecordReader.java:62)
>     at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:85)
>     at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:38)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>     ... 24 more
> ],{code}
> We are running
> hive - v3.1.1
> tez - v0.9.2
> druid - v0.14.2
> hadoop - v2.8.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)