What does "show tables" return?  You can also run "SET <optionName>" to
make sure that entries from you hive site are being read correctly.
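
For example, a minimal sketch from spark-shell (assuming Spark 1.x; sc is
the shell's SparkContext, and hive.metastore.uris is just one illustrative
property to check):

    import org.apache.spark.sql.hive.HiveContext

    // Hive tables are only visible through a HiveContext;
    // a plain SQLContext never consults the Hive metastore.
    val hiveContext = new HiveContext(sc)

    // dw_bid should be listed here if the metastore is being reached
    hiveContext.sql("show tables").collect().foreach(println)

    // Check that a setting from hive-site.xml was picked up
    hiveContext.sql("SET hive.metastore.uris").collect().foreach(println)

If the sqlContext in your program is a plain SQLContext rather than a
HiveContext, that alone would explain "dw_bid table not found":
registerTempTable only registers a table on that one context, and only a
HiveContext can also resolve tables from the Hive metastore.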

On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> I have a table dw_bid that was created in Hive and has nothing to do with
> Spark.  I have data in Avro that I want to join with the dw_bid table; this
> join needs to be done using Spark SQL.  However, for some reason Spark says
> the dw_bid table does not exist. How do I tell Spark that dw_bid is a table
> created in Hive and have it read it?
>
>
> Query that is run from Spark SQL
> ==============================
> insert overwrite table sojsuccessevents2_spark
> select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp, siteId, successEventType, sourceType, itemId,
>   shopCartId, b.transaction_Id as transactionId, offerId, b.bdr_id as userId,
>   priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum, exclPriorSearchPageId,
>   exclPriorSearchSeqNum, exclPriorSearchCategory, exclPriorSearchL1, exclPriorSearchL2,
>   currentImpressionId, sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort,
>   isDuplicate, b.bid_date as transactionDate, auctionTypeCode, isBin, leafCategoryId, itemSiteId,
>   b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,
>   offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment, buyerCountryId, sellerId, sellerCountryId,
>   sellerStdLevel, cssSellerLevel, a.experimentChannel
> from sojsuccessevents1 a
> join dw_bid b on a.itemId = b.item_id and a.transactionId = b.transaction_id
> where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'
>   AND b.bid_type_code IN (1,9) AND b.bdr_id > 0
>   AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>
>
> If I create sojsuccessevents2_spark from the Hive command line and then run
> the above command from a Spark SQL program, I get the error
> "sojsuccessevents2_spark table not found".
>
> Hence I dropped the table from Hive and ran the create table
> sojsuccessevents2_spark statement from Spark SQL before running the above
> command; it works until it hits the next roadblock, "dw_bid table not found".
>
> This makes me believe that Spark for some reason is not able to
> read/understand tables created outside Spark. I did copy
> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>
> Please suggest.
>
>
> Logs
> ———
> 15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal
> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
> 15/03/26 03:50:40 ERROR metadata.Hive:
> NoSuchObjectException(message:default.dw_bid table not found)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>
>
>
> 15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw
> exception: no such table List(dw_bid); line 1 pos 843
> org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1
> pos 843
> at
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>
>
>
> Regards,
> Deepak
>
>
> On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
> wrote:
>
>> I have this query
>>
>> insert overwrite table sojsuccessevents2_spark
>> select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp, siteId, successEventType, sourceType, itemId,
>>   shopCartId, b.transaction_Id as transactionId, offerId, b.bdr_id as userId,
>>   priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum, exclPriorSearchPageId,
>>   exclPriorSearchSeqNum, exclPriorSearchCategory, exclPriorSearchL1, exclPriorSearchL2,
>>   currentImpressionId, sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort,
>>   isDuplicate, b.bid_date as transactionDate, auctionTypeCode, isBin, leafCategoryId, itemSiteId,
>>   b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,
>>   offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment, buyerCountryId, sellerId, sellerCountryId,
>>   sellerStdLevel, cssSellerLevel, a.experimentChannel
>> from sojsuccessevents1 a
>> *join dw_bid b* on a.itemId = b.item_id and a.transactionId = b.transaction_id
>> where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'
>>   AND b.bid_type_code IN (1,9) AND b.bdr_id > 0
>>   AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>
>>
>> If I create sojsuccessevents2_spark from the Hive command line and then run
>> the above command from a Spark SQL program, I get the error
>> "sojsuccessevents2_spark table not found".
>>
>> Hence I dropped the table from Hive and ran the create table
>> sojsuccessevents2_spark statement from Spark SQL before running the above
>> command; it works until it hits the next roadblock, *"dw_bid table not found"*.
>>
>> This makes me believe that Spark for some reason is not able to
>> read/understand tables created outside Spark. I did copy
>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>
>> Please suggest.
>>
>> Regards,
>> Deepak
>>
>>
>> On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>> wrote:
>>
>>> I have a Hive table named dw_bid; when I run hive from the command prompt
>>> and run describe dw_bid, it works.
>>>
>>> I want to join an Avro file (table) in HDFS with this Hive dw_bid table,
>>> and I refer to it as dw_bid from my Spark SQL program; however, I see
>>>
>>> 15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal
>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>> 15/03/26 00:31:01 ERROR metadata.Hive:
>>> NoSuchObjectException(message:default.dw_bid table not found)
>>> at
>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>
>>>
>>> Code:
>>>
>>>     val successDetail_S1 = sqlContext.avroFile(input)
>>>     successDetail_S1.registerTempTable("sojsuccessevents1")
>>>     val countS1 = sqlContext.sql(
>>>       "select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp, siteId, successEventType, sourceType, itemId," +
>>>       " shopCartId, b.transaction_Id as transactionId, offerId, b.bdr_id as userId," +
>>>       " priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum, exclPriorSearchPageId," +
>>>       " exclPriorSearchSeqNum, exclPriorSearchCategory, exclPriorSearchL1, exclPriorSearchL2," +
>>>       " currentImpressionId, sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort," +
>>>       " isDuplicate, b.bid_date as transactionDate, auctionTypeCode, isBin, leafCategoryId, itemSiteId," +
>>>       " b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd," +
>>>       " offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment, buyerCountryId, sellerId, sellerCountryId," +
>>>       " sellerStdLevel, cssSellerLevel, a.experimentChannel" +
>>>       " from sojsuccessevents1 a join dw_bid b" +
>>>       " on a.itemId = b.item_id and a.transactionId = b.transaction_id" +
>>>       " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (b.bid_flags & 32) = 0" +
>>>       " and lower(a.successEventType) IN ('bid','bin')")
>>>     println("countS1.first:" + countS1.first)
>>>
>>>
>>>
>>> Any suggestions on how to refer to a Hive table from Spark SQL?
>>> --
>>>
>>> Deepak
>>>
>>>
>>
>>
>> --
>> Deepak
>>
>>
>
>
> --
> Deepak
>
>
