Re: Saving data using tempTable versus save() method

Robin East Tue, 21 Jun 2016 02:03:42 -0700

If you are able to trace the underlying Oracle session, you can see whether a 
commit has been called.
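
Separately from the commit question, it may be worth checking what save() does with its argument. As far as I know, in Spark 1.x DataFrame.save(String) (deprecated since 1.4 in favour of df.write) treats the string as a filesystem path and writes with the default data source (spark.sql.sources.default, Parquet unless changed), so the rows may have landed in a directory named oraclehadoop.sales2 rather than in the Hive table. Below is a minimal sketch of writing to the existing Hive table through the DataFrameWriter instead. It assumes Spark 1.6 in spark-shell (so `sc` exists), that `hiveCtx` (a hypothetical name) stands in for the HiveContext value used in the thread, and that the target table's columns are in the same order as the INSERT ... SELECT in the original post. I have not run it against these tables.

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

// Assumption: running in spark-shell, where `sc` is the SparkContext.
// `hiveCtx` is a hypothetical name for the thread's HiveContext value.
val hiveCtx = new HiveContext(sc)

val sorted = hiveCtx.table("sales_staging")
  .sort("prod_id", "cust_id", "time_id", "channel_id", "promo_id")

// insertInto writes into an existing table and matches columns by position,
// not by name, so select them in the target table's column order
// (assumed here to be the order used in the INSERT ... SELECT statement).
sorted
  .select("PROD_ID", "CUST_ID", "TIME_ID", "CHANNEL_ID", "PROMO_ID",
          "QUANTITY_SOLD", "AMOUNT_SOLD")
  .write
  .mode(SaveMode.Append)
  .insertInto("oraclehadoop.sales2")

// Same check as in the original post.
hiveCtx.sql("select count(1) from oraclehadoop.sales2").show()
```

If the count is still zero after this, the commit theory becomes more interesting; if it is non-zero, the earlier save() call most likely wrote its output somewhere else on the filesystem.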





> On 21 Jun 2016, at 09:57, Robin East <robin.e...@xense.co.uk> wrote:
> 
> I’m not sure - I don’t know what those APIs do under the hood. It simply rang 
> a bell with something I have fallen foul of in the past (not with Spark 
> though): I have wasted many hours forgetting to commit and then scratching my 
> head as to why my data is not persisting.
> 
> 
> 
> 
>> On 21 Jun 2016, at 09:20, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>> 
>> That is a very interesting point. I am not sure. How can I do that with
>> 
>> sorted.save("oraclehadoop.sales2")
>> 
>> Is there something like a commit?
>> 
>> Thanks
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>> http://talebzadehmich.wordpress.com
>>  
>> 
>> On 21 June 2016 at 08:56, Robin East <robin.e...@xense.co.uk> wrote:
>> Random thought: do you need an explicit commit with the second method?
>> 
>> 
>> 
>> 
>>> On 20 Jun 2016, at 21:35, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I have a DF based on a table, sorted as shown below.
>>> 
>>> This works fine: when I register it as a tempTable I can populate the 
>>> underlying table sales2 in Hive. sales2 is an ORC table.
>>> 
>>>   val s = HiveContext.table("sales_staging")
>>>   val sorted = s.sort("prod_id","cust_id","time_id","channel_id","promo_id")
>>>   sorted.registerTempTable("tmp")
>>>   val sqltext = """
>>>   INSERT INTO TABLE oraclehadoop.sales2
>>>   SELECT
>>>           PROD_ID
>>>         , CUST_ID
>>>         , TIME_ID
>>>         , CHANNEL_ID
>>>         , PROMO_ID
>>>         , QUANTITY_SOLD
>>>         , AMOUNT_SOLD
>>>   FROM tmp
>>>   """
>>>   HiveContext.sql(sqltext)
>>>   HiveContext.sql("select count(1) from oraclehadoop.sales2").show
>>>   HiveContext.sql("truncate table oraclehadoop.sales2")
>>> 
>>>   sorted.save("oraclehadoop.sales2")
>>>   HiveContext.sql("select count(1) from oraclehadoop.sales2").show
>>> 
>>> When I truncate the Hive table and use sorted.save("oraclehadoop.sales2"), 
>>> it does not save any data.
>>> 
>>> Started at
>>> [20/06/2016 21:21:57.57]
>>> +------+
>>> |   _c0|
>>> +------+
>>> |918843|    // This works
>>> +------+
>>> [Stage 7:============================================>              (3 + 1) / 4]
>>> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
>>> SLF4J: Defaulting to no-operation (NOP) logger implementation
>>> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
>>> +---+
>>> |_c0|
>>> +---+
>>> |  0|      // This does not
>>> +---+
>>> Finished at
>>> [20/06/2016 21:22:30.30]
>>> 
>>> Has anyone seen this before? Any ideas?
>>> 
>>> 
>>> The issue is saving data. Saving through the tempTable works, but saving 
>>> with save() does not.
>>> 
>>> 
>>> Thanks
>>> 
>>> Dr Mich Talebzadeh
>>>  
>>> LinkedIn  
>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>  
>>> http://talebzadehmich.wordpress.com
>>>  
>> 
>> 
>
