I don’t know why you’re seeing Hive on Spark sometimes work with transactional
tables and sometimes not. But note that in general it doesn’t work. The Spark
runtime in Hive does not send heartbeats to the transaction/lock manager, so the
lock manager will time out any job that takes longer than the heartbeat interval
(5 minutes by default).
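
If you want to check what the interval is on your cluster, my understanding is
that hive.txn.timeout (in seconds) is the relevant setting; something like the
following from the Hive CLI should print the effective value. Note it has to
agree between the metastore and the clients.

hive> -- print the current transaction timeout in seconds (300 = 5 minutes)
hive> SET hive.txn.timeout;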

Alan.

> On Mar 12, 2016, at 00:24, @Sanjiv Singh <sanjiv.is...@gmail.com> wrote:
> 
> Hi All,
> 
> I am facing this issue on an HDP setup, where compaction is required (only once
> per transactional table) before Spark SQL can fetch the records.
> On the other hand, a plain Apache setup doesn't require compaction even once.
> 
> Maybe something gets triggered in the metastore after compaction, and Spark SQL
> starts recognizing the delta files.
>   
> Let me know if you need any other details to get to the root cause.
> 
> Try this:
> 
> See the complete scenario:
> 
> hive> create table default.foo(id int) clustered by (id) into 2 buckets 
> STORED AS ORC TBLPROPERTIES ('transactional'='true');
> hive> insert into default.foo values(10);
> 
> scala> sqlContext.table("default.foo").count // Gives 0, which is wrong 
> because data is still in delta files
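> 
> To confirm the rows are sitting in delta directories at this point, you could
> list the table location from the Hive CLI (the path below assumes the default
> warehouse directory, adjust it for your setup):
> 
> hive> -- before compaction the table dir should contain only delta_* directories
> hive> dfs -ls /apps/hive/warehouse/foo;
> hive> -- expect something like .../foo/delta_0000001_0000001/bucket_00000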
> 
> Now run major compaction:
> 
> hive> ALTER TABLE default.foo COMPACT 'MAJOR';
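> 
> Note that the compaction request is only queued here and runs in the background;
> you can check that it has actually finished before re-running the count, for
> example with:
> 
> hive> SHOW COMPACTIONS;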
> 
> scala> sqlContext.table("default.foo").count // Gives 1
> 
> hive> insert into foo values(20);
> 
> scala> sqlContext.table("default.foo").count // Gives 2, no compaction
> required.
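> 
> If it helps with the root cause, listing the table directory again at this point
> (same default-warehouse-path assumption as above) should show both the base_*
> directory produced by the major compaction and a new delta_* directory for the
> second insert:
> 
> hive> dfs -ls /apps/hive/warehouse/foo;
> hive> -- expect something like base_0000001/ plus delta_0000002_0000002/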
> 
> 
> 
> 
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
