Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-23 Thread @Sanjiv Singh
Yes, it is very strange, and quite the opposite of my understanding of Spark SQL on Hive tables. I am facing this issue on the HDP setup, on which COMPACTION is required only once. On the other hand, the Apache setup doesn't require compaction even once. Maybe something got triggered on the metastore after compact
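For reference, the kind of manual compaction being discussed can be triggered from the Hive CLI roughly as follows (a sketch; the table name is illustrative, not the one from this thread):

```sql
-- Merge accumulated delta files into a new base file.
-- 'major' rewrites everything into one base; 'minor' only merges deltas.
ALTER TABLE default.foo COMPACT 'major';

-- Inspect the state of queued/running compactions.
SHOW COMPACTIONS;
```

Compaction runs asynchronously via the metastore compactor threads, so the new base file appears only after the worker finishes.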

Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-23 Thread @Sanjiv Singh
Try this:

hive> create table default.foo(id int) clustered by (id) into 2 buckets STORED AS ORC TBLPROPERTIES ('transactional'='true');
hive> insert into default.foo values(10);

scala> sqlContext.table("default.foo").count // Gives 0, which is wrong because the data is still in delta files

Now run

Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-22 Thread @Sanjiv Singh
Hi Varadharajan, That is the point: Spark SQL is able to recognize delta files. See the directory structure below, ONE BASE (43 records) and one DELTA (created after the last insert). And I am able to see the last insert through Spark SQL. *See below the complete scenario:* *Steps:* - Inserted 43 records in
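As a toy illustration of the base/delta layout being debated (this is not Hive's actual ACID reader, just the idea, with made-up directory and file names): a transactional table directory holds a base_N directory plus delta_* directories from later transactions, and a reader that skips the deltas undercounts.

```python
import os
import tempfile

# Simulate an ACID table directory: a base with 43 rows,
# plus a delta created by one later insert. Names are illustrative.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "base_0000043"))
os.mkdir(os.path.join(root, "delta_0000044_0000044"))
with open(os.path.join(root, "base_0000043", "bucket_00000"), "w") as f:
    f.write("\n".join(str(i) for i in range(43)))  # 43 rows
with open(os.path.join(root, "delta_0000044_0000044", "bucket_00000"), "w") as f:
    f.write("44")  # 1 row from the last insert

def count_rows(table_dir, include_deltas):
    """Count rows, optionally ignoring delta_* dirs (the buggy behaviour)."""
    total = 0
    for d in sorted(os.listdir(table_dir)):
        if d.startswith("delta_") and not include_deltas:
            continue
        for fname in os.listdir(os.path.join(table_dir, d)):
            with open(os.path.join(table_dir, d, fname)) as f:
                total += sum(1 for _ in f)
    return total

print(count_rows(root, include_deltas=False))  # 43: delta rows invisible
print(count_rows(root, include_deltas=True))   # 44: deltas merged in
```

A reader stuck at 43 here corresponds to Spark SQL missing the rows still sitting in delta files; compaction folds the delta into a new base so even a base-only reader sees all 44.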

Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-22 Thread @Sanjiv Singh
Hi Varadharajan, Can you elaborate on (you quoted in a previous mail): "I observed that hive transaction storage structure do not work with spark yet"? If it is related to delta files created after each transaction, and Spark not being able to recognize them, then I have a table *mytable *(ORC , BU

Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-21 Thread @Sanjiv Singh
Compaction would have been triggered automatically, as the following properties are already set in *hive-site.xml*, and the *NO_AUTO_COMPACTION* property has not been set for these tables.

hive.compactor.initiator.on    true
hive.compactor.worker.threads  1

Do
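For context, the two properties quoted would appear in hive-site.xml roughly as follows (a sketch using the values stated above):

```xml
<!-- Run the compaction initiator thread on this metastore instance. -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<!-- Number of compactor worker threads on this metastore instance. -->
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

With the initiator on and at least one worker thread, automatic compaction should eventually fire once a table's delta count or size crosses the configured thresholds.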

Re: Spark SQL is not returning records for hive bucketed tables on HDP

2016-02-21 Thread @Sanjiv Singh
Hi Varadharajan, Thanks for your response. Yes, it is a transactional table; see the *show create table* output below. The table has barely 3 records, and after triggering compaction on the table, it started showing results in Spark SQL. > *ALTER TABLE hivespark COMPACT 'major';* > *show create table hiv