Hi,
I am running into an issue while converting a DataFrame to an RDD: the number of partitions gets reduced.
In our code, the DataFrame was created as:
DataFrame DF = hiveContext.sql("select * from table_instance");
When I convert my DataFrame to an RDD and try to get its number of partitions:

RDD<Row> newRDD = DF.rdd();
System.out.println(newRDD.partitions().length);
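Below is a fuller minimal sketch (Spark 1.6 Java API) of the check, assuming sc is an existing SparkContext; the repartition width of 200 is just a placeholder:

import org.apache.spark.rdd.RDD;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.hive.HiveContext;

HiveContext hiveContext = new HiveContext(sc);
DataFrame df = hiveContext.sql("select * from table_instance");

// Partition count as reported by the DataFrame's underlying RDD.
System.out.println("DataFrame partitions: " + df.rdd().partitions().length);

// df.rdd() does not coalesce anything by itself; the count reflects how
// the Hive relation was split into input splits when the table was read.
RDD<Row> newRDD = df.rdd();
System.out.println("RDD partitions: " + newRDD.partitions().length);

// If more parallelism is needed, repartition explicitly (this shuffles).
DataFrame wider = df.repartition(200);
System.out.println("After repartition: " + wider.rdd().partitions().length);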
Correct, it is creating delta files in HDFS, but after compaction it merges all the data and creates an extra directory where all the bucketed data is present. (I am able to read the data from Hive but not from Spark SQL.)
I am using Spark 1.6.0 and Hive 1.2.1.
Is reading from a Hive transactional table not supported yet by Spark SQL?
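For reference, a minimal sketch of the read I am attempting (the table name acid_table is a placeholder). Spark 1.6 does not understand the ACID delta directories, so a plain HiveContext read only sees data that Hive has already compacted into a base directory:

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

// Rows that live only in delta_* directories are invisible to this read.
// They become readable after Hive runs a major compaction, e.g. from the
// Hive shell: ALTER TABLE acid_table COMPACT 'major';
DataFrame df = hiveContext.sql("select * from acid_table");
df.show();  // may be empty or stale until the deltas are compacted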
On Tue, Aug 9, 2016 at 12:18 AM, manish jaiswal wrote:
> Hi,
>
> I am not able to read data from a Hive transactional table using Spark SQL.
> (I don't want to read via Hive JDBC.)
Hi,
I am not able to read data from a Hive transactional table using Spark SQL.
(I don't want to read via Hive JDBC.)
Please help.
Hi,
What is the best approach to trigger a Spark job in a production cluster?
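One common option, sketched below with hypothetical paths and class names, is to drive spark-submit programmatically through the SparkLauncher API that ships with Spark (available since 1.4):

import org.apache.spark.launcher.SparkLauncher;

public class SubmitJob {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
            .setAppResource("/path/to/my-spark-job.jar")   // placeholder jar
            .setMainClass("com.example.MyJob")             // placeholder class
            .setMaster("yarn-cluster")                     // Spark 1.6 YARN cluster mode
            .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
            .setConf(SparkLauncher.EXECUTOR_MEMORY, "4g")
            .launch();
        // launch() starts spark-submit as a child process; wait for it to finish.
        int exitCode = spark.waitFor();
        System.out.println("spark-submit exited with code " + exitCode);
    }
}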
Hi,
Using Spark's HiveContext, the read scanned all rows where age was between 0 and 100, even though we requested only the rows where age was less than 15. Such a full table scan is an expensive operation.
ORC avoids this type of overhead by using predicate push-down with three levels of built-in indexes within each file: file level, stripe level, and row level.
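In Spark 1.6 the ORC filter push-down is off by default; here is a minimal sketch of enabling it (the table and column names are placeholders):

import org.apache.spark.sql.DataFrame;

// ORC predicate push-down is disabled by default in Spark 1.6.
hiveContext.setConf("spark.sql.orc.filterPushdown", "true");

// With push-down enabled, the age < 15 predicate is handed to the ORC
// reader, which can skip files, stripes, and row groups whose min/max
// index statistics exclude the requested range.
DataFrame young = hiveContext.sql("select * from people where age < 15");
young.show();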
-- Forwarded message --
From: "manish jaiswal"
Date: Jun 30, 2016 17:35
Subject: HiveContext
To: <user-h...@spark.apache.org>
Hi,
I am new to Spark. I found that using HiveContext we can connect to Hive and run HiveQL queries. I ran it and it worked.
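For completeness, a minimal sketch of what I ran, assuming an existing SparkContext sc (the query is a placeholder):

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

// HiveContext picks up hive-site.xml from the classpath, so it connects
// to the same metastore that the Hive CLI uses.
HiveContext hiveContext = new HiveContext(sc);
DataFrame result = hiveContext.sql("select count(*) from some_table");
result.show();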
My doubt is w