Hi, I have a Hive INSERT INTO query that creates new Hive partitions. The
table has two partition columns, server and date. I execute the insert queries
using the following code and try to save the result:
DataFrame dframe = hiveContext.sql("insert into summary1
partition(server='a1',date='2015-05-22') sele
I have installed the SparkR package from the Spark distribution into the R
library. I can call the following command, and it seems to work properly:
library(SparkR)
However, when I try to get the Spark context using the following code,
sc <- sparkR.init(master="local")
It fails after some time with th
Hi, I have a couple of Spark jobs that process thousands of files every
day. File sizes may vary from MBs to GBs. After a job finishes, I usually save
the output using the following code:
finalJavaRDD.saveAsParquetFile("/path/in/hdfs"); OR
dataFrame.write.format("orc").save("/path/in/hdfs") //storing as ORC
Hi, I have to run a few INSERT INTO queries that use Hive partitions. The
table has two partition columns, server and date. I execute the insert queries
using hiveContext as shown below, and the query works fine:
hiveContext.sql("insert into summary1
partition(server='a1',date='2015-05-22') select from sou
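
For readers following along, here is a minimal sketch of how the statement text for a static-partition insert like the one above could be assembled in plain Java before it is handed to hiveContext.sql. The class name PartitionInsert, the method buildInsert, and the select clause "select * from source_table" are hypothetical placeholders, not taken from the original post (whose select clause is cut off), and the partition values are assumed to be trusted strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class PartitionInsert {

    // Hypothetical helper: builds a static-partition INSERT INTO statement.
    // Partition values are assumed to be trusted; no quoting or escaping is done.
    public static String buildInsert(String table,
                                     Map<String, String> partitions,
                                     String selectSql) {
        // Produces e.g. partition(server='a1',date='2015-05-22')
        StringJoiner spec = new StringJoiner(",", "partition(", ")");
        for (Map.Entry<String, String> e : partitions.entrySet()) {
            spec.add(e.getKey() + "='" + e.getValue() + "'");
        }
        return "insert into " + table + " " + spec + " " + selectSql;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the partition columns in insertion order.
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("server", "a1");
        parts.put("date", "2015-05-22");

        // The select clause is a placeholder for illustration only.
        String sql = buildInsert("summary1", parts, "select * from source_table");
        System.out.println(sql);
    }
}
```

The resulting string would then be passed to hiveContext.sql(...) in the same way as the query above. Building it in one place keeps the partition spec consistent when many such inserts are issued.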