Hi Everyone,
I was wondering whether I can use hiveContext inside foreach, as shown below:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.Row
import org.apache.spark.sql.hive.HiveContext

object Test {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // Each line of the input file is read as one data element name
    val dataElementsFile = args(0)
    val deDF = hiveContext.read.text(dataElementsFile)
      .toDF("DataElement").coalesce(1).distinct().cache()

    // Runs one SELECT per data element and inserts the result into the table
    def calculate(de: Row): Unit = {
      val dataElement = de.getAs[String]("DataElement").trim
      val df1 = hiveContext.sql(
        "SELECT cyc_dt, supplier_proc_i, '" + dataElement + "' as data_elm, " +
          dataElement + " as data_elm_val FROM TEST_DB.TEST_TABLE1")
      df1.write.insertInto("TEST_DB.TEST_TABLE1")
    }

    deDF.collect().foreach(calculate)
  }
}
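To make it easier to see what each call to calculate sends to Hive, here is a minimal sketch that builds the same query string in plain Scala, without Spark. The object name QuerySketch, the helper buildQuery, and the sample value "order_qty" are made up for illustration and are not part of my actual job.

```scala
// Hypothetical helper: builds the same SELECT text as calculate() above,
// without touching Spark, so the generated SQL can be inspected directly.
object QuerySketch {
  def buildQuery(dataElement: String): String = {
    val de = dataElement.trim
    // The data element appears twice: once as a string literal (its name)
    // and once as a bare expression (its value in the table).
    s"SELECT cyc_dt, supplier_proc_i, '$de' as data_elm, $de as data_elm_val " +
      "FROM TEST_DB.TEST_TABLE1"
  }

  def main(args: Array[String]): Unit = {
    // "order_qty" is a made-up column name used only for illustration.
    println(buildQuery(" order_qty "))
  }
}
```

Since the data element is pasted into the SQL text unquoted, each line of the input file has to be a valid column name or expression for TEST_DB.TEST_TABLE1.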
I looked at
https://spark.apache.org/docs/1.6.0/api/scala/index.html#org.apache.spark.sql.hive.HiveContext
and I see that HiveContext extends SQLContext, which extends Logging with
Serializable.
Can anyone tell me whether this is the right way to use it? Thanks for your time.
Regards,
Ajay