Here is an example:

val sc = new SparkContext(new SparkConf)

// access Hive tables
val hqlc = new HiveContext(sc)
import hqlc.implicits._

// access files on HDFS
val sqlc = new SQLContext(sc)
import sqlc.implicits._
sqlc.jsonFile("xxx").registerTempTable("xxx")

// access other databases over JDBC
sqlc.jdbc("url", "tablename").registerTempTable("xxx")

// create streams
val ssc = new StreamingContext(sc, Seconds(30))
val stream = ssc.textFileStream("xxx")

// you could use foreachRDD or transform:
// foreachRDD returns Unit, transform returns a DStream
stream.foreachRDD { streamRDD =>
  streamRDD.map(aaa(_)).toDF.registerTempTable("xxx")
  sqlc.sql("xxx").saveAsParquetFile("xxx")
}
ssc.start()
ssc.awaitTermination()
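Since the comment in the example mentions transform as an alternative to foreachRDD, here is a hedged sketch of the transform route: join each micro-batch against the JDBC-backed table and get a new DStream back for further processing. The table name "historical", the Event case class, and the join condition are illustrative placeholders, not from the original example.

```scala
// Sketch only, assuming the same Spark 1.x APIs as the example above.
// "historical", Event, and the SQL text are hypothetical placeholders.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

case class Event(id: String, value: Double) // hypothetical record type

object TransformJoinSketch {
  def main(args: Array[String]): Unit = {
    val sc   = new SparkContext(new SparkConf)
    val sqlc = new SQLContext(sc)
    import sqlc.implicits._

    // register the historical data once, outside the streaming loop
    sqlc.jdbc("url", "tablename").registerTempTable("historical")

    val ssc    = new StreamingContext(sc, Seconds(30))
    val stream = ssc.textFileStream("path")

    // transform returns a DStream, so downstream DStream ops can follow
    val joined = stream.transform { rdd =>
      rdd.map(line => Event(line, 0.0)).toDF.registerTempTable("batch")
      // join the current micro-batch against the historical table in SQL
      sqlc.sql("SELECT b.id, h.* FROM batch b JOIN historical h ON b.id = h.id").rdd
    }
    joined.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The design point is that registering "batch" inside transform re-binds the temp table each interval, while "historical" stays registered for the lifetime of the context.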


On Thu, Jun 11, 2015 at 10:47 PM, jadetan...@qq.com wrote:
Hi all:
We are trying to use Spark for some real-time data processing. I need to run some SQL-like queries and analytical tasks on the real-time data against historical normalized data stored in databases. Has anyone done this kind of work or design? Any suggestions or material would be truly welcome.

