Here is an example:

val sc = new SparkContext(new SparkConf)

// access Hive tables
val hqlc = new HiveContext(sc)
import hqlc.implicits._

// access files on HDFS
val sqlc = new SQLContext(sc)
import sqlc.implicits._
sqlc.jsonFile("xxx").registerTempTable("xxx")

// access other databases over JDBC
sqlc.jdbc("url", "tablename").registerTempTable("xxx")

// create streams
val ssc = new StreamingContext(sc, Seconds(30))
val stream = ssc.textFileStream("xxx")

// you could use foreachRDD or transform:
// foreachRDD returns Unit, transform returns a DStream
stream.foreachRDD { streamRDD =>
  streamRDD.map(aaa(_)).toDF.registerTempTable("xxx")
  sqlc.sql("xxx").saveAsParquetFile("xxx")
}
ssc.start()
ssc.awaitTermination()
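Since the comment in the example mentions transform as an alternative to foreachRDD, here is a hedged sketch of the transform route: join each micro-batch against the JDBC-backed table and get a new DStream back for further processing. The table name "historical", the Event case class, and the join condition are illustrative placeholders, not from the original example.

```scala
// Sketch only, assuming the same Spark 1.x APIs as the example above.
// "historical", Event, and the SQL text are hypothetical placeholders.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

case class Event(id: String, value: Double) // hypothetical record type

object TransformJoinSketch {
  def main(args: Array[String]): Unit = {
    val sc   = new SparkContext(new SparkConf)
    val sqlc = new SQLContext(sc)
    import sqlc.implicits._

    // register the historical data once, outside the streaming loop
    sqlc.jdbc("url", "tablename").registerTempTable("historical")

    val ssc    = new StreamingContext(sc, Seconds(30))
    val stream = ssc.textFileStream("path")

    // transform returns a DStream, so downstream DStream ops can follow
    val joined = stream.transform { rdd =>
      rdd.map(line => Event(line, 0.0)).toDF.registerTempTable("batch")
      // join the current micro-batch against the historical table in SQL
      sqlc.sql("SELECT b.id, h.* FROM batch b JOIN historical h ON b.id = h.id").rdd
    }
    joined.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The design point is that registering "batch" inside transform re-binds the temp table each interval, while "historical" stays registered for the lifetime of the context.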


On Thu, Jun 11, 2015 at 10:47 PM, jadetan...@qq.com wrote:
Hi all:
We are trying to use Spark for some real-time data processing. I need to run some SQL-like queries and analytical tasks on the real-time data against historical normalized data stored in databases. Has anyone done this kind of work or design? Any suggestions or material would be truly welcome.

