Thanks, this looks better.
// parse the lines of data into sensor objects
val sensorDStream = ssc.textFileStream("/stream").
map(Sensor.parseSensor)
sensorDStream.foreachRDD { rdd =>
// filter sensor data for low psi
val alertRDD = rdd.filter(sensor => sensor.psi <
Yes, for example "val sensorRDD = rdd.map(Sensor.parseSensor)" is a
line of code executed on the driver; it's part of the function you
supplied to foreachRDD. However, that line only defines an operation on
an RDD, and the map function you supplied (parseSensor) will ultimately
be carried out on the cluster.
sensor object
rdd.saveAsHadoopDataset(jobConfig)
}
Will perform the bulk of the work in the distributed processing, before the
results are returned to the driver for writing to HBase.
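
To make the driver/cluster split concrete, here is a minimal sketch rather than
your actual job. The Sensor case class, its "id,psi" line format, the 5.0
threshold, and the 2-second batch interval are all assumptions made for
illustration, and the HBase write is replaced by println so the example stays
self-contained.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical stand-in for the Sensor class in this thread; fields are assumed.
case class Sensor(id: String, psi: Double)

object Sensor {
  // Assumes comma-separated input lines of the form "id,psi".
  def parseSensor(line: String): Sensor = {
    val fields = line.split(",")
    Sensor(fields(0), fields(1).toDouble)
  }
}

object ForeachRddSketch {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("ForeachRddSketch")
    val ssc = new StreamingContext(conf, Seconds(2)) // batch interval assumed

    // Declared on the driver; parseSensor itself runs on the executors.
    val sensorDStream = ssc.textFileStream("/stream").map(Sensor.parseSensor)

    sensorDStream.foreachRDD { rdd =>
      // This body runs on the driver, once per batch.
      println("driver: scheduling work for a new batch")

      // Still on the driver: this only defines a transformation.
      // The closure is shipped to the executors when an action runs.
      val alertRDD = rdd.filter(sensor => sensor.psi < 5.0) // threshold assumed

      // An action: the function below runs on the executors,
      // once per partition of the batch.
      alertRDD.foreachPartition { sensors =>
        sensors.foreach(s => println(s"executor: low psi on sensor ${s.id}"))
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

When this runs on a cluster, the "driver:" line shows up in the driver log and
the "executor:" lines in the executor logs, which is an easy way to check where
each part of the foreachRDD function actually executes.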
Thanks,
Ewan
From: Carol McDonald [mailto:cmcdon...@maprtech.com]
Sent: 28 August 2015 15:30
To: user
Subject: correct u
I would like to make sure that I am using the DStream foreachRDD
operation correctly. I would like to read from a DStream, transform the
input, and write to HBase. The code below works, but I became confused
when I read "Note that the function *func* is executed in the driver
process".
val