Hello. We're getting started with Spark Streaming and are working to build
unit/acceptance tests around functions that consume DStreams. Our current
method for creating DStreams in tests is to populate the data via a
queue-backed InputDStream:

val input = Array(TestDataFactory.CreateEvent(123, notFoundData))
val queue = scala.collection.mutable.Queue(ssc.sparkContext.parallelize(input))
val events: InputDStream[MyEvent] = ssc.queueStream(queue)
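
For reference, the 'ssc' in that snippet is just a local StreamingContext
created by the test harness, roughly like this (the master, app name, and
batch interval below are placeholder values, not our real configuration):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Local context used only in tests; any small batch interval works here.
val conf = new SparkConf().setMaster("local[2]").setAppName("dstream-tests")
val ssc = new StreamingContext(conf, Seconds(1))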

The 'events' InputDStream can then be fed into the functions under test.
However, this stream does not support checkpointing, which means we can't use
it to drive methods/classes that perform stateful transformations like
'updateStateByKey'.
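
To illustrate the kind of code we want to exercise, here is a simplified
sketch of such a stateful transformation (the key/value types and the update
logic are placeholders, not our actual code):

import org.apache.spark.streaming.dstream.DStream

// Keeps a running count per key across batches. updateStateByKey requires
// ssc.checkpoint(<dir>) to have been set, which is exactly where the
// queueStream-based setup above breaks down.
def runningCountByKey(events: DStream[(String, Long)]): DStream[(String, Long)] =
  events.updateStateByKey[Long] { (newValues: Seq[Long], state: Option[Long]) =>
    Some(newValues.sum + state.getOrElse(0L))
  }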

Does anyone have a simple, self-contained way to create DStreams that allow
checkpointing? I looked at the Spark unit test framework, but that seems to
require access to a bunch of Spark internals (e.g. it requires being inside
the Spark package).
