Hello. We're getting started with Spark Streaming. We're working to build some unit/acceptance tests around functions that consume DStreams. Our current method of creating DStreams is to populate them with test data via an InputDStream backed by a queue:
    import org.apache.spark.streaming.dstream.InputDStream

    // Build a single-batch queue stream from a small array of test events.
    val input = Array(TestDataFactory.CreateEvent(123, notFoundData))
    val queue = scala.collection.mutable.Queue(ssc.sparkContext.parallelize(input))
    val events: InputDStream[MyEvent] = ssc.queueStream(queue)

The 'events' InputDStream can then be fed into the functions under test. However, a queue-backed stream does not allow checkpointing. This means we're unable to use it to drive methods/classes that perform stateful operations such as 'updateStateByKey' (a rough sketch of the kind of stateful code we're testing is at the end of this question).

Does anyone have a simple, self-contained way to create DStreams that do allow checkpointing? I looked at Spark's own streaming unit-test framework, but it seems to require access to a bunch of Spark internals (e.g. your test code has to live inside the spark package).
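For context, here is roughly the shape of the stateful code we want to drive with these test streams. This is only an illustrative sketch: MyEvent's fields and countEventsById are hypothetical stand-ins, not our real code. The point is that updateStateByKey requires a checkpoint directory to be set on the StreamingContext, and enabling checkpointing is exactly where the queueStream approach above falls over:

    import org.apache.spark.streaming.dstream.DStream
    // On older Spark versions you may also need:
    // import org.apache.spark.streaming.StreamingContext._

    // Hypothetical stand-in for our event type; the real fields don't matter here.
    case class MyEvent(id: Long, payload: String)

    // Roughly the shape of the logic under test: a running count of events
    // per id, carried across micro-batches via updateStateByKey.
    def countEventsById(events: DStream[MyEvent]): DStream[(Long, Long)] = {
      val updateCount: (Seq[MyEvent], Option[Long]) => Option[Long] =
        (newEvents, state) => Some(state.getOrElse(0L) + newEvents.size)

      // updateStateByKey fails at runtime unless a checkpoint directory has
      // been set on the context, e.g. ssc.checkpoint("/tmp/test-checkpoints"),
      // which is the part the queue-backed stream doesn't support.
      events.map(e => (e.id, e)).updateStateByKey(updateCount)
    }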