We share a single SparkSession across tests, and they can run in parallel. It's pretty fast.
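A minimal sketch of that pattern, assuming ScalaTest and Spark 2.x (the SharedSparkSession trait name and the example suite are hypothetical, not something from Spark itself):

    import org.apache.spark.sql.SparkSession
    import org.scalatest.{BeforeAndAfterAll, Suite}

    // Hypothetical mixin: lazily builds one local SparkSession and shares it
    // across every suite that mixes it in, so each test avoids the startup cost.
    trait SharedSparkSession extends BeforeAndAfterAll { self: Suite =>

      // getOrCreate() returns the session another suite already started,
      // so all suites in the same JVM share one instance.
      lazy val spark: SparkSession = SparkSession.builder()
        .master("local[*]")
        .appName("unit-tests")
        .getOrCreate()

      override def afterAll(): Unit = {
        // Deliberately don't stop the session here -- other suites running
        // in parallel in the same JVM may still be using it.
        super.afterAll()
      }
    }

    // Example usage
    class MyTransformSpec extends org.scalatest.FunSuite with SharedSparkSession {
      test("builds a small DataFrame") {
        import spark.implicits._
        val df = Seq((1, "a"), (2, "b")).toDF("id", "name")
        assert(df.count() == 2)
      }
    }

The key design choice is to never stop the shared session in afterAll, and to let getOrCreate() hand every suite the same instance; the JVM tears it down at the end of the test run.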
On Mon, Aug 1, 2016 at 12:02 PM, Everett Anderson <ever...@nuna.com.invalid> wrote:

> Hi,
>
> Right now, if any code uses DataFrame/Dataset, I need a test setup that
> brings up a local master as in this article
> <http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/>.
>
> That's a lot of overhead for unit testing, and the tests can't run in
> parallel, so testing is slow -- this is more like what I'd call an
> integration test.
>
> Do people have any tricks to get around this? Maybe using spy mocks on
> fake DataFrames/Datasets?
>
> Anyone know if there are plans to make more traditional unit testing
> possible with Spark SQL, perhaps with a stripped-down in-memory
> implementation? (I admit this does seem quite hard since there's so much
> functionality in these classes!)
>
> Thanks!
>
> - Everett