Hello,

I have been working on a large Spark Scala notebook. I recently had the
requirement to produce graphs/plots out of these data. Python and PySpark
seemed like a natural fit but since I've already invested a lot of time and
effort into the Scala version, I want to restrict my usage of python to
just plotting.

I found a good workflow: in the Scala paragraphs I can use
registerTempTable, and in the Python paragraphs I can retrieve the same
table with sqlContext.table.
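For example, a minimal sketch of the two paragraphs (the table name
"events", the input path, and the use of matplotlib/pandas for plotting
are just illustrations, not my actual code):

    // %spark paragraph (Scala): register the DataFrame as a temp table
    val df = sqlContext.read.json("/data/events.json")
    df.registerTempTable("events")

    # %pyspark paragraph (Python): read the same table back and plot it
    import matplotlib.pyplot as plt
    pdf = sqlContext.table("events").toPandas()
    pdf.plot(x="date", y="count")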

The problem is that when I run all paragraphs to refresh the notebook,
the Python paragraphs fail because they run before the Scala ones, even
though they are placed after them.

It seems that Zeppelin runs paragraphs concurrently when they belong to
different interpreters, which might seem fine on the surface. But now that
I want a dependency between the Spark and PySpark paragraphs, is there any
way to enforce that ordering?

-- 
Cheers,
Ahmed
