You can use per-note scoped mode: there will be multiple Python processes, but they all share the same Spark session. See this doc for more details: https://zeppelin.apache.org/docs/0.10.1/usage/interpreter/interpreter_binding_mode.html
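In case it helps, here is a rough sketch of what that pattern looks like from the user's side (the file path and view name are made up, and it assumes Spark 3.2+ for the pandas-on-Spark API). Each note runs in its own Python process, but because they all attach to the one shared SparkSession, a temp view registered in one note is readable from another:

    # Note A -- runs in its own Python process under per-note scoped mode
    import pyspark.pandas as ps

    psdf = ps.read_csv("/data/events.csv")   # hypothetical input path
    psdf = psdf.dropna()                     # some pandas-on-Spark manipulation
    # Publish the result through the shared SparkSession: a temp view created
    # on the underlying Spark DataFrame is visible to every process attached
    # to the same session.
    psdf.to_spark().createOrReplaceTempView("events_clean")

    # Note B -- a different Python process, same shared SparkSession
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()         # resolves to the shared session
    psdf_b = spark.table("events_clean").pandas_api()  # back to pandas-on-Spark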
On Tue, Jun 28, 2022 at 1:08 AM Chenyang Zhang <chenyang.zh...@c3.ai> wrote:

> Hi,
>
> This is Chenyang. I am working on a project using PySpark, and I am
> blocked because I want to share data between different Spark applications.
> The situation is that we have a running Java server which handles incoming
> requests with a thread pool, and each thread has a corresponding Python
> process. We want to use pandas on Spark, but in such a way that any of the
> Python processes can access the same data in Spark. For example, in one
> Python process we create a SparkSession, read some data, and modify the
> data using the pandas-on-Spark API, and then we want to access that data
> from a different Python process. Someone from the Spark community pointed
> me to Apache Zeppelin because it implements logic to share one Spark
> session. How did you achieve that? Is there any documentation or reference
> I can refer to? Thanks so much for your help.
>
> Best regards,
> Chenyang
>
> Chenyang Zhang
> Software Engineering Intern, Platform
> Redwood City, California

--
Best Regards

Jeff Zhang