You can use per-note scoped mode, so that there are multiple Python
processes sharing the same Spark session.
Check this doc for more details
https://zeppelin.apache.org/docs/0.10.1/usage/interpreter/interpreter_binding_mode.html


On Tue, Jun 28, 2022 at 1:08 AM Chenyang Zhang <chenyang.zh...@c3.ai> wrote:

> Hi,
>
>
>
> Here is Chenyang. I am working on a project using PySpark and I am blocked
> because I want to share data between different Spark applications. The
> situation is that we have a running Java server which handles incoming
> requests with a thread pool, and each thread has a corresponding Python
> process. We want to use pandas on Spark, but have it so that any of the
> Python processes can access the same data in Spark. For example, in one
> Python process we created a SparkSession, read some data, and modified the
> data using the pandas-on-Spark API; we want to access that data from a
> different Python process. Someone from the Spark community pointed me to
> Apache Zeppelin because it implements logic to share one Spark session. How
> did you achieve that? Are there any documentation or references I can refer
> to? Thanks so much for your help.
> Thanks so much for your help.
>
>
>
> Best regards,
>
> Chenyang
>
>
>
>
> *Chenyang Zhang*
> Software Engineering Intern, Platform
> Redwood City, California
> © 2022 C3.ai. Confidential Information.
>
>

-- 
Best Regards

Jeff Zhang
