Hi,
Unfortunately, I don't have a working example I could share at hand, but
the flow will be roughly like this:
- Retrieve an existing Python ClientServer (gateway) from the SparkContext
- Get its gateway_parameters (some are constant for PySpark, but you'll
need at least the port and auth_token)
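A rough sketch of that flow (hedged: _gateway is a private PySpark attribute,
and PySpark may use py4j's ClientServer in pinned-thread mode, in which case
the child would build a ClientServer with JavaParameters/PythonParameters
instead of a plain JavaGateway):

# parent process, where the SparkContext already lives
gw = spark.sparkContext._gateway
port = gw.gateway_parameters.port
token = gw.gateway_parameters.auth_token
# hand port and token to the child, e.g. on its command line or in env vars

# child process: re-attach to the same driver JVM
from py4j.java_gateway import JavaGateway, GatewayParameters

gateway = JavaGateway(
    gateway_parameters=GatewayParameters(
        port=port, auth_token=token, auto_convert=True))
jvm = gateway.jvm  # the JVM that already hosts the parent's SparkContext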
Hi Jack,
My use case is a bit different: I created a subprocess instead of a thread,
and I can't pass the context to the subprocess as an argument.
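The underlying limitation is that a subprocess only receives strings (argv and
environment variables), so a live SparkSession can't cross the process
boundary; only its connection details can. A hedged sketch (child_job.py is a
placeholder, and port/token come from the gateway idea above):

import subprocess
import sys

# pass the gateway coordinates, not the SparkSession object itself
subprocess.run([sys.executable, "child_job.py", str(port), token])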
On Mon, Dec 12, 2022 at 8:03 PM Jack Goodson wrote:
> apologies, the code should read as below
>
> from threading import Thread
>
> context = pyspark.sql.SparkSession.builder.appName("spark").getOrCreate()
Apologies, the code should read as below:

from threading import Thread

import pyspark

def my_func(spark):
    # placeholder: whatever work each thread should do with the shared session
    ...

context = pyspark.sql.SparkSession.builder.appName("spark").getOrCreate()

t1 = Thread(target=my_func, args=(context,))
t1.start()
t2 = Thread(target=my_func, args=(context,))
t2.start()
On Tue, Dec 13, 2022 at 4:10 PM Jack Goodson wrote:
Hi Kevin,
I had a similar use case (see the code below), though with something that
wasn't Spark related. I think the below should work for you; you may need to
edit the context variable to suit your needs, but hopefully it gives the
general idea of sharing a single object between multiple threads.
Thanks
Maciej, thanks for the reply.
Could you share an example of how to achieve it?
On Mon, Dec 12, 2022 at 4:41 PM Maciej wrote:
> Technically speaking, it is possible in stock distribution (can't speak
> for Databricks) and not super hard to do (just check out how we
> initialize sessions), but definitely not something that we test or
> support, especially in the scenario you described.
Technically speaking, it is possible in the stock distribution (can't speak
for Databricks) and not super hard to do (just check out how we
initialize sessions), but definitely not something that we test or
support, especially in the scenario you described.
If you want to achieve concurrent execution
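On the "how we initialize sessions" point, a heavily hedged sketch of what
re-initializing a Python-side session over a recovered gateway could look like
in the child process; it leans on private or rarely used pieces (the gateway
and jsc constructor arguments of SparkContext, and JVM-side static methods),
so treat it as an illustration rather than a supported recipe:

import sys

from py4j.java_gateway import JavaGateway, GatewayParameters
from pyspark import SparkContext
from pyspark.sql import SparkSession

# gateway coordinates handed over by the parent process
port, token = int(sys.argv[1]), sys.argv[2]

gateway = JavaGateway(
    gateway_parameters=GatewayParameters(
        port=port, auth_token=token, auto_convert=True))

# wrap the JVM's already-running SparkContext instead of launching a new one
jvm = gateway.jvm
jsc = jvm.org.apache.spark.api.java.JavaSparkContext.fromSparkContext(
    jvm.org.apache.spark.SparkContext.getOrCreate())
sc = SparkContext(gateway=gateway, jsc=jsc)
spark = SparkSession(sc)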
I ran my Spark job as a Databricks job with a single Python script. IIUC, the
Databricks platform creates a Spark context for this Python script.
However, I create a new subprocess in this script and run some Spark code
in that subprocess, but the subprocess can't find the context created in the
parent script.
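For context, the shape of the problem is roughly this (a simplified,
hypothetical reconstruction; the child is a fresh interpreter with no
knowledge of the driver's JVM, so getOrCreate() tries to build a brand-new
context instead of reusing the existing one):

# job.py -- the single Python script the Databricks job runs
import subprocess
import sys

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # the context created for the job
subprocess.run([sys.executable, "child.py"])

# child.py -- runs in a separate interpreter
from pyspark.sql import SparkSession

# this process cannot see the parent's context, so this call attempts to
# start a separate SparkContext rather than reusing the driver's
spark = SparkSession.builder.getOrCreate()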
In theory, maybe a Jupyter notebook or something similar could achieve
this? E.g. running a Jupyter kernel inside the Spark driver, then another
Python process could connect to that kernel.
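A rough sketch of that idea, assuming ipykernel is available on the driver
(the connection-file name is whatever the kernel writes out):

# inside the driver script, after the SparkSession exists
from ipykernel.embed import embed_kernel
embed_kernel()  # blocks, serves a kernel, and writes a kernel-<pid>.json file

# from another Python process on the same machine:
#   jupyter console --existing kernel-<pid>.json
# the connected client runs code in the driver's interpreter, so it sees the
# existing SparkContext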
But in the end, this is like Spark Connect :)
On Mon, Dec 12, 2022 at 2:55 PM Kevin Su wrote:
> Also, is there any way to work around this issue without using Spark Connect?
Also, is there any way to work around this issue without using Spark Connect?
On Mon, Dec 12, 2022 at 2:52 PM Kevin Su wrote:
> nvm, I found the ticket.
> Also, is there any way to work around this issue without using Spark
> Connect?
>
> On Mon, Dec 12, 2022 at 2:42 PM Kevin Su wrote:
>
>> Thanks for the quick response
Spark Connect :)
(It's a work in progress)
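For what it's worth, the intended usage once it ships looks roughly like the
following (hedged: based on the work-in-progress design, details may change;
the host and port are placeholders):

from pyspark.sql import SparkSession

# each Python process creates its own thin client session against the same
# remote Spark Connect server instead of sharing one in-process JVM
spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
spark.range(5).show()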
On Mon, Dec 12, 2022 at 2:29 PM Kevin Su <pings...@gmail.com> wrote:
>
> Hey there, how can I get the same Spark context in two different Python
> processes?
> Let's say I create a context in Process A, and then I want to use Python
> subprocess B to get that same context.