Re: Spark stages very slow to complete

2015-08-25 Thread Olivier Girardot
I have pretty much the same "symptoms": the computation itself is pretty fast, but most of the job's run time is spent in JavaToPython steps (~15 min). I'm using Spark 1.5.0-rc1 with DataFrame and ML Pipelines. Any insights into what these steps are exactly? 2015-06-02 9:18 GMT+02:00 Karlson : >

Re: Spark stages very slow to complete

2015-06-02 Thread Karlson
Hi, the code is a few hundred lines of Python. I can try to compose a minimal example as soon as I find the time, though. Any ideas until then? Would you mind posting the code? On 2 Jun 2015 00:53, "Karlson" wrote: Hi, In all (pyspark) Spark jobs, that become somewhat more involved, I am e

Re: Spark stages very slow to complete

2015-06-01 Thread ayan guha
Would you mind posting the code?
On 2 Jun 2015 00:53, "Karlson" wrote:
> Hi,
>
> In all (pyspark) Spark jobs, that become somewhat more involved, I am
> experiencing the issue that some stages take a very long time to complete
> and sometimes don't at all. This clearly correlates with the size of