RE: [PySpark] - running processes and computing time

2017-07-04 Thread Sidney Feiner
ow what happens here? The time difference is too big for it to be networking, right?

From: Sudev A C [mailto:sudev...@go-mmt.com]
Sent: Monday, July 3, 2017 7:48 PM
To: Sidney Feiner; user@spark.apache.org
Subject: Re: [PySpark] - running processes

You might want to do the initialisation per pa

[PySpark] - running processes

2017-07-03 Thread Sidney Feiner
In my Spark Streaming application, I need to build a graph from a file, and initializing that graph takes between 5 and 10 seconds. So I tried initializing it once per executor, so that it is built only once. After running the application, I've noticed that it's initialized much more th