Re: Async RDD saves

2020-08-08 Thread Antonin Delpeuch (lists)
Hi both, Thanks for your replies! Sean, your proposal to use a driver-side future wrapping the blocking call sounds a lot easier indeed. But I want to ensure that canceling the future in the driver code kills the corresponding tasks on all executors. If I wrap the driver-side call in a standard
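A note on the cancellation concern above: a minimal sketch, using only the Java standard library, of why cancelling a plain driver-side future may not stop the work it wraps. The class name `CancelDemo` and the 200 ms sleep standing in for a blocking save are assumptions for illustration; no Spark call is made here. `CompletableFuture.cancel` marks the future as cancelled but does not interrupt the thread running the task, so the wrapped work keeps going.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelDemo {
    // Returns true if the wrapped work ran to completion even though
    // the future that wraps it was cancelled.
    static boolean workStillRanAfterCancel() throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean completed = new AtomicBoolean(false);

        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            started.countDown();
            try {
                Thread.sleep(200);   // hypothetical stand-in for a blocking save
                completed.set(true); // reached only if nothing interrupted us
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                finished.countDown();
            }
        });

        started.await();      // make sure the task is actually running
        future.cancel(true);  // the flag has no effect for CompletableFuture
        finished.await(5, TimeUnit.SECONDS);
        return future.isCancelled() && completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("work completed despite cancel: " + workStillRanAfterCancel());
    }
}
```

In Spark itself, the usual way to tie cancellation to running jobs is to call `SparkContext.setJobGroup` (with `interruptOnCancel` set as needed) on the thread that submits the job and later call `SparkContext.cancelJobGroup` with the same group id. Whether that fits the use case discussed in this thread is not something the sketch above settles.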

Re: Async RDD saves

2020-08-08 Thread kalyan
This looks interesting. Anyway, it would be good if you could elaborate on your expectations and on the other approaches you tried before deciding to do it this way... Regards, Kalyan. On Fri, Aug 7, 2020, 11:24 PM Edward Mitchell wrote: > I will agree that the side effects of using Futur

Re: Async RDD saves

2020-08-07 Thread Edward Mitchell
I will agree that the side effects of using Futures in driver code tend to be tricky to track down. If you forget to clear the job description and job group information, then the LocalProperties on the SparkContext remain intact - SparkContext#submitJob makes sure to pass down the localProperties.

Re: Async RDD saves

2020-08-07 Thread Sean Owen
Why do you need to do it, and can you just use a future in your driver code? On Fri, Aug 7, 2020 at 9:01 AM Antonin Delpeuch (lists) wrote: > > Hi all, > > Following my request on the user mailing list [1], there does not seem > to be any simple way to save RDDs to the file system in an asynchron