Re:Re: Async action in Dataframe

2018-12-28 Thread 大啊
I checked the `collect` of `Dataset`: this method calls the `collect` of `RDD` and then applies `decodeUnsafeRows`. So I think the two `collect` methods do different things. The `collect` of `Dataset` is used for Spark SQL. If you really want to use `collectAsync`, write the following: `df.rdd.collectAsync`
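
For reference, the `df.rdd.collectAsync` suggestion could look roughly like the sketch below. It assumes a local `SparkSession` and a toy DataFrame built with `spark.range`; the object name `CollectAsyncSketch` and the helper `collectAsyncFrom` are made up for illustration and are not part of Spark. Note that going through `df.rdd` yields `Row` objects from an `RDD[Row]`, so this is not the same code path as `Dataset.collect`.

```scala
import org.apache.spark.FutureAction
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

import scala.concurrent.Await
import scala.concurrent.duration._

object CollectAsyncSketch {

  // Hypothetical helper: df.rdd is an RDD[Row], and the implicit conversion
  // to AsyncRDDActions in object RDD makes collectAsync available on it.
  def collectAsyncFrom(df: DataFrame): FutureAction[Seq[Row]] =
    df.rdd.collectAsync()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the call out.
    val spark = SparkSession.builder()
      .appName("collect-async-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = spark.range(0, 5).toDF("id")

    // Submits the job and returns immediately with a FutureAction,
    // which is also a scala.concurrent.Future.
    val future = collectAsyncFrom(df)

    // The driver thread is free to do other work here.

    // Block only at the point where the rows are actually needed.
    val rows: Seq[Row] = Await.result(future, 1.minute)
    rows.foreach(println)

    spark.stop()
  }
}
```

Since `FutureAction` extends `Future`, callbacks such as `onComplete` can be used instead of `Await.result` (with an implicit `ExecutionContext` in scope), and `future.cancel()` cancels the running job.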

Re: Async action in Dataframe

2018-12-23 Thread Jiaan Geng
RDD itself does not have the method `collectAsync`. There is an implicit conversion from `RDD` to `AsyncRDDActions` in `object RDD`. The implicit conversion is: `implicit def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]): AsyncRDDActions[T] = { new AsyncRDDActions(rdd) }` The method `collect` of RDD uses the

Async action in Dataframe

2018-12-22 Thread JiaTao Tao
Hi all, As we all know, RDD has the operation `collectAsync()`, and `submitJob` can also return a Future, but I cannot find the same thing in Dataset. Does anyone know how I can achieve this when using a DataFrame? Thanks a lot. -- Regards! Aron Tao