Great job! Thanks a lot!

On Thu, Dec 6, 2018 at 9:39 AM Hyukjin Kwon wrote:
> It's merged now and on the developer tools page -
> http://spark.apache.org/developer-tools.html#individual-tests
> Have some fun with PySpark testing!
>
> On Wed, Dec 5, 2018 at 4:30 PM, Hyukjin Kwon wrote:
>
>> Hey all, I kind of met the goal with a minimised fix, keeping the
>> available framework and options. See
>> https://github.com/apache/spark/pull/23203
>> https://github.com/apache/spark-website/pull/161

I guess that's roughly it.
As of now there's no built-in support to coordinate the commits across the
executors in an atomic way, so you need to commit the batch (a global
commit) at the driver. And when the batch is replayed, any intermediate
operations that are not idempotent can cause duplicate results, which is
why that global commit needs to be idempotent as well.
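
To make that concrete, here is a rough, hypothetical sketch of the pattern
against the 2.3.x DataSourceV2 streaming write API. StreamWriter,
DataWriterFactory, DataWriter and WriterCommitMessage are the real
interfaces under org.apache.spark.sql.sources.v2.writer (exact signatures
may differ slightly between 2.3 and 2.4); StagedFiles, lastCommittedEpoch,
publishAtomically and discardStaged are made-up helpers. Executors only
stage data and report back; the driver performs the single global commit,
made idempotent by tracking the last committed epoch so a replayed batch
is skipped instead of applied twice:

import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.v2.writer.{DataWriter, DataWriterFactory, WriterCommitMessage}
import org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter

// Executor side only *stages* data; this message tells the driver where.
case class StagedFiles(paths: Seq[String]) extends WriterCommitMessage

class MyStreamWriter extends StreamWriter {

  override def createWriterFactory(): DataWriterFactory[Row] =
    new MyWriterFactory

  // Driver-side global commit: the only point where data becomes visible.
  override def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit = {
    // The epoch marker must be persisted atomically with the publish step
    // (same transaction, or a single file rename) for this to be safe.
    if (epochId > lastCommittedEpoch()) {
      publishAtomically(epochId, messages.map(_.asInstanceOf[StagedFiles]))
    }
    // else: this batch is a replay after a failure; skipping it keeps the
    // replay harmless even though the executors wrote the data again.
  }

  override def abort(epochId: Long, messages: Array[WriterCommitMessage]): Unit =
    messages.foreach { case s: StagedFiles => discardStaged(s); case _ => }

  private def lastCommittedEpoch(): Long = ???  // read the persisted marker
  private def publishAtomically(epoch: Long, staged: Seq[StagedFiles]): Unit = ???
  private def discardStaged(staged: StagedFiles): Unit = ???
}

class MyWriterFactory extends DataWriterFactory[Row] {
  // Returns a DataWriter[Row] that writes to a staging location and
  // reports it back as StagedFiles from its commit().
  override def createDataWriter(partitionId: Int, attemptNumber: Int): DataWriter[Row] = ???
}
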
Hi all,
We are working on implementing a streaming sink on 2.3.1 with the
DataSourceV2 APIs.
Can anyone help check whether my understanding is correct with respect to
the failure modes that need to be covered?
We are assuming that a Reliable Receiver (such as Kafka) is used as the
stream source. An

The bucket feature is designed to work only with data sources that have
table support, and that table support is not public yet, which means no
external data source can access bucketing information right now. The
bucket feature only works with Spark's native file source tables.
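
In case it helps, this is the shape that does work today: bucketing
through the built-in file sources, with the bucketing metadata recorded
via saveAsTable. (The table and column names below are just placeholders.)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bucketing-demo").getOrCreate()
val users = spark.range(0, 1000000).withColumnRenamed("id", "user_id")

// bucketBy only takes effect together with saveAsTable: the bucketing
// metadata lives in the metastore, which is why a plain .save(path) or
// an external (non-native) source cannot pick it up today.
users.write
  .bucketBy(8, "user_id")
  .sortBy("user_id")
  .format("parquet")
  .saveAsTable("bucketed_users")
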
Hey all, I kind of met the goal with a minimised fix, keeping the available
framework and options. See
https://github.com/apache/spark/pull/23203
https://github.com/apache/spark-website/pull/161
I know it's not perfect and other Python testing frameworks provide many
other good features, but this should be a good start.
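
For example, after that change you can target a module, a test class, or a
single test method with the new --testnames option (the class and method
names below are only illustrative; the developer-tools page above has the
exact syntax):

# run every test in one module
python/run-tests --testnames 'pyspark.sql.tests'

# run one test class, or a single test method, from that module
python/run-tests --testnames 'pyspark.sql.tests ArrowTests'
python/run-tests --testnames 'pyspark.sql.tests ArrowTests.test_null_conversion'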