Shivaram: Yes, we can call it "gang scheduling" or "barrier
synchronization". Spark doesn't support it yet. The proposal is to add
proper support to Spark's job scheduler so that it integrates well with
MPI-like frameworks.
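
To make the term concrete, here is a minimal sketch of barrier synchronization using plain Python threads. It is not Spark's scheduler or any Spark API; the names run_gang and worker are hypothetical and exist only for illustration. Every worker is started together, and none may enter the second phase until all of them have finished the first, which is the guarantee MPI-like frameworks usually rely on.

```python
import threading


def run_gang(num_workers):
    """Run num_workers threads that all wait at a barrier between two phases.

    Illustration only: plain threads, not Spark tasks.
    """
    barrier = threading.Barrier(num_workers)
    events = []  # shared log of (phase, worker_id) tuples
    lock = threading.Lock()

    def worker(worker_id):
        with lock:
            events.append(("phase1", worker_id))
        # Block here until every worker has finished phase 1.
        barrier.wait()
        with lock:
            events.append(("phase2", worker_id))

    # "Gang" launch: create and start all workers together.
    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for phase, worker_id in run_gang(4):
        print(phase, worker_id)
```

The order of workers within a phase may vary between runs, but no phase2 entry can appear before all phase1 entries have been logged.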
On Tue, May 8, 2018 at 11:17 AM Nan Zhu wrote:
> .how I skipped th
Dear Spark community,
I just wanted to raise an issue that was filed against Spark 1.6.1 (
https://issues.apache.org/jira/browse/SPARK-15544) but also exists in Spark
2.3.0 (https://issues.apache.org/jira/browse/SPARK-23530).
We have run into this in production, where the Spark Master shuts down if th
Hi y'all,
With the renewed interest in ML in Apache Spark, now seems like as good a
time as any to revisit the online serving situation in Spark ML. DB &
others have done some excellent work moving a lot of the necessary
tools into a local linear algebra package that doesn't depend on having a
S
The repr() trick is neat when working in a notebook. When working in a
library, I used to use an evaluate(dataframe) -> DataFrame function that
simply forces the materialization of a dataframe. As Reynold mentions, this
is very convenient when working on a lot of chained UDFs, and it is a
standard