Hi,
I am evaluating Spark for an analytic component where we do batch
processing of data using SQL.
So, I am particularly interested in Spark SQL and in creating a SchemaRDD
from an existing API [1].
This API exposes elements in a database as data sources. Using the methods
allowed by this data source…
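For context, here is a minimal sketch of how a SchemaRDD can be built
programmatically with the 1.2-era API. The external source is stood in for
by a hardcoded sequence, and the (id, name) columns and the "elements"
table name are placeholders, not part of the API being asked about:

    import org.apache.spark.{SparkConf, SparkContext}
    // In Spark 1.2, Row, StructType, StructField and the atomic types are
    // all exposed through org.apache.spark.sql._
    import org.apache.spark.sql._

    val sc = new SparkContext(new SparkConf().setAppName("schemardd-sketch"))
    val sqlContext = new SQLContext(sc)

    // Stand-in for records fetched from the external API.
    val raw = Seq((1, "alpha"), (2, "beta"))

    // Describe the columns explicitly, then turn each record into a Row.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType, nullable = true)))
    val rowRDD = sc.parallelize(raw).map { case (id, name) => Row(id, name) }

    // applySchema pairs an RDD[Row] with a StructType to produce a SchemaRDD,
    // which can then be registered and queried with plain SQL.
    val schemaRDD = sqlContext.applySchema(rowRDD, schema)
    schemaRDD.registerTempTable("elements")
    sqlContext.sql("SELECT name FROM elements WHERE id > 1").collect().foreach(println)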
The first statement refers to scheduling across different Spark applications
connecting to the standalone cluster manager, while the second refers to
scheduling within a single Spark application, where jobs can be scheduled
using a fair scheduler.
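To make that concrete, here is a small sketch of the within-application side;
spark.scheduler.mode and spark.scheduler.pool are the actual Spark properties,
while the "analytics" pool name is only illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    // Switch the in-application scheduler from the default FIFO to FAIR.
    val conf = new SparkConf()
      .setAppName("fair-scheduling-sketch")
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)

    // Jobs submitted from this thread are assigned to the "analytics" pool;
    // pools are defined in the XML file named by spark.scheduler.allocation.file.
    sc.setLocalProperty("spark.scheduler.pool", "analytics")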
On Thu, Nov 27, 2014 at 3:47 AM, Praveen Sripati wrote:
1.1.1 was just released, and 1.2 is close to a release. That, plus
Thanksgiving in the US (where most Spark committers AFAIK are located),
probably means a temporary lull in committer activity on non-critical items
should be expected.
On Mon Nov 24 2014 at 9:33:27 AM York, Brennon wrote:
> All,
Hi all,
The Spark ML alpha version exists in the current master branch on GitHub.
If we want to add new machine learning algorithms or modify algorithms
that already exist, which package should we implement them in:
org.apache.spark.mllib or org.apache.spark.ml?
thanks,
Yu
--
Yu Ishika
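For comparison, a rough sketch of how the entry points of the two packages
differ in the 1.2-era code; the data sets and parameter values below are
placeholders, and the spark.ml API shown is the alpha pipeline style:

    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.SchemaRDD

    // org.apache.spark.mllib: static-style train() over an RDD[LabeledPoint].
    def trainMllib(data: RDD[LabeledPoint]) =
      LogisticRegressionWithSGD.train(data, 100 /* numIterations */)

    // org.apache.spark.ml (alpha): an Estimator with parameter setters, fit
    // against a SchemaRDD whose "label"/"features" columns follow the
    // pipeline conventions.
    def trainMl(data: SchemaRDD) =
      new LogisticRegression().setMaxIter(100).setRegParam(0.01).fit(data)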
Hi,
There is a bit of inconsistency in the documentation. Which is the correct
statement?
`http://spark.apache.org/docs/latest/spark-standalone.html` says
The standalone cluster mode currently only supports a simple FIFO scheduler
across applications.
while `http://spark.apache.org/docs/latest/job-sc