Hello,
When I used to submit a job with Spark 1.4, it would return a job ID and a
status such as RUNNING or FAILED. I just upgraded to 1.6 and spark-submit no
longer returns a status. Is there a way to get this information back?
When I submit a job I want to know which one it is.
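A minimal sketch of one way to get a status back in 1.6 is the launcher API
(org.apache.spark.launcher); the jar path, main class and master URL below are
placeholders:

import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

// Launch the application programmatically instead of shelling out to
// spark-submit; the handle reports the submission's state.
val handle: SparkAppHandle = new SparkLauncher()
  .setAppResource("/path/to/my-app.jar")      // placeholder jar
  .setMainClass("com.example.MyApp")          // placeholder main class
  .setMaster("spark://master:7077")           // placeholder master URL
  .startApplication()

println(handle.getState)   // e.g. CONNECTED, RUNNING, FINISHED, FAILED
println(handle.getAppId)   // application ID once the app has registered

startApplication() also accepts SparkAppHandle.Listener callbacks if you want
to be notified on state changes rather than polling.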
In Spark 1.6, if I do this (the column name has a dot in it, but it is not a
nested column):

df = df.withColumn("raw.hourOfDay", df.col("`raw.hourOfDay`"))

scala> df = df.withColumn("raw.hourOfDay", df.col("`raw.hourOfDay`"))
org.apache.spark.sql.AnalysisException: cannot resolve 'raw.minOfDay' given input colu
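A minimal workaround sketch, assuming the goal is just to keep working with the
dotted column: select it with backticks and alias it to a dot-free name
(raw_hourOfDay is only an illustration):

// Backticks make Spark treat "raw.hourOfDay" as a literal column name rather
// than a nested field; aliasing to a dot-free name avoids the quoting later.
val renamed = df.select(df.col("`raw.hourOfDay`").alias("raw_hourOfDay"))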
Hello,
In MLlib with Spark 1.4, I was able to evaluate a model by loading it and
calling `predict` on a vector of features. I would train on Spark but use the
model in my own workflow.
In `spark.ml` it seems like the only way to evaluate is to use `transform`,
which only takes a DataFrame. To build a DataFrame
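To sketch that last step under Spark 1.6 APIs, assuming a previously trained
spark.ml model named `model` and an existing SparkContext `sc` (the feature
values are placeholders):

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.SQLContext

// Wrap a single feature vector in a one-row DataFrame so that a spark.ml
// Transformer/PipelineModel can score it with transform().
val sqlContext = new SQLContext(sc)
val single = sqlContext.createDataFrame(Seq(
  (0L, Vectors.dense(1.0, 0.5, 3.2))          // placeholder features
)).toDF("id", "features")

val scored = model.transform(single)
scored.select("prediction").show()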
Hi,
I'm working on a Spark Streaming application and I would like to know what the
best storage is to use for checkpointing.
For testing purposes we are using NFS shared between the worker, the master and
the driver program (in client mode),
but we have some issues with the CheckpointWriter (1 thread
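For reference, a minimal sketch of pointing checkpointing at a fault-tolerant
store such as HDFS, which is what the streaming guide recommends over local or
NFS paths; the namenode address and directory are placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Checkpoint to a fault-tolerant filesystem (HDFS here) instead of NFS.
val conf = new SparkConf().setAppName("streaming-checkpoint-demo")
val ssc = new StreamingContext(conf, Seconds(10))
ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoints")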
>
> My $0.02,
> dean
this:
myRdd.filter(t => t._1.equals(param))
If I do a collect to get the only "tuple", it takes about 12 seconds to
execute; I imagine that's because Spark is meant to be used differently...
Best regards,
Emmanuel
> It will never be efficient like a database lookup since this is
> implemented by scanning through all of the data. There is no index or
> anything.
>
> On Tue, Aug 19, 2014 at 8:43 AM, Emmanuel Castanier
> wrote:
>> Hi all,
>>
>> I'm a total newbie on Spark,
It did the job.
Thanks. :)
On 19 Aug 2014 at 10:20, Sean Owen wrote:
> In that case, why not collectAsMap() and have the whole result as a
> simple Map in memory? Then lookups are trivial. RDDs aren't
> distributed maps.
>
> On Tue, Aug 19, 2014 at 9:17 AM, Emmanue
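A minimal sketch of that suggestion, assuming the pair RDD fits in driver
memory (myRdd and param are the names from the original question):

// collectAsMap() materializes the pair RDD as an in-memory Map on the driver,
// so each subsequent lookup is a hash lookup instead of a full scan of the data.
val lookup = myRdd.collectAsMap()
val value = lookup.get(param)   // Option[...] returned immediately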