Dear Spark users,
My team is working on a small library that builds on PySpark and is organized
like PySpark as well -- it has a JVM component (that runs in the Spark driver
and executor) and a Python component (that runs in the PySpark driver and
executor processes). What's a good approach for
Hi all,
I am thinking of starting work on a profiler for Spark clusters. The current
idea is that it would collect jstacks from executor nodes and put them into
a central index (either a database or Elasticsearch), and present them in a
UI that lets people slice and dice th
https://github.com/apache/spark/pull/119
Hi all,
The Maven Central repo contains an artifact for Spark 0.9.0 built against
unmodified Hadoop, and the Cloudera repo contains an artifact for Spark
0.9.0 built against the CDH 5 beta. Is there a repo that contains spark-core
built against a non-beta version of CDH (such as 4.4.0)?
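For reference, this is roughly the sbt configuration I'm after; the resolver
URL and the CDH-flavored version string below are only illustrative guesses
on my part, not coordinates I know to exist:

// build.sbt sketch -- the version string is hypothetical
resolvers += "Cloudera repository" at
  "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies +=
  "org.apache.spark" %% "spark-core" % "0.9.0-cdh4.4.0"  // illustrative only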
Punya
xtras-v2.jar")
print(sc2.filter(/* fn that depends on jar */).count)
}
... even if classes in extras-v1.jar and extras-v2.jar have name collisions.
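To make the intent concrete, here is a rough sketch of the usage I have in
mind (the jar paths, input file, and filter bodies are placeholders, and I'm
not claiming Spark provides this isolation today):

import org.apache.spark.{SparkConf, SparkContext}

// Each context registers its own copy of the user jar; the goal is that the
// closure run under sc1 resolves classes from extras-v1.jar while the closure
// run under sc2 resolves them from extras-v2.jar. Only one SparkContext can
// be active per JVM, hence the stop() in between.
val sc1 = new SparkContext(
  new SparkConf().setMaster("local").setAppName("v1").setJars(Seq("extras-v1.jar")))
print(sc1.textFile("data.txt").filter(line => /* fn that depends on extras-v1.jar */ true).count)
sc1.stop()

val sc2 = new SparkContext(
  new SparkConf().setMaster("local").setAppName("v2").setJars(Seq("extras-v2.jar")))
print(sc2.textFile("data.txt").filter(line => /* fn that depends on extras-v2.jar */ true).count)
sc2.stop()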
Punya
Hi all,
I'm trying to use Spark to support users who are interactively refining the
code that processes their data. As a concrete example, I might create an
RDD[String] and then write several versions of a function to map over the
RDD until I'm satisfied with the transformation. Right now, once I