Please vote on releasing the following candidate as Apache Spark version
2.0.1. The vote is open until Sunday, Sep 25, 2016 at 23:59 PDT and passes
if a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 2.0.1
[ ] -1 Do not release this package because ...
Yes, I mean local here. Thanks for pointing this out. Also thanks for
explaining the problem.
Did you try the proposed fix? Would be good to know whether it fixes the
issue.
On Thu, Sep 22, 2016 at 2:49 PM, Asher Krim wrote:
> Does anyone know what the status of SPARK-15717 is? It's a simple enough
> looking PR, but there has been no activity on it since June 16th.
>
> I believe that we are hitting that bug with checkpointed distributed LDA.
Does anyone know what the status of SPARK-15717 is? It's a simple enough
looking PR, but there has been no activity on it since June 16th.
I believe that we are hitting that bug with checkpointed distributed LDA.
It's a blocker for us and we would really appreciate getting it fixed.
Jira: https:/
Hash codes should try to avoid collisions between objects that are not
equal. Integer overflow is not an issue by itself.
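A minimal, self-contained sketch (not from the thread, the class here is made up for illustration): Int arithmetic in a hashCode simply wraps around on overflow and never throws, so overflow by itself is harmless; what matters is spreading unequal objects apart.

```scala
class Point(val x: Int, val y: Int) {
  override def hashCode(): Int = 31 * x + y          // may overflow and wrap; still a valid hash
  override def equals(o: Any): Boolean = o match {
    case p: Point => p.x == x && p.y == y
    case _        => false
  }
}

val a = new Point(0, 31)
val b = new Point(1, 0)
println(a.hashCode == b.hashCode)  // true: a collision between unequal objects
println(a == b)                    // false: equals(), not hashCode, decides equality
println(Int.MaxValue + 1)          // -2147483648: overflow wraps, no exception
```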
On Wed, Sep 21, 2016 at 10:49 PM, WangJianfei wrote:
> Thank you very much, sir! But what I want to know is whether the hashcode
> overflow will cause trouble. Thank you!
I looked into this and found the problem. Will send a PR now to fix this.
If you are curious about what is happening here: When we build the
docs separately we don't have the JAR files from the Spark build in
the same tree. We added a new set of docs recently in SparkR called an
R vignette that ru
FWIW it worked for me, but I may not be executing the same thing. I
was running the commands given in R/DOCUMENTATION.md
It succeeded for me in creating the vignette, on branch-2.0.
Maybe it's a version or library issue? What R version do you have installed,
and are you up to date with packages like devtools?
Hi,
I have a Spark resource scheduling order question that came up when I read this code:
github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala
In the schedule() function, Spark starts drivers first, then starts executors.
I'm wondering why we schedule in this order. Wi
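The message is cut off in the archive, but here is a toy, self-contained sketch of the ordering it asks about (drivers are placed on workers first, executors only get what remains); every name below is made up for illustration and this is not the real Master code:

```scala
// Toy model only; the real logic lives in Master.schedule() and startExecutorsOnWorkers().
case class Worker(id: String, var freeCores: Int, var freeMemMb: Int)
case class DriverReq(cores: Int, memMb: Int)
case class ExecutorReq(cores: Int, memMb: Int)

def schedule(workers: Seq[Worker],
             drivers: Seq[DriverReq],
             executors: Seq[ExecutorReq]): Unit = {
  // 1. Drivers first: in standalone cluster mode the driver itself runs on a worker,
  //    so it must be placed before executor cores are handed out.
  for (d <- drivers; w <- workers.find(w => w.freeCores >= d.cores && w.freeMemMb >= d.memMb)) {
    w.freeCores -= d.cores; w.freeMemMb -= d.memMb
    println(s"driver   -> ${w.id}")
  }
  // 2. Executors get whatever cores and memory are left over.
  for (e <- executors; w <- workers.find(w => w.freeCores >= e.cores && w.freeMemMb >= e.memMb)) {
    w.freeCores -= e.cores; w.freeMemMb -= e.memMb
    println(s"executor -> ${w.id}")
  }
}

schedule(
  workers   = Seq(Worker("w1", 8, 8192), Worker("w2", 8, 8192)),
  drivers   = Seq(DriverReq(1, 1024)),
  executors = Seq(ExecutorReq(4, 4096), ExecutorReq(4, 4096)))
```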
Hi,
I've just discovered* that I can SerDe my case classes. What a nice
feature, which I can use in spark-shell, too! Thanks a lot for offering
me so much fun!
What I don't really like about the code is the following part (esp.
that it conflicts with the implicit for Column):
import org.apache.sp
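A minimal sketch of the kind of case-class SerDe available in spark-shell via Dataset encoders (assuming that is what the snippet above refers to; the quoted import and code are cut off in the archive, so the class and column names below are only illustrative):

```scala
import org.apache.spark.sql.SparkSession

case class Person(name: String, age: Int)

val spark = SparkSession.builder().master("local[*]").appName("encoders").getOrCreate()
import spark.implicits._            // implicit Encoder[Person] plus the $"col" Column syntax

val ds = Seq(Person("Ann", 30), Person("Bob", 25)).toDS()  // encoded, not Java-serialized
ds.filter(_.age > 26).show()        // typed filter on the case class
ds.filter($"age" > 26).show()       // same filter expressed through Column
```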
I am planning to write a thesis on certain aspects (e.g. testing, performance
optimisation, security) of Apache Spark. I need to study some projects that
are based on Apache Spark and are available as open source.
If you know of any such project (an open-source, Spark-based project), please
share it here.
There can be just one published version of the Spark artifacts and they
have to depend on something, though in truth they'd be binary-compatible
with anything 2.2+. So you merely manage the dependency versions up to the
desired version in your .
On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
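To make the above concrete, here is a hedged sbt sketch of managing the dependency versions in your own build; the artifact names and version numbers are only examples, not what the thread settled on:

```scala
// build.sbt sketch: compile against the published Spark artifacts you want,
// and pin or override any transitive dependency version in your own build.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.0.1" % "provided"
)
// Forcing a particular transitive version (illustrative only):
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.5"
```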
You should also take into account that Spark has different options for representing
data in memory, such as Java serialized objects, Kryo serialized objects, Tungsten
(columnar, optionally compressed), etc. The Tungsten format depends heavily on the
underlying data and sorting, especially if compressed.
Then, yo
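For instance, the RDD-level serializer is just a config switch; a minimal sketch (the Record class registered here is hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}

case class Record(id: Long, payload: Array[Double])    // hypothetical class to register

val conf = new SparkConf()
  .setAppName("kryo-example")
  .setMaster("local[*]")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[Record]))          // avoids writing full class names per object

val sc = new SparkContext(conf)
// Serialized in-memory caching (e.g. persist(StorageLevel.MEMORY_ONLY_SER)) then uses Kryo;
// DataFrames/Datasets use the Tungsten binary format independently of this setting.
```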
zipWithIndex is fine. It will give you unique row IDs across your various
partitions.
You can also use zipWithUniqueId, which saves the extra job that zipWithIndex
fires. However, there are some differences in how indices are assigned to
rows. You can read more about the two APIs in the
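A small sketch of the difference, assuming the spark-shell SparkContext `sc` and a 2-partition RDD:

```scala
val rdd = sc.parallelize(Seq("a", "b", "c", "d"), 2)

// zipWithIndex: consecutive 0..n-1 indices, but when there is more than one
// partition Spark first runs an extra job to count elements per partition.
rdd.zipWithIndex().collect()
// Array((a,0), (b,1), (c,2), (d,3))

// zipWithUniqueId: no extra job; element k of partition p gets id p + k * numPartitions,
// so ids are unique but not consecutive and do not follow the global order.
rdd.zipWithUniqueId().collect()
// Array((a,0), (b,2), (c,1), (d,3))
```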