unsurprisingly, we had another GC:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70949/console
so, definitely not the system (everything looks hunky dory on the build node).
> It can always be some memory leak; if we increase the memory settings
> and OOMs still happen, that
Can you show how you use the encoder in your UDAF?
Andy Dang wrote
> One more question about the behavior of ExpressionEncoder.
>
> I have a UDAF that has ExpressionEncoder as a member variable.
>
> However, if I call resolveAndBind() eagerly on this encoder, it appears to
> break the UDAF.
But is there any non-memory-leak reason why the tests should need more
memory? In theory each test should be cleaning up its own Spark Context
etc. right? My memory is that OOM issues in the tests in the past have been
indicative of memory leaks somewhere.
I do agree that it doesn't seem likely
Would it be more robust to use the Path when creating the FileSystem?
https://github.com/graphframes/graphframes/issues/160
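A minimal sketch of the suggestion above (assumes hadoop-common on the classpath; the bucket name is hypothetical). `FileSystem.get(conf)` returns whatever `fs.defaultFS` points at, often the local filesystem, while `Path.getFileSystem(conf)` resolves the scheme embedded in the path itself, e.g. `s3n://`:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()

// Hypothetical checkpoint location; actually resolving s3n:// additionally
// requires the S3 filesystem implementation on the classpath.
val checkpointPath = new Path("s3n://my-bucket/checkpoints")

// The scheme comes from the path, not from fs.defaultFS:
assert(checkpointPath.toUri.getScheme == "s3n")

// The more robust construction under discussion:
// val fs: FileSystem = checkpointPath.getFileSystem(conf)
```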
On Thu, Jan 5, 2017 at 4:57 PM, Felix Cheung
wrote:
> This is likely a factor of your hadoop config and Spark rather than
> anything specific with GraphFrames.
>
> You migh
On Thu, Jan 5, 2017 at 4:58 PM, Kay Ousterhout wrote:
> But is there any non-memory-leak reason why the tests should need more
> memory? In theory each test should be cleaning up its own Spark Context
> etc. right? My memory is that OOM issues in the tests in the past have been
> indicative of m
This is likely a factor of your hadoop config and Spark rather than anything
specific with GraphFrames.
You might have better luck getting assistance if you could isolate the code to
a simple case that manifests the problem (without GraphFrames), and repost.
Seems like the OOM is coming from tests, which most probably means
it's not an infrastructure issue. Maybe tests just need more memory
these days and we need to update maven / sbt scripts.
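For the sbt side, a hedged sketch of what "more memory" could look like in a build definition (the exact flags and their location in Spark's build are assumptions; Spark configures forked test JVMs in project/SparkBuild.scala, and the Maven side via the surefire argLine):

```scala
// build.sbt (sbt 0.13 syntax) -- options apply only to forked test JVMs
fork in Test := true
javaOptions in Test ++= Seq("-Xmx4g", "-XX:ReservedCodeCacheSize=512m")
```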
On Thu, Jan 5, 2017 at 1:19 PM, shane knapp wrote:
> as of first thing this morning, here's the list of rece
Thanks for looking into this Shane!
On Thu, Jan 5, 2017 at 1:19 PM, shane knapp wrote:
> as of first thing this morning, here's the list of recent GC overhead
> build failures:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70891/
> console
> https://amplab.cs.berkeley.edu/
Adding DEV mailing list to see if this is a defect with ConnectedComponent
or if they can recommend any solution.
Thanks
Ankur
On Thu, Jan 5, 2017 at 1:10 PM, Ankur Srivastava wrote:
> Yes I did try it out and it chooses the local file system as my checkpoint
> location starts with s3n://
>
> I
Hi Cody,
thank you for bringing up this topic, I agree it is very important to keep
a cohesive community around some common, fluid goals. Here are a few
comments about the current document:
1. name: it should not overlap with an existing one such as SIP. Can you
imagine someone trying to discuss a
We returned a DataFrame since it is a nicer API, but I agree forcing RDD
operations is not ideal. I'd be OK with adding a new method, but I agree
with Felix that we cannot break the API for something like this.
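For context, a sketch of the friction being discussed (assumes the spark.ml `Word2VecModel`; the helper name is my own): `getVectors` returns a DataFrame of (word, vector), so callers wanting map-like access have to drop down to the RDD API:

```scala
import org.apache.spark.ml.feature.Word2VecModel
import org.apache.spark.ml.linalg.Vector

// Illustration only: collecting the word vectors into a local Map forces
// RDD operations on the DataFrame returned by getVectors.
def vectorsAsMap(model: Word2VecModel): Map[String, Vector] =
  model.getVectors.rdd
    .map(r => (r.getString(0), r.getAs[Vector](1)))
    .collect()
    .toMap
```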
On Thu, Jan 5, 2017 at 12:44 PM, Felix Cheung
wrote:
> Given how Word2Vec is used th
as of first thing this morning, here's the list of recent GC overhead
build failures:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70891/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70874/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPul
Given how Word2Vec is used in the pipeline model in the new ml implementation,
we might need to keep the current behavior?
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/Word2VecExample.scala
From: Asher Krim
So, it seems the only way I found for now is recursive handling of the Row
instances directly, but to do that I have to go back to RDDs. I've put together
a simple test case demonstrating the problem:
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.scalatest.{FlatSpec, Matchers}
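A hedged sketch of what "recursive handling of the Row instances" could look like (the imports above are from the thread's test case; this helper is an illustration, not the author's code): walking a Row whose fields may themselves be nested Rows.

```scala
import org.apache.spark.sql.Row

// Recursively flatten a Row, descending into nested struct fields.
def flattenRow(row: Row): Seq[Any] =
  row.toSeq.flatMap {
    case nested: Row => flattenRow(nested) // struct field: recurse
    case other       => Seq(other)         // leaf value: keep as-is
  }
```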
One more question about the behavior of ExpressionEncoder.
I have a UDAF that has ExpressionEncoder as a member variable.
However, if I call resolveAndBind() eagerly on this encoder, it appears to
break the UDAF. Basically somehow the deserialized rows are all the same
during the merge step. Is it t
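A minimal sketch of the pattern in question (Spark 2.x internal API; method names such as fromRow differ in later versions; the case class is hypothetical). The symptom of identical deserialized values during merge is consistent with the bound encoder reusing a mutable row buffer; that diagnosis is an assumption, but the usual defensive pattern is to bind lazily and copy each InternalRow before deserializing it:

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

case class Agg(key: String, count: Long) // hypothetical aggregation buffer

val enc: ExpressionEncoder[Agg] = ExpressionEncoder[Agg]()

// Bind lazily (e.g. per task) rather than eagerly in the UDAF constructor:
lazy val bound = enc.resolveAndBind()

// Inside merge(): copy the row first, since the underlying buffer may be
// reused across calls, which would make every stored value look the same.
def decode(row: InternalRow): Agg = bound.fromRow(row.copy())
```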
Vincent
Perfect. The API in Java is a bit clumsy though.
What I ended up doing in Java (the val is from lombok, if anyone's
wondering):
val attributes =
JavaConversions.asJavaCollection(schema.toAttributes()).stream().map(Attribute::toAttribute).collect(Collectors.toList());
val encoder =
Row