Re: Unsubscribe

2017-01-05 Thread Kyle Kelley
You are now in position 368 for unsubscription. If you wish for your unsubscription to occur immediately, please email dev-unsubscr...@spark.apache.org

Re: Tests failing with GC limit exceeded

2017-01-05 Thread shane knapp
unsurprisingly, we had another GC: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70949/console So, definitely not the system (everything looks hunky-dory on the build node). > It can always be some memory leak; if we increase the memory settings > and OOMs still happen, that

Unsubscribe

2017-01-05 Thread write2sivakumar@gmail
Sent from my Samsung device

Re: Converting an InternalRow to a Row

2017-01-05 Thread Liang-Chi Hsieh
Can you show how you use the encoder in your UDAF? Andy Dang wrote > One more question about the behavior of ExpressionEncoder. > I have a UDAF that has ExpressionEncoder as a member variable. > However, if I call resolveAndBind() eagerly on this encoder, it appears to break the UDAF

Re: Tests failing with GC limit exceeded

2017-01-05 Thread Kay Ousterhout
But is there any non-memory-leak reason why the tests should need more memory? In theory each test should be cleaning up its own SparkContext etc., right? My memory is that OOM issues in the tests in the past have been indicative of memory leaks somewhere. I do agree that it doesn't seem likely
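The cleanup Kay refers to typically looks like the following ScalaTest idiom; a minimal sketch of the pattern, not code from any actual Spark suite:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.scalatest.{BeforeAndAfterEach, FunSuite}

    class ExampleSuite extends FunSuite with BeforeAndAfterEach {
      private var sc: SparkContext = _

      override def beforeEach(): Unit = {
        sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("test"))
      }

      override def afterEach(): Unit = {
        // A suite that skips this leaks one SparkContext per test, which is
        // exactly the kind of accumulation that trips the GC overhead limit.
        if (sc != null) sc.stop()
      }

      test("example") {
        assert(sc.parallelize(1 to 10).count() === 10L)
      }
    }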

Re: Spark GraphFrame ConnectedComponents

2017-01-05 Thread Joseph Bradley
Would it be more robust to use the Path when creating the FileSystem? https://github.com/graphframes/graphframes/issues/160 On Thu, Jan 5, 2017 at 4:57 PM, Felix Cheung wrote: > This is likely a factor of your hadoop config and Spark rather than > anything specific with GraphFrames. > > You migh
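The distinction Joseph is pointing at, sketched against the plain Hadoop API (not the GraphFrames code itself; the bucket name is hypothetical):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val conf = new Configuration()
    val checkpointDir = new Path("s3n://my-bucket/checkpoints")

    // FileSystem.get(conf) resolves the *default* filesystem from the config
    // (often the local one), ignoring the scheme on checkpointDir:
    val defaultFs = FileSystem.get(conf)

    // Deriving the filesystem from the Path honors the s3n:// scheme:
    val pathFs = checkpointDir.getFileSystem(conf)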

Re: Tests failing with GC limit exceeded

2017-01-05 Thread Marcelo Vanzin
On Thu, Jan 5, 2017 at 4:58 PM, Kay Ousterhout wrote: > But is there any non-memory-leak reason why the tests should need more > memory? In theory each test should be cleaning up its own SparkContext > etc., right? My memory is that OOM issues in the tests in the past have been > indicative of memory leaks somewhere.

Re: Spark GraphFrame ConnectedComponents

2017-01-05 Thread Felix Cheung
This is likely a factor of your hadoop config and Spark rather than anything specific with GraphFrames. You might have better luck getting assistance if you could isolate the code to a simple case that manifests the problem (without GraphFrames), and repost. Fr

Re: Tests failing with GC limit exceeded

2017-01-05 Thread Marcelo Vanzin
Seems like the OOM is coming from tests, which most probably means it's not an infrastructure issue. Maybe tests just need more memory these days and we need to update maven / sbt scripts. On Thu, Jan 5, 2017 at 1:19 PM, shane knapp wrote: > as of first thing this morning, here's the list of rece

Re: Tests failing with GC limit exceeded

2017-01-05 Thread Kay Ousterhout
Thanks for looking into this, Shane! On Thu, Jan 5, 2017 at 1:19 PM, shane knapp wrote: > as of first thing this morning, here's the list of recent GC overhead > build failures: > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70891/console > https://amplab.cs.berkeley.edu/

Re: Spark GraphFrame ConnectedComponents

2017-01-05 Thread Ankur Srivastava
Adding the DEV mailing list to see if this is a defect with ConnectedComponents or if they can recommend any solution. Thanks Ankur On Thu, Jan 5, 2017 at 1:10 PM, Ankur Srivastava wrote: > Yes, I did try it out and it chooses the local file system even though my checkpoint > location starts with s3n:// > > I
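For context, the checkpoint directory is set on the SparkContext before the algorithm runs; a minimal sketch assuming the GraphFrames 0.3 API, with a made-up bucket name:

    import org.apache.spark.sql.SparkSession
    import org.graphframes.GraphFrame

    val spark = SparkSession.builder().getOrCreate()
    spark.sparkContext.setCheckpointDir("s3n://my-bucket/checkpoints")  // hypothetical bucket

    val g: GraphFrame = ???  // an existing GraphFrame with vertices and edges
    // connectedComponents checkpoints intermediate results under the directory above
    val components = g.connectedComponents.run()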

Re: Spark Improvement Proposals

2017-01-05 Thread Tim Hunter
Hi Cody, thank you for bringing up this topic. I agree it is very important to keep a cohesive community around some common, fluid goals. Here are a few comments about the current document: 1. name: it should not overlap with an existing one such as SIP. Can you imagine someone trying to discuss a

Re: ml word2vec findSynonyms return type

2017-01-05 Thread Joseph Bradley
We returned a DataFrame since it is a nicer API, but I agree forcing RDD operations is not ideal. I'd be OK with adding a new method, but I agree with Felix that we cannot break the API for something like this. On Thu, Jan 5, 2017 at 12:44 PM, Felix Cheung wrote: > Given how Word2Vec is used th
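For anyone following the thread, the ml API under discussion looks like this today; a sketch assuming a fitted Spark 2.x Word2VecModel:

    import org.apache.spark.ml.feature.Word2VecModel

    val model: Word2VecModel = ???  // a fitted model

    // findSynonyms returns a DataFrame with columns "word" and "similarity",
    // so getting a plain local array forces DataFrame/RDD machinery:
    val synonyms = model.findSynonyms("spark", 5)
    val local: Array[(String, Double)] =
      synonyms.collect().map(r => (r.getString(0), r.getDouble(1)))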

Re: Tests failing with GC limit exceeded

2017-01-05 Thread shane knapp
as of first thing this morning, here's the list of recent GC overhead build failures: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70891/console https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70874/console https://amplab.cs.berkeley.edu/jenkins/job/SparkPul

Re: ml word2vec findSynonyms return type

2017-01-05 Thread Felix Cheung
Given how Word2Vec is used in the pipeline model in the new ml implementation, we might need to keep the current behavior? https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/Word2VecExample.scala _ From: Asher Krim mailto:a

Re: Spark SQL - Applying transformation on a struct inside an array

2017-01-05 Thread Olivier Girardot
So, it seems the only way I found for now is recursive handling of the Row instances directly, but to do that I have to go back to RDDs. I've put together a simple test case demonstrating the problem: import org.apache.spark.sql.{DataFrame, SparkSession} import org.scalatest.{FlatSpec, Matchers}
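The shape of the problem, as far as the truncated snippet shows, is roughly the following; my own minimal reconstruction, not Olivier's actual test case:

    import org.apache.spark.sql.{Row, SparkSession}

    val spark = SparkSession.builder().master("local[*]").appName("nested").getOrCreate()
    import spark.implicits._

    // items is an array<struct<_1:string,_2:int>> column.
    val df = Seq((1, Seq(("a", 1), ("b", 2)))).toDF("id", "items")

    // Lacking a column-level transform for fields of structs inside an array,
    // the workaround is to drop to the RDD and rebuild every Row by hand:
    val rewritten = df.rdd.map { row =>
      val items = row.getSeq[Row](1).map(item => Row(item.getString(0), item.getInt(1) * 10))
      Row(row.getInt(0), items)
    }
    val result = spark.createDataFrame(rewritten, df.schema)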

Re: Converting an InternalRow to a Row

2017-01-05 Thread Andy Dang
One more question about the behavior of ExpressionEncoder. I have a UDAF that has ExpressionEncoder as a member variable. However, if I call resolveAndBind() eagerly on this encoder, it appears to break the UDAF. Basically, somehow the deserialized rows are all the same during the merge step. Is it t
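For the subject line itself, the conversion in Spark 2.1 looks roughly like this; a sketch built on catalyst internals, which are not a stable public API:

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.encoders.RowEncoder
    import org.apache.spark.sql.types._

    val schema = StructType(Seq(
      StructField("id", IntegerType),
      StructField("name", StringType)))

    // resolveAndBind() binds the deserializer to the schema's own attributes;
    // Andy's report is that doing this once, eagerly, and reusing the bound
    // encoder across merge calls produces identical deserialized rows.
    val encoder = RowEncoder(schema).resolveAndBind()
    val internalRow: InternalRow = ???  // e.g. produced inside a UDAF
    val row: Row = encoder.fromRow(internalRow)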

unsubscribe

2017-01-05 Thread Vincent Frochot
Vincent

unsubscribe

2017-01-05 Thread Nikola Z

Re: Converting an InternalRow to a Row

2017-01-05 Thread Andy Dang
Perfect. The API in Java is a bit clumsy though. What I ended up doing in Java (the val is from lombok, if anyone's wondering): val attributes = JavaConversions.asJavaCollection(schema.toAttributes()).stream().map(Attribute::toAttribute).collect(Collectors.toList()); val encoder = Row