Re: How to cache SparkPlan.execute for reusing?

2017-03-03 Thread Liang-Chi Hsieh
Not sure what you mean by "its parents have to reuse it by creating new RDDs". Since SparkPlan.execute returns a new RDD every time, you can't expect the cached RDD to be reused automatically, even if you reuse the SparkPlan across several queries. By the way, are there any existing ways to reuse a SparkPlan? sum…
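
A minimal sketch of what manual reuse could look like, assuming Spark 2.x: queryExecution.executedPlan and SparkPlan.execute are developer-facing internals rather than a stable public API, so treat this as illustrative only. The point is that you must hold on to and cache the RDD reference yourself; calling execute() again will not find the cache.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.InternalRow

val spark = SparkSession.builder().appName("plan-cache-sketch").getOrCreate()
val df = spark.range(0, 1000000).selectExpr("id", "id % 10 as key")

// executedPlan is the physical SparkPlan; execute() builds a fresh RDD[InternalRow].
val plan = df.queryExecution.executedPlan

// Copy rows before caching: execute() may reuse mutable InternalRow instances.
val cached: RDD[InternalRow] = plan.execute().map(_.copy()).cache()

println(cached.count())
println(cached.count()) // second count is served from the cached RDD

// Calling plan.execute() again returns a *different* RDD and misses the cache.
```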

Re: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

2017-03-03 Thread Noorul Islam K M
> When Initial jobs have not accepted any resources then what all can be wrong? Going through stackoverflow and various blogs does not help. Maybe need better logging for this? Adding dev

Did you take a look at the Spark UI to see your resource availability?

Thanks and Regards,
Noorul
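
For reference, the usual culprit is an application requesting more memory or cores per executor than any registered worker can offer. A minimal sketch, with hypothetical values, of pinning those requests explicitly (spark.executor.memory and spark.cores.max are standard configuration keys):

```scala
import org.apache.spark.sql.SparkSession

// Request resources explicitly so the app's demands fit the cluster.
// The values below are hypothetical; check the Master UI for real limits.
val spark = SparkSession.builder()
  .appName("resource-fit-sketch")
  .config("spark.executor.memory", "2g") // must fit within a single worker's memory
  .config("spark.cores.max", "4")        // standalone mode: cap on total cores
  .getOrCreate()

// If these exceed what any worker offers, no executor is ever granted and the
// "Initial job has not accepted any resources" warning keeps repeating.
```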

Re: Straw poll: dropping support for things like Scala 2.10

2017-03-03 Thread Sean Owen
Let's track further discussion at https://issues.apache.org/jira/browse/SPARK-19810. I am also in favor of removing Scala 2.10 support, and will open a WIP to discuss the change, but I'm not yet sure whether there are objections or deeper support for this. On Thu, Mar 2, 2017 at 7:51 PM Russell Spi…

Re: Spark join over sorted columns of dataset.

2017-03-03 Thread Koert Kuipers
For RDDs the shuffle is already skipped, but the sort is not. In spark-sorted we track partitioning and sorting within partitions for key-value RDDs and can avoid the sort. See: https://github.com/tresata/spark-sorted. For Dataset/DataFrame such optimizations are done automatically; however, it's curr…
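
To illustrate the shuffle-skipping part with plain core-Spark APIs (nothing spark-sorted-specific; names below are illustrative): when two pair RDDs already share the same partitioner, join reuses that partitioning through narrow dependencies instead of shuffling either side, but the per-partition ordering of keys is still not guaranteed.

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("copartition-join-sketch").getOrCreate()
val sc = spark.sparkContext
val p = new HashPartitioner(8)

// Pre-partition both sides with the same partitioner and cache them.
val left  = sc.parallelize(Seq((1, "a"), (2, "b"))).partitionBy(p).cache()
val right = sc.parallelize(Seq((1, "x"), (2, "y"))).partitionBy(p).cache()

// Both inputs use `p`, so the join becomes a narrow dependency: no shuffle,
// but values within a partition arrive in no particular key order.
val joined = left.join(right)
println(joined.toDebugString) // lineage shows no extra shuffle stage for the join
```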

RE: [Spark Namespace]: Expanding Spark ML under Different Namespace?

2017-03-03 Thread Shouheng Yi
Hi Spark dev list, Thank you all so much for your input; we really appreciated the suggestions. After some discussion within the team, we decided to stay under Apache's namespace for now and to attach comments explaining what we did and why we did it. As the Spark dev list kindly poi…
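
For anyone finding this thread later, the workaround under discussion looks roughly like the sketch below: compiling your own class inside Spark's package tree so that private[spark] / private[ml] members become visible. All names here are hypothetical, and this approach depends on Spark internals that can change between releases.

```scala
// Declaring the file inside org.apache.spark.ml.* grants access to
// private[ml] members when this code is compiled. Illustrative only.
package org.apache.spark.ml.custom

import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.types.StructType

class MyPassThroughTransformer(override val uid: String) extends Transformer {
  def this() = this(Identifiable.randomUID("myPassThrough"))

  // A no-op transform; a real extension would call into ml internals here.
  override def transform(ds: Dataset[_]): DataFrame = ds.toDF()
  override def transformSchema(schema: StructType): StructType = schema
  override def copy(extra: ParamMap): MyPassThroughTransformer = defaultCopy(extra)
}
```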
