Not sure what you mean by "its parents have to reuse it by creating new
RDDs".
Since SparkPlan.execute returns a new RDD every time it is called, you can't
expect the cached RDD to be reused automatically, even if you reuse the same
SparkPlan across several queries.
Btw, is there any existing way to reuse a SparkPlan?
> When initial jobs have not accepted any resources, what could be wrong?
> Going through Stack Overflow and various blogs does not help. Maybe we
> need better logging for this? Adding dev.
>
Did you take a look at the Spark UI to see your resource availability?
Thanks and Regards
Noorul
Let's track further discussion at
https://issues.apache.org/jira/browse/SPARK-19810
I am also in favor of removing Scala 2.10 support, and will open a WIP to
discuss the change, but am not yet sure whether there are objections or
deeper support for this.
On Thu, Mar 2, 2017 at 7:51 PM Russell Spi
For RDDs the shuffle is already skipped, but the sort is not. In spark-sorted
we track partitioning and sorting within partitions for key-value RDDs and
can avoid the sort. See:
https://github.com/tresata/spark-sorted
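The same idea can be sketched with plain core-Spark APIs (this is not the
spark-sorted API itself; names below are illustrative):
repartitionAndSortWithinPartitions performs the one shuffle and sorts each
partition as part of it, so downstream per-key logic needs no second sort.

import org.apache.spark.HashPartitioner
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("sorted").getOrCreate()
val pairs = spark.sparkContext.parallelize(Seq(("b", 2), ("a", 1), ("b", 1), ("a", 3)))

// One shuffle that also sorts within each partition by key:
val sorted = pairs.repartitionAndSortWithinPartitions(new HashPartitioner(2))

// Each partition now arrives key-sorted, so per-key streaming logic
// (e.g. grouping runs of equal keys) can proceed without re-sorting:
sorted.glom().collect().foreach(p => println(p.mkString(", ")))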
For Dataset/DataFrame such optimizations are done automatically; however,
it's curr
Hi Spark dev list,
Thank you all so much for your input. We really appreciate the suggestions.
After some discussion within the team, we decided to stay under Apache's
namespace for now and to attach comments explaining what we did and why we
did it.
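As a hypothetical sketch of what such a comment might look like (package and
class names below are made up; a common reason for staying under
org.apache.spark is access to Spark's package-private internals):

// NOTE: This class lives under org.apache.spark.* only because it needs
// Spark's package-private internals; it is maintained by our team and is
// not part of Apache Spark itself.
package org.apache.spark.sql.execution  // hypothetical package choice

class OurInternalHelper {  // hypothetical class name
  // ... implementation against package-private Spark APIs ...
}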
As the Spark dev list kindly poi