Fwd: 2014 Mesos community survey results

2014-06-24 Thread Andy Konwinski
I think it's cool that the Mesos team did a survey of usage and published the aggregate results. It would be cool to do a survey for the Spark project and publish the results on the Spark website like the Mesos team did. -- Forwarded message -- From: "Dave Lester" Date: Jun 24, 201

Re: Checkpointed RDD still causing StackOverflow

2014-06-24 Thread dash
Due to SPARK-2245, you cannot use count to materialize a VertexRDD. That actually materializes the PartitionRDD, so checkpointing the VertexRDD won't work. I'm trying to fix that right now. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Checkpointed-RDD-still

Re: balancing RDDs

2014-06-24 Thread Mayur Rustagi
This would be really useful, especially for Shark, where a shift in partitioning affects all subsequent queries unless task scheduling time beats spark.locality.wait. It can cause overall low performance for all subsequent tasks. Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur

Re: Checkpointed RDD still causing StackOverflow

2014-06-24 Thread Mayur Rustagi
Do not call collect, as that will perform materialization as well as transfer the data to the driver (which might actually cause the driver to fail if the data is huge). You have to materialize the RDD in some way (call save, count, collect). Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @may

Re: RFC: [SPARK-529] Create constants for known config variables.

2014-06-24 Thread Marcelo Vanzin
Hi Matei, thanks for the comments. On Mon, Jun 23, 2014 at 7:58 PM, Matei Zaharia wrote: > When we did the configuration pull request, we actually avoided having a big > list of defaults in one class file, because this creates a file that all the > components in the project depend on. For examp