Hm, thanks. I think what you are suggesting is AWS EMR. However, my questions were about spark-ec2. For our use cases, which rely on spot instances, EMR could potentially double or triple the price because of the premium it charges on top of the EC2 instance price.
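To make the cost concern concrete, here is a rough sketch with made-up numbers. EMR adds a per-instance-hour fee on top of the EC2 price, so when the spot price is low the fee can dominate the bill. The prices below and the helper name `emr_cost_multiplier` are assumptions for illustration only, not actual AWS quotes.

```python
def emr_cost_multiplier(spot_price, emr_fee):
    """Return (spot price + EMR fee) / spot price, per instance-hour."""
    if spot_price <= 0:
        raise ValueError("spot_price must be positive")
    return (spot_price + emr_fee) / spot_price


# Hypothetical per-instance-hour prices in USD (not real quotes).
spot_price = 0.05
emr_fee = 0.07

multiplier = emr_cost_multiplier(spot_price, emr_fee)
print(f"EMR on spot costs about {multiplier:.1f}x the bare spot price")
```

The lower the spot price falls relative to the fixed fee, the larger this multiplier gets, which is why the premium matters more for spot-heavy clusters than for on-demand ones.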
Thanks anyway!

On Wed, Jan 27, 2016 at 2:12 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:

> you can use EMR-4.3.0 running on spot instances to control the price
>
> yes, you can add/remove instances to the cluster on the fly (CORE instances
> support add only, TASK instances - add and remove)
>
> On Wed, Jan 27, 2016 at 2:07 PM, Sung Hwan Chung <coded...@cs.stanford.edu>
> wrote:
>
>> I noticed that in the main branch, the ec2 directory along with the
>> spark-ec2 script is no longer present.
>>
>> Is spark-ec2 going away in the next release? If so, what would be the
>> best alternative at that time?
>>
>> A couple more questions:
>> 1. Is there any way to add/remove additional workers while the cluster is
>> running without stopping/starting the EC2 cluster?
>> 2. For 1, if no such capability is provided with the current script, do
>> we have to write it ourselves? Or is there any plan in the future to add
>> such functions?
>> 3. In PySpark, is it possible to dynamically change driver/executor
>> memory, number of cores per executor without having to restart it? (e.g.
>> via changing sc configuration or recreating sc?)
>>
>> Our ideal scenario is to keep running PySpark (in our case, as a
>> notebook) and connect/disconnect to any spark clusters on demand.