Sending this to the list again because I'm pretty sure it didn't work the first time. A colleague just realized he was having the same problem with the list not accepting his posts, but unsubscribing and re-subscribing seemed to fix the issue for him. I've just unsubscribed and re-subscribed too, so hopefully this works...
On Wednesday, December 2, 2015, Jonathan Kelly <jonathaka...@gmail.com> wrote:

> EMR is currently running a private preview of an upcoming feature allowing
> EMR clusters to be launched in VPC private subnets. This will allow you to
> launch a cluster in a subnet without an Internet Gateway attached. Please
> contact jonfr...@amazon.com if you would like more information.
>
> ~ Jonathan
>
> Note: jonfr...@amazon.com is not me. I'm a different Jonathan. :)
>
> On Wed, Dec 2, 2015 at 10:21 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi Dana,
>>
>> Yes, we got VPC + EMR working, but I'm not the person who deploys it. It
>> is related to the subnet, as Alex points out.
>>
>> I just want to add another point: spark-ec2 is nice to keep and improve
>> because it allows users to use any version of Spark (a nightly build, for
>> example). EMR does not allow you to do that without a manual process.
>>
>> Best Regards,
>>
>> Jerry
>>
>> On Wed, Dec 2, 2015 at 1:02 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>>
>>> Do you think it's a security issue if EMR is started in a VPC with a
>>> subnet having Auto-assign Public IP: Yes?
>>>
>>> You can remove all Inbound rules having 0.0.0.0/0 Source in the master
>>> and slave Security Groups.
>>> So, master and slave boxes will be accessible only to users who are on
>>> the VPN.
>>>
>>> On Wed, Dec 2, 2015 at 9:44 AM, Dana Powers <dana.pow...@gmail.com> wrote:
>>>
>>>> EMR was a pain to configure on a private VPC last I tried. Has anyone
>>>> had success with that? I found spark-ec2 easier to use w private
>>>> networking, but also agree that I would use for prod.
>>>>
>>>> -Dana
>>>> On Dec 1, 2015 12:29 PM, "Alexander Pivovarov" <apivova...@gmail.com> wrote:
>>>>
>>>>> 1. EMR 4.2.0 has Zeppelin as an alternative to Databricks Notebooks
>>>>>
>>>>> 2. EMR has Ganglia 3.6.0
>>>>>
>>>>> 3. EMR has hadoop fs settings to make s3 work fast
>>>>> (direct.EmrFileSystem)
>>>>>
>>>>> 4. EMR has s3 keys in hadoop configs
>>>>>
>>>>> 5. EMR allows you to resize the cluster on the fly.
>>>>>
>>>>> 6. EMR has the aws sdk in the spark classpath. Helps to reduce app
>>>>> assembly jar size
>>>>>
>>>>> 7. ec2 script installs everything in /root; EMR has dedicated users:
>>>>> hadoop, zeppelin, etc. EMR is similar to Cloudera or Hortonworks
>>>>>
>>>>> 8. There are at least 3 spark-ec2 projects (in apache/spark, in
>>>>> mesos, in amplab). Master branch in spark has an outdated ec2 script.
>>>>> Other projects have broken links in readme. WHAT A MESS!
>>>>>
>>>>> 9. ec2 script has bad documentation and non-informative error
>>>>> messages, e.g. the readme does not say anything about the --private-ips
>>>>> option. If you did not add the flag it will connect to an empty-string
>>>>> host (localhost) instead of master. Fixed only last week. Not sure if
>>>>> fixed in all branches
>>>>>
>>>>> 10. I think Amazon will include spark-jobserver in EMR soon.
>>>>>
>>>>> 11. You do not need to be an aws expert to start an EMR cluster. Users
>>>>> can use the EMR web ui to start a cluster to run some jobs or work in
>>>>> Zeppelin during the day
>>>>>
>>>>> 12. EMR cluster starts in about 8 min. Ec2 script works longer and you
>>>>> need to be online.
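
On point 9 above (connecting to an empty-string host when --private-ips is not passed): here is a rough sketch of the kind of guard that would turn that into a clear error. This is not spark-ec2's actual code. The function name, the use_private_ips parameter and the dictionary keys are all made up for illustration, and the instance is just a plain dict.

```python
def pick_ssh_host(instance, use_private_ips=False):
    """Return the address to connect to, or raise if it is missing."""
    # Hypothetical field names; spark-ec2's real attribute names may differ.
    key = "private_ip_address" if use_private_ips else "public_dns_name"
    host = (instance.get(key) or "").strip()
    if not host:
        # An empty host can silently end up meaning localhost, so fail loudly.
        hint = "" if use_private_ips else " (is this a private-IP-only instance?)"
        raise ValueError(f"instance has no {key}{hint}")
    return host


# An instance in a private subnet typically has no public DNS name.
master = {"public_dns_name": "", "private_ip_address": "10.0.0.12"}
print(pick_ssh_host(master, use_private_ips=True))
```

Calling pick_ssh_host(master) without the flag raises a ValueError here instead of handing back an empty string.
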
>>>>> On Dec 1, 2015 9:22 AM, "Jerry Lam" <chiling...@gmail.com> wrote:
>>>>>
>>>>>> Simply put:
>>>>>>
>>>>>> EMR = Hadoop Ecosystem (Yarn, HDFS, etc) + Spark + EMRFS + Amazon EMR
>>>>>> API + Selected Instance Types + Amazon EC2 Friendly (bootstrapping)
>>>>>> spark-ec2 = HDFS + Yarn (Optional) + Spark (Standalone Default) + Any
>>>>>> Instance Type
>>>>>>
>>>>>> I use spark-ec2 for prototyping and I have never used it for
>>>>>> production.
>>>>>>
>>>>>> just my $0.02
>>>>>>
>>>>>> On Dec 1, 2015, at 11:15 AM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>>>>>>
>>>>>> Pinging this thread in case anyone has thoughts on the matter they
>>>>>> want to share.
>>>>>>
>>>>>> On Sat, Nov 21, 2015 at 11:32 AM Nicholas Chammas <[hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> Spark has come bundled with spark-ec2
>>>>>>> <http://spark.apache.org/docs/latest/ec2-scripts.html> for many
>>>>>>> years. At the same time, EMR has been capable of running Spark for a
>>>>>>> while, and earlier this year it added "official" support
>>>>>>> <https://aws.amazon.com/blogs/aws/new-apache-spark-on-amazon-emr/>.
>>>>>>>
>>>>>>> If you're looking for a way to provision Spark clusters, there are
>>>>>>> some clear differences between these 2 options. I think the biggest
>>>>>>> one would be that EMR is a "production" solution backed by a company,
>>>>>>> whereas spark-ec2 is not really intended for production use (as far
>>>>>>> as I know).
>>>>>>>
>>>>>>> That particular difference in intended use may or may not matter to
>>>>>>> you, but I'm curious:
>>>>>>>
>>>>>>> What are some of the other differences between the 2 that do matter
>>>>>>> to you? If you were considering these 2 solutions for your use case
>>>>>>> at one point recently, why did you choose one over the other?
>>>>>>>
>>>>>>> I'd be especially interested in hearing about why people might
>>>>>>> choose spark-ec2 over EMR, since the latter option seems to have
>>>>>>> shaped up nicely this year.
>>>>>>>
>>>>>>> Nick
>>>>>>>
>>>>>> ------------------------------
>>>>>> View this message in context: Re: spark-ec2 vs. EMR
>>>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/Re-spark-ec2-vs-EMR-tp25538.html>
>>>>>> Sent from the Apache Spark User List mailing list archive
>>>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.
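
For anyone who wants to act on Alexander's suggestion in the thread about removing Inbound rules with a 0.0.0.0/0 Source from the master and slave Security Groups, here is a small sketch for finding those rules first. It assumes each security group is a plain dict shaped like an entry in the EC2 DescribeSecurityGroups response (IpPermissions, IpRanges, CidrIp). Fetching the groups and revoking the rules is deliberately left out, and the group ID and ports below are made-up sample data.

```python
OPEN_CIDR = "0.0.0.0/0"


def open_inbound_rules(security_group):
    """Return the inbound permissions that allow traffic from any IPv4 address."""
    return [
        perm
        for perm in security_group.get("IpPermissions", [])
        if any(r.get("CidrIp") == OPEN_CIDR for r in perm.get("IpRanges", []))
    ]


# Made-up sample data in the DescribeSecurityGroups response shape.
master_sg = {
    "GroupId": "sg-11111111",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 8088, "ToPort": 8088,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
}

for perm in open_inbound_rules(master_sg):
    print(perm["IpProtocol"], perm["FromPort"], perm["ToPort"])
```

This only checks IPv4 ranges; rules that reference other security groups or IPv6 ranges would need their own checks.
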