- IMHO, #2 is preferred as it could work in any environment (Mesos, Standalone et al). While Spark needs HDFS (as any decent distributed system does), YARN is not required at all - Mesos is a lot better.
- Also, managing the app with an appropriate bootstrap/deployment framework is more flexible across multiple scenarios, topologies et al.
- What kind of capabilities are you thinking of? Automatic discovery? Dynamic deployment based on available resources, versions of Hadoop et al?
Cheers
<k/>

On Sun, Jul 27, 2014 at 6:32 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
> Based on some discussions with my application users, I have been trying to
> come up with a standard way to deploy applications built on Spark:
>
> 1. Bundle the version of Spark with your application and ask users to store
> it in HDFS before referring to it in YARN to boot your application
> 2. Provide ways to manage dependencies in your app across the various
> versions of Spark bundled with Hadoop distributions
>
> Option 1 provides greater control and reliability, as I am only working
> against YARN versions and dependencies; I assume option 2 gives me some
> benefits of the distribution's version of Spark (easier management, common
> sysops tools??).
> I was wondering if anyone has thoughts on both and any reasons to
> prefer one over the other.
>
> Sent from my iPad
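
For what it's worth, option 1 above might be sketched roughly like this (a hedged example, not a prescription: the jar name, HDFS path, and application class are hypothetical, and `spark.yarn.jar` is the Spark 1.x setting for pointing the YARN client at a pre-staged assembly):

```shell
# Stage the exact Spark assembly the app was built against into HDFS
# (version and paths are illustrative).
hdfs dfs -mkdir -p /apps/myapp
hdfs dfs -put spark-assembly-1.0.2-hadoop2.4.0.jar /apps/myapp/

# Point the YARN client at the staged assembly instead of whatever
# Spark build the Hadoop distribution happens to ship.
spark-submit \
  --master yarn-cluster \
  --conf spark.yarn.jar=hdfs:///apps/myapp/spark-assembly-1.0.2-hadoop2.4.0.jar \
  --class com.example.MyApp \
  myapp.jar
```

This pins the application to one tested Spark version regardless of what the cluster's distribution bundles, which is the control/reliability benefit described for option 1.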