Hi Chester,

Writing a design / requirements doc sounds great. One comment though:
On Thu, May 14, 2015 at 11:18 PM, Chester At Work <ches...@alpinenow.com> wrote:

> For #5 yes, it's about the command line args. These are args are
> the input for the spark jobs. Seems a bit too much to create a file just to
> specify spark job args. These args could be few thousands columns in
> machine learning jobs.

The problem is that large command lines are not a Spark limitation, but a platform limitation. So there's little that Spark can do if you run into platform limits.

Spark could automate the "write all these args to a file instead" approach, but I wonder how many people actually run into that issue to justify the extra code in Spark.

--
Marcelo
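
For illustration only, here is a rough sketch of what the "write all these args to a file instead" workaround could look like on the application side. Nothing in it is a Spark feature: the helper names (write_args_file, read_args_file, expand_args) and the "@<path>" convention are assumptions made up for this example. The launcher writes the long argument list to a JSON file and passes a single short argument, and the job expands it back into the full list on startup.

```python
import json
import os
import tempfile


def write_args_file(args, directory=None):
    """Write the job arguments to a temporary JSON file and return its path.

    Hypothetical helper: JSON is used so that arguments containing spaces
    or newlines survive the round trip unchanged.
    """
    fd, path = tempfile.mkstemp(suffix=".json", prefix="job-args-", dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(list(args), f)
    return path


def read_args_file(path):
    """Read back the argument list written by write_args_file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def expand_args(argv):
    """Expand a leading '@<path>' argument into the arguments stored in that file.

    The '@' prefix is an assumed convention for this sketch; any other
    arguments after it are kept as they are.
    """
    if argv and argv[0].startswith("@"):
        return read_args_file(argv[0][1:]) + list(argv[1:])
    return list(argv)


if __name__ == "__main__":
    # A few thousand column names, as in the machine learning case above.
    columns = [f"col_{i}" for i in range(5000)]
    path = write_args_file(columns)
    try:
        # The job would receive only the short '@<path>' argument.
        restored = expand_args(["@" + path])
        print(len(restored), restored[0], restored[-1])
    finally:
        os.remove(path)
```

With this approach the command line stays a fixed, small size no matter how many columns are passed, so the platform limit no longer matters. The cost is that the file has to be readable from wherever the job's entry point runs.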