How many profiles (Hadoop / Hive / Scala) would this development environment support?
As many as we want. We probably want to cover a good chunk of the build matrix <https://issues.apache.org/jira/browse/SPARK-2004> that Spark officially supports.

What does this provide, concretely? It provides a reliable way to create a "good" Spark development environment. Roughly speaking, this should probably mean an environment that matches Jenkins, since that's where we run "official" testing and builds. For example, Spark has to run on Java 6 and Python 2.6. When devs build and run Spark locally, we can make sure they're doing it on these versions of the languages with a simple `vagrant up`.

Nate, could you comment on how something like this would relate to the Bigtop effort?

http://chapeau.freevariable.com/2014/08/jvm-test-docker.html

Will, that's pretty sweet. I tried something similar a few months ago as an experiment to try building/testing Spark within a container. Here's the shell script I used <https://gist.github.com/nchammas/60b04141f3b9f053faaa> against the base CentOS Docker image to set up an environment ready to build and test Spark.

We want to run Spark unit tests within containers on Jenkins, so it might make sense to develop a single Docker image that can be used both as a "dev environment" and as the execution container on Jenkins. Perhaps that's the approach to take instead of looking into Vagrant.

Nick

On Tue Jan 20 2015 at 8:22:41 PM Will Benton <wi...@redhat.com> wrote:

> Hey Nick,
>
> I did something similar with a Docker image last summer; I haven't updated
> the images to cache the dependencies for the current Spark master, but it
> would be trivial to do so:
>
> http://chapeau.freevariable.com/2014/08/jvm-test-docker.html
>
> best,
> wb
>
> ----- Original Message -----
> > From: "Nicholas Chammas" <nicholas.cham...@gmail.com>
> > To: "Spark dev list" <dev@spark.apache.org>
> > Sent: Tuesday, January 20, 2015 6:13:31 PM
> > Subject: Standardized Spark dev environment
> >
> > What do y'all think of creating a standardized Spark development
> > environment, perhaps encoded as a Vagrantfile, and publishing it under
> > `dev/`?
> >
> > The goal would be to make it easier for new developers to get started
> > with all the right configs and tools pre-installed.
> >
> > If we use something like Vagrant, we may even be able to make it so that
> > a single Vagrantfile creates equivalent development environments across
> > OS X, Linux, and Windows, without having to do much (or any) OS-specific
> > work.
> >
> > I imagine for committers and regular contributors, this exercise may seem
> > pointless, since y'all are probably already very comfortable with your
> > workflow.
> >
> > I wonder, though, if any of you think this would be worthwhile as an
> > improvement to the "new Spark developer" experience.
> >
> > Nick
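To make the "environment that matches Jenkins" idea above concrete, here is a rough sketch of the kind of provisioning script being discussed (in the spirit of the gist linked in the thread, not a copy of it). It assumes a CentOS 6 base, where the system Python is already 2.6; the package names, Maven version, and install paths are illustrative assumptions rather than a settled spec.

```sh
#!/usr/bin/env bash
# Hedged sketch of a dev-environment provisioning script for a CentOS 6 base
# (usable from a Vagrant shell provisioner or a Dockerfile RUN step).
# Package names, versions, and paths are assumptions, not an agreed spec.
set -e

# Java 6 toolchain, matching the oldest JVM Spark has to support.
yum install -y java-1.6.0-openjdk-devel git tar curl

# Maven for building Spark; give it extra memory as the Spark build docs suggest.
curl -sL https://archive.apache.org/dist/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz \
  | tar -xz -C /opt
export PATH=/opt/apache-maven-3.2.5/bin:$PATH
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M"

# Clone Spark and do one build up front to warm the dependency caches.
git clone https://github.com/apache/spark.git /opt/spark
cd /opt/spark
mvn -DskipTests package
```

Because the same script works as either a Vagrant provisioner or a Docker build step, it keeps the Vagrant-vs-Docker question above mostly about packaging rather than content.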
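As for a single Docker image serving both as a "dev environment" and as the execution container on Jenkins, a minimal sketch of that dual use might look like the following. The `spark-dev` image name and `dev/docker/` directory are hypothetical; `./dev/run-tests` is the existing test entry point in the Spark repo.

```sh
# Build the shared image once from a (hypothetical) Dockerfile under dev/docker/.
docker build -t spark-dev dev/docker/

# Jenkins-style use: run the test suite non-interactively inside the container,
# mounting the checked-out Spark source at /opt/spark.
docker run --rm -v "$PWD":/opt/spark -w /opt/spark spark-dev ./dev/run-tests

# Developer use: the same image as an interactive shell for building and debugging.
docker run --rm -it -v "$PWD":/opt/spark -w /opt/spark spark-dev /bin/bash
```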