Hi Jacob,

Thanks for the help & answer on the Docker question. Have you already experimented with the new link feature in Docker? It does not help the HDFS issue, as the DataNode needs the NameNode and vice versa, but it does facilitate simpler client-server interactions.
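To illustrate, linking containers with Docker's `--link` flag looks roughly like this; the image names, the alias, and the exposed port (8020) are hypothetical:

```shell
# Start the server container first; --link is one-way (client -> server),
# which is why it cannot solve the NameNode <-> DataNode cycle.
docker run -d --name namenode my-hdfs-namenode   # hypothetical image

# The client container gets the server's address injected as environment
# variables and an /etc/hosts entry under the alias "nn".
docker run --rm --link namenode:nn my-hdfs-client \
    sh -c 'echo "namenode is at $NN_PORT_8020_TCP_ADDR"'
```

This only works for simple client-server pairs, since the link target must already be running when the client starts.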
My issue described at the beginning is related to networking between the host and the Docker images, but I was losing too much time tracking down the exact problem, so I moved my Spark job driver into the Mesos node and it started working. Sadly, my Mesos UI is partially crippled, as workers are not addressable (and therefore Spark job logs are hard to gather). Your discussion about dynamic port allocation is very relevant to understanding why some components cannot talk to each other. I'll need to have a more in-depth read of that discussion to find a better solution for my local development environment.

regards, Gerard.

On Tue, May 6, 2014 at 3:30 PM, Jacob Eisinger <jeis...@us.ibm.com> wrote:
> Howdy,
>
> You might find the discussion Andrew and I have been having about Docker and network security [1] applicable.
>
> Also, I posted an answer [2] to your stackoverflow question.
>
> [1] http://apache-spark-user-list.1001560.n3.nabble.com/spark-shell-driver-interacting-with-Workers-in-YARN-mode-firewall-blocking-communication-tp5237p5441.html
> [2] http://stackoverflow.com/questions/23410505/how-to-run-hdfs-cluster-without-dns/23495100#23495100
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeis...@us.ibm.com - (512) 286-6075
>
> From: Gerard Maas <gerard.m...@gmail.com>
> To: user@spark.apache.org
> Date: 05/05/2014 04:18 PM
> Subject: Re: Local Dev Env with Mesos + Spark Streaming on Docker: Can't submit jobs.
>
> Hi Benjamin,
>
> Yes, we initially used a modified version of the AmpLab docker scripts [1]. The AmpLab docker images are a good starting point.
> One of the biggest hurdles has been HDFS, which requires reverse DNS, and I didn't want to go the dnsmasq route, to keep the containers relatively simple to use without the need for external scripts. I ended up running a 1-node setup (NameNode + DataNode). I'm still looking for a better solution for HDFS [2].
>
> Our use case for Docker is to easily create local dev environments, both for development and for automated functional testing (using cucumber). My aim is to strongly reduce the time of the develop-deploy-test cycle. That also means that we run the minimum number of instances required to have a functionally working setup, e.g. 1 Zookeeper, 1 Kafka broker, ...
>
> For the actual cluster deployment we have a Chef-based devops toolchain that puts things in place on public cloud providers. Personally, I think Docker rocks and would like to replace those complex cookbooks with Dockerfiles once the technology is mature enough.
>
> -greetz, Gerard.
>
> [1] https://github.com/amplab/docker-scripts
> [2] http://stackoverflow.com/questions/23410505/how-to-run-hdfs-cluster-without-dns
>
> On Mon, May 5, 2014 at 11:00 PM, Benjamin <bboui...@gmail.com> wrote:
>
> Hi,
>
> Before considering running on Mesos, did you try to submit the application on Spark deployed without Mesos on Docker containers?
>
> I am currently investigating this idea to deploy a complete set of clusters quickly with Docker, so I'm interested in your findings on sharing the settings of Kafka and Zookeeper across nodes. How many brokers and ZooKeeper nodes do you use?
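As a reference point for the 1-node NameNode+DataNode workaround above, a container along these lines can be started with a pinned hostname so that forward and reverse lookups inside the container agree; the image name, hostname, and published ports are hypothetical:

```shell
# Pin the container's hostname so that HDFS's reverse-DNS lookups resolve
# consistently inside the container (NameNode and DataNode co-located).
docker run -d -h hdfs.local --name hdfs \
    -p 8020:8020 -p 50070:50070 \
    my-hdfs-single-node   # hypothetical image running both daemons
```

This sidesteps dnsmasq entirely because all HDFS daemons see the same self-resolving hostname, at the cost of being limited to a single data node.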
> Regards,
>
> On Mon, May 5, 2014 at 10:11 PM, Gerard Maas <gerard.m...@gmail.com> wrote:
>
> Hi all,
>
> I'm currently working on creating a set of docker images to facilitate local development with Spark/streaming on Mesos (+zk, hdfs, kafka).
>
> After solving the initial hurdles to get things working together in docker containers, now everything seems to start up correctly and the Mesos UI shows slaves as they are started.
>
> I'm trying to submit a job from IntelliJ, and the job submissions seem to get lost in Mesos translation. The logs are not helping me to figure out what's wrong, so I'm posting them here in the hope that they ring a bell and somebody could provide me a hint on what's wrong/missing with my setup.
>
> ---- DRIVER (IntelliJ running a Job.scala main) ----
> 14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner for SHUFFLE_BLOCK_MANAGER
> 14/05/05 21:52:31 INFO BlockManager: Dropping broadcast blocks older than 1399319251962
> 14/05/05 21:52:31 INFO BlockManager: Dropping non broadcast blocks older than 1399319251962
> 14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner for BROADCAST_VARS
> 14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner for BLOCK_MANAGER
> 14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner for HTTP_BROADCAST
> 14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner for MAP_OUTPUT_TRACKER
> 14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner for SPARK_CONTEXT
>
> ---- MESOS MASTER ----
> I0505 19:52:39.718080 388 master.cpp:690] Registering framework 201405051517-67113388-5050-383-6995 at scheduler(1)@127.0.1.1:58115
> I0505 19:52:39.718261 388 master.cpp:493] Framework 201405051517-67113388-5050-383-6995 disconnected
> I0505 19:52:39.718277 389 hierarchical_allocator_process.hpp:332] Added framework 201405051517-67113388-5050-383-6995
> I0505 19:52:39.718312 388 master.cpp:520] Giving framework 201405051517-67113388-5050-383-6995 0ns to failover
> I0505 19:52:39.718431 389 hierarchical_allocator_process.hpp:408] Deactivated framework 201405051517-67113388-5050-383-6995
> W0505 19:52:39.718459 388 master.cpp:1388] Master returning resources offered to framework 201405051517-67113388-5050-383-6995 because the framework has terminated or is inactive
> I0505 19:52:39.718567 388 master.cpp:1376] Framework failover timeout, removing framework 201405051517-67113388-5050-383-6995
>
> ---- MESOS SLAVE ----
> I0505 19:49:27.662019 20 slave.cpp:1191] Asked to shut down framework 201405051517-67113388-5050-383-6803 by master@172.17.0.4:5050
> W0505 19:49:27.662072 20 slave.cpp:1206] Cannot shut down unknown framework 201405051517-67113388-5050-383-6803
> I0505 19:49:28.662153 18 slave.cpp:1191] Asked to shut down framework 201405051517-67113388-5050-383-6804 by master@172.17.0.4:5050
> W0505 19:49:28.662212 18 slave.cpp:1206] Cannot shut down unknown framework 201405051517-67113388-5050-383-6804
> I0505 19:49:29.662199 13 slave.cpp:1191] Asked to shut down framework 201405051517-67113388-5050-383-6805 by master@172.17.0.4:5050
> W0505 19:49:29.662256 13 slave.cpp:1206] Cannot shut down unknown framework 201405051517-67113388-5050-383-6805
> I0505 19:49:30.662443 16 slave.cpp:1191] Asked to shut down framework 201405051517-67113388-5050-383-6806 by master@172.17.0.4:5050
> W0505 19:49:30.662489 16 slave.cpp:1206] Cannot shut down unknown framework 201405051517-67113388-5050-383-6806
>
> Thanks in advance,
>
> Gerard.
>
> --
> Benjamin Bouillé
> +33 665 050 285
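One detail worth noting in the master log above: the framework registers at scheduler(1)@127.0.1.1:58115, a loopback address that the Mesos master and slaves cannot reach, which would explain the immediate disconnect. A hedged sketch of making the driver advertise a reachable address; 172.17.42.1 is a placeholder for the host's docker bridge IP, and the fixed port is an arbitrary choice:

```shell
# Make the Spark driver bind to / advertise an address the Mesos cluster
# can reach, instead of the loopback 127.0.1.1 seen in the master log.
export SPARK_LOCAL_IP=172.17.42.1   # placeholder: the host's docker0 IP

# Or set the equivalent Spark properties when building the SparkConf:
#   spark.driver.host=172.17.42.1
#   spark.driver.port=51000   # fixed port; open it in any firewall rules
```

This is only a guess at the root cause from the log line, but pinning `SPARK_LOCAL_IP` (or `spark.driver.host`) is the usual first step when a driver registers with a loopback or otherwise unreachable address.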