Bumping 1-on-1 conversation to the mailing list:

On 10 Feb 2015, at 13:24, Hans van den Bogert <hansbog...@gmail.com> wrote:
> It’s self-built; I can’t do otherwise, as I can’t install packages on the
> cluster here.
>
> The problem seems to be with libtool. When compiling Mesos on a host with
> apr-devel and apr-util-devel, the shared libraries are named libapr*.so,
> without a version suffix (the versioned ones are also installed, of
> course). On our compute nodes no *-devel packages are installed, just the
> binary packages, whose libraries are named libapr*.so.0. But even the
> “make install”-ed binaries still refer to the devel packages’ shared
> libraries. I’m not sure whether this is intended behaviour by libtool,
> because it is the one changing the binaries’ RPATH (which is initially
> well defined) so that they refer to libapr*.so.
>
> But this is probably autoconf fu; I’m just hoping someone here has dealt
> with the same issue.
>
> On 09 Feb 2015, at 20:37, Tim Chen <t...@mesosphere.io> wrote:
>
>> I'm still trying to grasp what your environment setup is like; it's odd
>> to see g++ output on stderr when you run Mesos.
>>
>> Are you building Mesos yourself and running it, or did you install it
>> through some package?
>>
>> Tim
>>
>> On Mon, Feb 9, 2015 at 1:03 AM, Hans van den Bogert
>> <hansbog...@gmail.com> wrote:
>> Okay, I was kind of ambiguous; I assume you mean this one:
>>
>> [vdbogert@node002 ~]$ cat /local/vdbogert/var/lib/mesos/slaves/20150206-110658-16813322-5050-5515-S0/frameworks/20150208-200943-16813322-5050-26370-0000/executors/3/runs/latest/stdout
>> [vdbogert@node002 ~]$
>>
>> It’s empty.
>>
>> On 09 Feb 2015, at 06:22, Tim Chen <t...@mesosphere.io> wrote:
>>
>>> Hi Hans,
>>>
>>> I was referring to the stdout/stderr of the task, not the slave.
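A way to check the libtool behaviour described above is to compare what a binary records in its dynamic section with what the loader resolves on a given node. A sketch, using /bin/ls as a stand-in target so the commands run anywhere; the real target path (a Mesos build-tree binary) is an assumption:

```shell
# Sketch: compare what a binary records (NEEDED/RPATH/RUNPATH) with what
# the loader resolves on this node. /bin/ls is a stand-in target; the real
# one would be e.g. build/src/.libs/mesos-slave (path is an assumption).
check_apr_deps() {
    bin="$1"
    echo "== dynamic section of $bin =="
    readelf -d "$bin" | grep -E 'NEEDED|RPATH|RUNPATH' || true
    echo "== loader resolution =="
    ldd "$bin" | grep -i 'apr' || echo "no libapr resolved for $bin"
}
check_apr_deps /bin/ls
# check_apr_deps build/src/.libs/mesos-slave
```

If the dynamic section names libapr-1.so (unversioned) rather than the soname libapr-1.so.0, the binary will only resolve on hosts where the -devel symlink exists.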
>>>
>>> Tim
>>>
>>> On Sun, Feb 8, 2015 at 1:21 PM, Hans van den Bogert
>>> <hansbog...@gmail.com> wrote:
>>>
>>> > Hi there,
>>> >
>>> > It looks like while trying to launch the executor (or one of the
>>> > processes, like the fetcher fetching the URIs)
>>>
>>> The fetching seems to have succeeded, as has the extracting, since the
>>> “spark-1.2.0-bin-hadoop2.4” directory exists in the slave sandbox.
>>> Furthermore, the executor URI seems superfluous in my environment. I’ve
>>> checked the code: if a URI is not provided, the task will not refer to
>>> an extracted distro but to a directory with the same path as the
>>> current Spark distro, which makes sense in a cluster environment where
>>> data is on a network-shared disk. I’ve tried *not* supplying a
>>> spark.executor.uri, and fine-grained mode still works fine.
>>> Coarse-grained mode still fails with the same libapr* errors.
>>>
>>> > was failing because of the dependencies problem you see. Your
>>> > mesos-slave shouldn't be able to run though; were you running a
>>> > 0.20.0 slave and upgraded to 0.21.0? We introduced the dependencies
>>> > on libapr and libsvn in Mesos 0.21.0.
>>>
>>> I’ve only ever tried compiling 0.21.0. I’ve checked all binaries in
>>> MESOS_HOME/build/src/.libs with ‘ldd’, and all of them refer to a
>>> correct, existing libapr*-1.so.0 (mind the trailing “.0”).
>>>
>>> > What's the stdout for the task like?
>>> >
>>> > > Mesos slaves' stdout are empty.
>>>
>>> It’s a pity Spark’s logging is so marginal in this case, as is Mesos’.
>>> As far as I can see, one can’t log the (raw) task descriptions, which
>>> would be very helpful here. I could resort to building Spark from
>>> source as well and adding some logging, but I’m afraid I would
>>> introduce other peculiarities. Do you think that’s my only option?
>>>
>>> Thanks,
>>>
>>> H.
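If what is missing on the compute nodes is only the unversioned libapr-1.so name (normally provided by the -devel packages as a symlink to libapr-1.so.0), a root-free shim directory is one possible workaround. This is a sketch under assumed paths, not a confirmed fix:

```shell
# Root-free workaround sketch: recreate the unversioned -devel symlinks in
# a private directory and put it on the search path. All paths are
# assumptions; adjust to the cluster's layout.
LIBDIR="$HOME/aprlibs"
mkdir -p "$LIBDIR"
ln -sf /usr/lib64/libapr-1.so.0     "$LIBDIR/libapr-1.so"
ln -sf /usr/lib64/libaprutil-1.so.0 "$LIBDIR/libaprutil-1.so"
# Runtime resolution goes through the loader:
export LD_LIBRARY_PATH="$LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# If the failure happens at link time instead (a g++ invocation, as in the
# stderr below), the linker needs it too, e.g.: export LDFLAGS="-L$LIBDIR"
```

Note that LD_LIBRARY_PATH only affects runtime resolution; a failing g++ link step needs the directory on its -L path as well.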
>>>
>>> > Tim
>>> >
>>> > On Mon, Feb 9, 2015 at 4:10 AM, Hans van den Bogert
>>> > <hansbog...@gmail.com> wrote:
>>> > I wasn’t thorough; the complete stderr includes:
>>> >
>>> > g++: /usr/lib64/libaprutil-1.so: No such file or directory
>>> > g++: /usr/lib64/libapr-1.so: No such file or directoryn
>>> > (including that trailing ’n')
>>> >
>>> > Though I can’t figure out how the process indirection goes from the
>>> > frontend Spark application to the Mesos executors, or where this
>>> > shared-library error comes from.
>>> >
>>> > Hope someone can shed some light.
>>> >
>>> > Thanks
>>> >
>>> > On 08 Feb 2015, at 14:15, Hans van den Bogert <hansbog...@gmail.com>
>>> > wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > > I’m trying to get coarse-grained mode to work under Mesos (0.21.0).
>>> > > I thought this would be a trivial change, as Mesos was working well
>>> > > in fine-grained mode.
>>> > >
>>> > > However, the Mesos tasks fail, and I can’t pinpoint where things go
>>> > > wrong.
>>> > >
>>> > > This is a Mesos stderr log from a slave:
>>> > >
>>> > > Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
>>> > > I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
>>> > > I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz' into '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151'
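Those g++ lines are consistent with libtool's install-time relink step failing against absolute paths recorded in its .la archives on the build host. A sketch for checking whether installed .la files pin libapr to build-host paths; the install prefix here is an assumption:

```shell
# Libtool .la files record absolute dependency paths from the build host;
# if they pin /usr/lib64/libapr*.so, relinking on a node without the
# -devel packages fails with exactly these g++ errors.
# PREFIX is an assumed install prefix; adjust to your environment.
PREFIX="${PREFIX:-$HOME/mesos-install}"
hits="$(grep -l '/usr/lib64/libapr' "$PREFIX"/lib/*.la 2>/dev/null || true)"
if [ -n "$hits" ]; then
    echo "libtool archives pinning libapr to build-host paths:"
    echo "$hits"
    # The field that carries the recorded paths:
    grep -h '^dependency_libs=' $hits
else
    echo "no .la files under $PREFIX/lib reference /usr/lib64/libapr"
fi
```

If such entries turn up, removing or editing the .la files (or building with --disable-static where applicable) is a commonly suggested remedy, but treat that as an assumption to verify, not an established fix for this thread's problem.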
>>> > > Mesos slaves' stdout are empty.
>>> > >
>>> > > And I can confirm the Spark distro is correctly extracted:
>>> > >
>>> > > $ ls
>>> > > spark-1.2.0-bin-hadoop2.4  spark-1.2.0-bin-hadoop2.4.tgz  stderr  stdout
>>> > >
>>> > > The spark-submit log is here:
>>> > > http://pastebin.com/ms3uZ2BK
>>> > >
>>> > > Mesos master:
>>> > > http://pastebin.com/QH2Vn1jX
>>> > >
>>> > > Mesos slave:
>>> > > http://pastebin.com/DXFYemix
>>> > >
>>> > > Can somebody point me to logs etc. to investigate this further? I’m
>>> > > feeling kind of blind.
>>> > >
>>> > > Furthermore, do the executors on Mesos inherit all configs from the
>>> > > Spark application/submit? E.g. I’ve given my executors 20 GB of
>>> > > memory through a spark-submit “--conf” parameter. Should these
>>> > > settings also be present in the spark-1.2.0-bin-hadoop2.4.tgz
>>> > > distribution’s configs?
>>> > >
>>> > > If I need to present more logs etc. in order to be helped here,
>>> > > please let me know.
>>> > >
>>> > > Regards,
>>> > >
>>> > > Hans van den Bogert
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
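On the config question above: my understanding is that Spark properties set on the driver via --conf travel with the application, so they should not need duplicating in the tgz's spark-defaults.conf, but verify this against your Spark version. A hedged sketch of a coarse-grained submit; the master host, class, and jar are placeholders:

```shell
# Placeholder values throughout; spark.mesos.coarse is the switch for
# coarse-grained mode on Spark 1.2.x, and spark.executor.uri points the
# slaves at the distro to fetch (URI taken from the thread's own logs).
spark-submit \
  --master mesos://master-host:5050 \
  --conf spark.mesos.coarse=true \
  --conf spark.executor.memory=20g \
  --conf spark.executor.uri=http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz \
  --class your.main.Class \
  your-application.jar
```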