Bumping a one-on-one conversation to the mailing list:

On 10 Feb 2015, at 13:24, Hans van den Bogert <hansbog...@gmail.com> wrote:

> 
> It’s self-built; I have no other option, as I can’t install packages on the 
> cluster here.
> 
> The problem seems to be with libtool. When compiling Mesos on a host with 
> apr-devel and apr-util-devel installed, the shared libraries are referenced as 
> libapr*.so, without a version suffix (the suffixed ones are also installed, of 
> course). On our compute nodes no *-devel packages are installed, just the 
> binary packages, whose libraries are named libapr*.so.0. But even the 
> “make install”-ed binaries still refer to the devel packages’ shared-library 
> names. I’m not sure whether this is intended behaviour on libtool’s part, 
> because it is the one changing the binaries’ RPATH (which is initially well 
> defined) at install/run time so that they resolve libapr*.so. 
> 
> But this is probably autoconf fu; I’m just hoping someone here has dealt with 
> the same issue.
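One possible runtime workaround for the naming mismatch described above (a sketch only; the /usr/lib64 paths are an assumption about where the non-devel apr packages install their libraries on these compute nodes, and mesos-slave is just an example binary) is to create user-local symlinks under the unversioned names the binaries ask for:

```shell
# The built binaries request libapr-1.so / libaprutil-1.so, but the compute
# nodes only ship the versioned libapr-1.so.0 files. A user-writable compat
# directory with symlinks bridges the gap without installing *-devel packages.
mkdir -p "$HOME/lib-compat"
ln -sf /usr/lib64/libapr-1.so.0     "$HOME/lib-compat/libapr-1.so"
ln -sf /usr/lib64/libaprutil-1.so.0 "$HOME/lib-compat/libaprutil-1.so"
# Put the compat directory first on the dynamic linker's search path:
export LD_LIBRARY_PATH="$HOME/lib-compat:${LD_LIBRARY_PATH:-}"
# Then verify resolution, e.g.: ldd build/src/.libs/mesos-slave | grep apr
```

Note that RPATH/RUNPATH entries baked into the binaries take precedence over LD_LIBRARY_PATH in some configurations, so this may need to be combined with fixing the RPATH itself.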
> 
> On 09 Feb 2015, at 20:37, Tim Chen <t...@mesosphere.io> wrote:
> 
>> I'm still trying to grasp what your environment setup is like; it's odd to 
>> see g++ output on stderr when you're running Mesos.
>> 
>> Are you building Mesos yourself and running it, or did you install it 
>> through some package?
>> 
>> Tim
>> 
>> 
>> 
>> On Mon, Feb 9, 2015 at 1:03 AM, Hans van den Bogert <hansbog...@gmail.com> 
>> wrote:
>> Okay, I was being kind of ambiguous; I assume you mean this one:
>> 
>>     [vdbogert@node002 ~]$ cat 
>> /local/vdbogert/var/lib/mesos/slaves/20150206-110658-16813322-5050-5515-S0/frameworks/20150208-200943-16813322-5050-26370-0000/executors/3/runs/latest/stdout
>>     [vdbogert@node002 ~]$
>> 
>> It’s empty.
>> 
>> On 09 Feb 2015, at 06:22, Tim Chen <t...@mesosphere.io> wrote:
>> 
>>> Hi Hans,
>>> 
>>> I was referring to the stdout/stderr of the task, not the slave.
>>> 
>>> Tim
>>> 
>>> On Sun, Feb 8, 2015 at 1:21 PM, Hans van den Bogert <hansbog...@gmail.com> 
>>> wrote:
>>> 
>>> 
>>> 
>>> > Hi there,
>>> >
>>> > It looks like while trying to launch the executor (or one of the processes, 
>>> > like the fetcher that fetches the URIs)
>>> The fetching seems to have succeeded, as has the extraction: the 
>>> “spark-1.2.0-bin-hadoop2.4” directory exists in the slave sandbox. 
>>> Furthermore, the executor URI seems superfluous in my environment. I’ve 
>>> checked the code, and if a URI is not provided, the task will not refer to 
>>> an extracted distro, but to a directory with the same path as the current 
>>> Spark distro, which makes sense in a cluster environment where data is on a 
>>> network-shared disk. I’ve tried *not* supplying a spark.executor.uri, and 
>>> fine-grained mode still works fine. Coarse-grained mode still fails with the 
>>> same libapr* errors.
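For reference, the two modes only differ in one submit-time property; a minimal sketch (the master URL and application jar are placeholders, not values from this cluster):

```shell
# Fine-grained mode (the default on Mesos in Spark 1.2) -- works here:
./bin/spark-submit --master mesos://master:5050 \
  --conf spark.mesos.coarse=false my-app.jar

# Coarse-grained mode -- the variant that fails with the libapr* errors:
./bin/spark-submit --master mesos://master:5050 \
  --conf spark.mesos.coarse=true my-app.jar
```

The difference matters because in coarse-grained mode the slave launches a long-running Spark executor process directly, so any shared-library problem in the launch path surfaces there rather than in the fine-grained per-task path.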
>>> 
>>> > was failing because of the dependency problem you see. Your mesos-slave 
>>> > shouldn't be able to run, though. Were you running a 0.20.0 slave and 
>>> > upgraded to 0.21.0? We introduced the dependencies on libapr and libsvn 
>>> > in Mesos 0.21.0.
>>> I’ve only ever compiled 0.21.0. I’ve checked all binaries in 
>>> MESOS_HOME/build/src/.libs with ‘ldd’, and they all refer to a correct, 
>>> existing libapr*-1.so.0 (mind the trailing “.0”).
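The ldd check can be extended to the link-time metadata itself; a sketch (the two binary names are examples of what sits in build/src/.libs, not an exhaustive list):

```shell
# NEEDED shows which soname each binary was linked against;
# RPATH/RUNPATH shows where the dynamic linker will look first at run time.
for bin in build/src/.libs/mesos-slave build/src/.libs/mesos-executor; do
  [ -e "$bin" ] || { echo "skip: $bin not found"; continue; }
  echo "== $bin =="
  readelf -d "$bin" | grep -E 'NEEDED|RPATH|RUNPATH'
done
```

This distinguishes what was recorded at link time (readelf) from what actually resolves on the current host (ldd), which is exactly the gap between the build host with *-devel packages and the compute nodes without them.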
>>> 
>>> > What's the stdout for the task like?
>>> 
>>> > > Mesos slaves' stdout are empty.
>>> 
>>> 
>>> It’s a pity Spark’s logging in this case is pretty marginal, as is Mesos’. 
>>> As far as I can see, one can’t log the (raw) task descriptions, which would 
>>> be very helpful in this case. I could resort to building Spark from source 
>>> as well and adding some logging, but I’m afraid I would introduce other 
>>> peculiarities. Do you think that’s my only option?
>>> 
>>> Thanks,
>>> 
>>> H.
>>> 
>>> > Tim
>>> >
>>> >
>>> >
>>> >
>>> > On Mon, Feb 9, 2015 at 4:10 AM, Hans van den Bogert 
>>> > <hansbog...@gmail.com> wrote:
>>> > I wasn’t thorough, the complete stderr includes:
>>> >
>>> >     g++: /usr/lib64/libaprutil-1.so: No such file or directory
>>> >     g++: /usr/lib64/libapr-1.so: No such file or directoryn
>>> > (including that trailing ’n')
>>> >
>>> > Though I can’t figure out how the process indirection goes from the 
>>> > frontend Spark application to the Mesos executors, nor where this 
>>> > shared-library error comes from.
>>> >
>>> > Hope someone can shed some light,
>>> >
>>> > Thanks
>>> >
>>> > On 08 Feb 2015, at 14:15, Hans van den Bogert <hansbog...@gmail.com> 
>>> > wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > >
>>> > > I’m trying to get coarse-grained mode to work under Mesos (0.21.0). I 
>>> > > thought this would be a trivial change, as Mesos was working well in 
>>> > > fine-grained mode.
>>> > >
>>> > > However, the Mesos tasks fail, and I can’t pinpoint where things go wrong.
>>> > >
>>> > > This is a mesos stderr log from a slave:
>>> > >
>>> > >    Fetching URI 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz'
>>> > >    I0208 12:57:45.415575 25720 fetcher.cpp:126] Downloading 
>>> > > 'http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz' to 
>>> > > '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
>>> > >    I0208 12:58:09.146960 25720 fetcher.cpp:64] Extracted resource 
>>> > > '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151/spark-1.2.0-bin-hadoop2.4.tgz'
>>> > >  into 
>>> > > '/local/vdbogert/var/lib/mesos//slaves/20150206-110658-16813322-5050-5515-S1/frameworks/20150208-125721-906005770-5050-32371-0000/executors/0/runs/cb525b32-387c-4698-a27e-8d4213080151’
>>> > >
>>> > > Mesos slaves' stdout are empty.
>>> > >
>>> > >
>>> > > And I can confirm the spark distro is correctly extracted:
>>> > >    $ ls
>>> > >    spark-1.2.0-bin-hadoop2.4  spark-1.2.0-bin-hadoop2.4.tgz  stderr  
>>> > > stdout
>>> > >
>>> > > The spark-submit log is here:
>>> > > http://pastebin.com/ms3uZ2BK
>>> > >
>>> > > Mesos-master
>>> > > http://pastebin.com/QH2Vn1jX
>>> > >
>>> > > Mesos-slave
>>> > > http://pastebin.com/DXFYemix
>>> > >
>>> > >
>>> > > Can somebody point me to logs etc. to investigate this further? I’m 
>>> > > feeling kind of blind.
>>> > > Furthermore, do the executors on Mesos inherit all configs from the 
>>> > > Spark application/submit? E.g. I’ve given my executors 20 GB of memory 
>>> > > through a spark-submit "--conf" parameter. Should these settings also 
>>> > > be present in the spark-1.2.0-bin-hadoop2.4.tgz distribution’s configs?
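To illustrate the question, the submit side looks roughly like this (a sketch; the master URL and application jar are placeholders, while the executor URI and memory value are the ones from this setup):

```shell
# Settings passed via --conf are serialized into the driver's SparkConf;
# the open question is whether the unpacked tgz on the executors also
# needs them in its own conf/spark-defaults.conf.
./bin/spark-submit \
  --master mesos://master:5050 \
  --conf spark.executor.memory=20g \
  --conf spark.executor.uri=http://upperpaste.com/spark-1.2.0-bin-hadoop2.4.tgz \
  my-app.jar
```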
>>> > >
>>> > > If, in order to be helped here, I need to present more logs etc, please 
>>> > > let me know.
>>> > >
>>> > > Regards,
>>> > >
>>> > > Hans van den Bogert
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
>>> >
>>> >
>>> 
>>> 
>> 
>> 
> 
