FWIW, I've run into similar BLAS-related problems before and wrote up a
document on how to set up native BLAS on Spark EC2 clusters at
https://github.com/amplab/ml-matrix/blob/master/EC2.md -- note that this
works with a vanilla Spark build (you only need to link to netlib-lgpl in
your app), but it requires the app jar to be present on all the machines.

Thanks
Shivaram
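
[Editor's sketch, not part of the original message.] The two checks that
come up repeatedly in the quoted thread below (are the netlib Native*
classes in the jar, and is libgfortran.so.3 visible on the host) can be
wrapped in small helpers. The function names are hypothetical and are not
part of Spark or netlib-java. The sketch assumes a POSIX shell with grep,
plus jar and ldconfig on the node being checked.

```shell
#!/bin/sh
# Hypothetical helpers; both read text on stdin so they can be piped from
# `jar tf` or `ldconfig -p` on whichever host is being checked.

# Prints how many netlib Native* wrapper classes appear in a `jar tf` listing.
count_netlib_native_classes() {
  grep -c 'com/github/fommil/netlib/Native.*\.class$'
}

# Succeeds (exit status 0) if libgfortran.so.3 appears in `ldconfig -p` output.
has_libgfortran3() {
  grep -q 'libgfortran\.so\.3'
}

# Example usage (ASSEMBLY_JAR is a placeholder for your own jar path):
#   jar tf "$ASSEMBLY_JAR" | count_netlib_native_classes
#   ldconfig -p | has_libgfortran3 && echo "libgfortran.so.3 found"
```

Finding the wrapper classes only shows that the netlib classes are bundled.
It does not by itself show that the native library loads, so the checks are
most useful when run on the executor hosts rather than only on the machine
that launches the driver.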

On Tue, Jul 21, 2015 at 7:37 AM, Arun Ahuja <aahuj...@gmail.com> wrote:

> Yes, I imagine it's the driver's classpath -  I'm pulling those
> screenshots straight from the Spark UI environment page.  Is there
> somewhere else to grab the executor class path?
>
> Also, the warning prints only once, so it's not clear whether it comes
> from the driver or an executor. Would you know?
>
> Thanks,
> Arun
>
> On Tue, Jul 21, 2015 at 7:52 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Great, and that file exists on HDFS and is world-readable? Just
>> double-checking.
>>
>> Which classpath is this -- your driver's or an executor's? This is the
>> driver, no? I assume so just because it looks like it references the
>> assembly you built locally and from which you're launching the driver.
>>
>> I think we're concerned with the executors and what they have on the
>> classpath. I suspect there is still a problem somewhere in there.
>>
>> On Mon, Jul 20, 2015 at 4:59 PM, Arun Ahuja <aahuj...@gmail.com> wrote:
>>
>>> Cool, I tried that as well, and it doesn't seem any different.
>>>
>>> spark.yarn.jar seems to be set:
>>>
>>> [image: Inline image 1]
>>>
>>> This actually doesn't change the classpath, not sure if it should:
>>>
>>> [image: Inline image 3]
>>>
>>> But I still get the same netlib warning.
>>>
>>> Thanks for the help!
>>> - Arun
>>>
>>> On Fri, Jul 17, 2015 at 3:18 PM, Sandy Ryza <sandy.r...@cloudera.com>
>>> wrote:
>>>
>>>> Can you try setting the spark.yarn.jar property to make sure it points
>>>> to the jar you're thinking of?
>>>>
>>>> -Sandy
>>>>
>>>> On Fri, Jul 17, 2015 at 11:32 AM, Arun Ahuja <aahuj...@gmail.com>
>>>> wrote:
>>>>
>>>>> Yes, it's a YARN cluster, and I'm using spark-submit to run. I have
>>>>> SPARK_HOME set to the directory above and am using the spark-submit
>>>>> script from there.
>>>>>
>>>>> bin/spark-submit --master yarn-client --executor-memory 10g \
>>>>>   --driver-memory 8g --num-executors 400 --executor-cores 1 \
>>>>>   --class org.hammerlab.guacamole.Guacamole \
>>>>>   --conf spark.default.parallelism=4000 \
>>>>>   --conf spark.storage.memoryFraction=0.15
>>>>>
>>>>> libgfortran.so.3 is also there:
>>>>>
>>>>> ls  /usr/lib64/libgfortran.so.3
>>>>> /usr/lib64/libgfortran.so.3
>>>>>
>>>>> These are the jniloader files in the jar:
>>>>>
>>>>> jar tf /hpc/users/ahujaa01/src/spark/assembly/target/scala-2.10/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar | grep jniloader
>>>>> META-INF/maven/com.github.fommil/jniloader/
>>>>> META-INF/maven/com.github.fommil/jniloader/pom.xml
>>>>> META-INF/maven/com.github.fommil/jniloader/pom.properties
>>>>>
>>>>> Thanks,
>>>>> Arun
>>>>>
>>>>> On Fri, Jul 17, 2015 at 1:30 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>>
>>>>>> Make sure /usr/lib64 contains libgfortran.so.3; that's really the
>>>>>> issue.
>>>>>>
>>>>>> I'm pretty sure the answer is 'yes', but, make sure the assembly has
>>>>>> jniloader too. I don't see why it wouldn't, but, that's needed.
>>>>>>
>>>>>> What is your env like -- local, standalone, YARN? How are you running?
>>>>>> Just want to make sure you are using this assembly across your
>>>>>> cluster.
>>>>>>
>>>>>> On Fri, Jul 17, 2015 at 6:26 PM, Arun Ahuja <aahuj...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Sean,
>>>>>>>
>>>>>>> Thanks for the reply! I did double-check that the jar is the one I
>>>>>>> think I am running:
>>>>>>>
>>>>>>> [image: Inline image 2]
>>>>>>>
>>>>>>> jar tf /hpc/users/ahujaa01/src/spark/assembly/target/scala-2.10/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar | grep netlib | grep Native
>>>>>>> com/github/fommil/netlib/NativeRefARPACK.class
>>>>>>> com/github/fommil/netlib/NativeRefBLAS.class
>>>>>>> com/github/fommil/netlib/NativeRefLAPACK.class
>>>>>>> com/github/fommil/netlib/NativeSystemARPACK.class
>>>>>>> com/github/fommil/netlib/NativeSystemBLAS.class
>>>>>>> com/github/fommil/netlib/NativeSystemLAPACK.class
>>>>>>>
>>>>>>> Also, I checked the gfortran version on the cluster nodes and it is
>>>>>>> available and is 5.1
>>>>>>>
>>>>>>> $ gfortran --version
>>>>>>> GNU Fortran (GCC) 5.1.0
>>>>>>> Copyright (C) 2015 Free Software Foundation, Inc.
>>>>>>>
>>>>>>> and still see:
>>>>>>>
>>>>>>> 15/07/17 13:20:53 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
>>>>>>> 15/07/17 13:20:53 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
>>>>>>> 15/07/17 13:20:53 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
>>>>>>> 15/07/17 13:20:53 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
>>>>>>>
>>>>>>> Does anything need to be adjusted in my application POM?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Arun
>>>>>>>
>>>>>>> On Thu, Jul 16, 2015 at 5:26 PM, Sean Owen <so...@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes, that's most of the work, just getting the native libs into the
>>>>>>>> assembly. netlib can find them from there even if you don't have BLAS
>>>>>>>> libs on your OS, since it includes a reference implementation as a
>>>>>>>> fallback.
>>>>>>>>
>>>>>>>> One common reason it won't load is not having libgfortran installed on
>>>>>>>> your OSes though. It has to be 4.6+ too. That can't be shipped even in
>>>>>>>> netlib and has to exist on your hosts.
>>>>>>>>
>>>>>>>> The other thing I'd double-check is whether you are really using this
>>>>>>>> assembly you built for your job -- that is, whether it's actually the
>>>>>>>> assembly the executors are using.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 7, 2015 at 8:47 PM, Arun Ahuja <aahuj...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Is there more documentation on what is needed to set up BLAS/LAPACK
>>>>>>>> > native support with Spark?
>>>>>>>> >
>>>>>>>> > I’ve built Spark with the -Pnetlib-lgpl flag and see that the netlib
>>>>>>>> > classes are in the assembly jar.
>>>>>>>> >
>>>>>>>> > jar tvf spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar | grep netlib | grep Native
>>>>>>>> >   6625 Tue Jul 07 15:22:08 EDT 2015 com/github/fommil/netlib/NativeRefARPACK.class
>>>>>>>> >  21123 Tue Jul 07 15:22:08 EDT 2015 com/github/fommil/netlib/NativeRefBLAS.class
>>>>>>>> > 178334 Tue Jul 07 15:22:08 EDT 2015 com/github/fommil/netlib/NativeRefLAPACK.class
>>>>>>>> >   6640 Tue Jul 07 15:22:10 EDT 2015 com/github/fommil/netlib/NativeSystemARPACK.class
>>>>>>>> >  21138 Tue Jul 07 15:22:10 EDT 2015 com/github/fommil/netlib/NativeSystemBLAS.class
>>>>>>>> > 178349 Tue Jul 07 15:22:10 EDT 2015 com/github/fommil/netlib/NativeSystemLAPACK.class
>>>>>>>> >
>>>>>>>> > Also I see the following in /usr/lib64
>>>>>>>> >
>>>>>>>> >> ls /usr/lib64/libblas.
>>>>>>>> > libblas.a  libblas.so  libblas.so.3  libblas.so.3.2  libblas.so.3.2.1
>>>>>>>> >
>>>>>>>> >> ls /usr/lib64/liblapack
>>>>>>>> > liblapack.a  liblapack_pic.a  liblapack.so  liblapack.so.3
>>>>>>>> > liblapack.so.3.2  liblapack.so.3.2.1
>>>>>>>> >
>>>>>>>> > But I still see the following in the Spark logs:
>>>>>>>> >
>>>>>>>> > 15/07/07 15:36:25 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
>>>>>>>> > 15/07/07 15:36:25 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
>>>>>>>> > 15/07/07 15:36:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
>>>>>>>> > 15/07/07 15:36:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
>>>>>>>> >
>>>>>>>> > Anything in this process I missed?
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Arun
