The issue has been fixed. After a lot of R&D, I finally found the pretty simple thing causing this problem.

It was a permission issue on the Python libraries: the user I was logged in as did not have permission to read/execute the following Python library paths.

/usr/lib/python2.7/site-packages/
/usr/lib64/python2.7/

Both paths must have read/execute permission for the user executing the python/pyspark program.
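For anyone who hits the same error, here is a quick way to check for this before digging deeper (a minimal sketch; the two paths are from my setup, so adjust them to your own layout):

import os

# verify the current user can read and traverse each library directory
for path in ["/usr/lib/python2.7/site-packages/", "/usr/lib64/python2.7/"]:
    ok = os.access(path, os.R_OK | os.X_OK)
    print("%s readable+executable: %s" % (path, ok))

If either line prints False, having an administrator grant read/execute on that directory tree should clear the ImportError.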
Thanks everyone for your help with this. Appreciated!

Regards

On Sun, Jun 5, 2016 at 12:04 AM, Daniel Rodriguez <df.rodriguez...@gmail.com> wrote:

> Like people have said, you need numpy on all the nodes of the cluster. The
> easiest way, in my opinion, is to use Anaconda:
> https://www.continuum.io/downloads but that can get tricky to manage on
> multiple nodes if you don't have some configuration-management skills.
>
> How are you deploying the Spark cluster? If you are using Cloudera, I
> recommend the Anaconda Parcel:
> http://blog.cloudera.com/blog/2016/02/making-python-on-apache-hadoop-easier-with-anaconda-and-cdh/
>
> On 4 Jun 2016, at 11:13, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>
> Hi,
>
> I think the solution is really simple. Just download Anaconda (if you pay
> for the licensed version, you will eventually feel like you are in heaven
> when you move to CI and CD and live in a world where you have a data
> product actually running in real life).
>
> Then start the pyspark program by including the following:
>
> PYSPARK_PYTHON=<<path to your anaconda installation>>/anaconda2/bin/python2.7 PATH=$PATH:<<path to your anaconda installation>>/anaconda/bin <<path to your pyspark>>/pyspark
>
> :)
>
> In case you are using it on EMR, the solution is a bit tricky. Just let me
> know in case you want any further help.
>
> Regards,
> Gourav Sengupta
>
> On Thu, Jun 2, 2016 at 7:59 PM, Eike von Seggern <eike.segg...@sevenval.com> wrote:
>
>> Hi,
>>
>> are you using Spark on one machine or many?
>>
>> If on many, are you sure numpy is correctly installed on all machines?
>>
>> To check that the environment is set up correctly, you can try something
>> like:
>>
>> import os
>> pythonpaths = sc.range(10).map(lambda i: os.environ.get("PYTHONPATH")).collect()
>> print(pythonpaths)
>>
>> HTH
>>
>> Eike
>>
>> 2016-06-02 15:32 GMT+02:00 Bhupendra Mishra <bhupendra.mis...@gmail.com>:
>>
>>> That did not resolve it. :(
>>>
>>> On Thu, Jun 2, 2016 at 3:01 PM, Sergio Fernández <wik...@apache.org> wrote:
>>>
>>>> On Thu, Jun 2, 2016 at 9:59 AM, Bhupendra Mishra <bhupendra.mis...@gmail.com> wrote:
>>>>>
>>>>> and I have already exported the environment variable in spark-env.sh as
>>>>> follows... the error is still there: ImportError: No module named numpy
>>>>>
>>>>> export PYSPARK_PYTHON=/usr/bin/python
>>>>>
>>>>
>>>> According to the documentation at
>>>> http://spark.apache.org/docs/latest/configuration.html#environment-variables
>>>> the PYSPARK_PYTHON environment variable is for pointing to the Python
>>>> interpreter binary.
>>>>
>>>> If you check the programming guide
>>>> https://spark.apache.org/docs/0.9.0/python-programming-guide.html#installing-and-configuring-pyspark
>>>> it says you need to add your custom path to PYTHONPATH (the script
>>>> automatically adds bin/pyspark there).
>>>>
>>>> So typically on Linux you would need to add the following (assuming you
>>>> installed numpy there):
>>>>
>>>> export PYTHONPATH=$PYTHONPATH:/usr/lib/python2.7/dist-packages
>>>>
>>>> Hope that helps.
>>>>
>>>>> On Thu, Jun 2, 2016 at 12:04 AM, Julio Antonio Soto de Vicente <ju...@esbet.es> wrote:
>>>>>
>>>>>> Try adding to spark-env.sh (renaming it if it still has .template at
>>>>>> the end):
>>>>>>
>>>>>> PYSPARK_PYTHON=/path/to/your/bin/python
>>>>>>
>>>>>> where your bin/python is your actual Python environment with numpy
>>>>>> installed.
>>>>>>
>>>>>> On 1 Jun 2016, at 20:16, Bhupendra Mishra <bhupendra.mis...@gmail.com> wrote:
>>>>>>
>>>>>> I have numpy installed, but where should I set up PYTHONPATH?
>>>>>>
>>>>>> On Wed, Jun 1, 2016 at 11:39 PM, Sergio Fernández <wik...@apache.org> wrote:
>>>>>>
>>>>>>> sudo pip install numpy
>>>>>>>
>>>>>>> On Wed, Jun 1, 2016 at 5:56 PM, Bhupendra Mishra <bhupendra.mis...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> How can this be resolved?
>>>>>>>>
>>>>>>>> On Wed, Jun 1, 2016 at 9:02 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>>>>
>>>>>>>>> Generally this means numpy isn't installed on the system, or your
>>>>>>>>> PYTHONPATH has somehow gotten pointed somewhere odd.
>>>>>>>>>
>>>>>>>>> On Wed, Jun 1, 2016 at 8:31 AM, Bhupendra Mishra <bhupendra.mis...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Could anyone please help me with the following error?
>>>>>>>>>>
>>>>>>>>>> File "/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py", line 25, in <module>
>>>>>>>>>> ImportError: No module named numpy
>>>>>>>>>>
>>>>>>>>>> Thanks in advance!
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Cell: 425-233-8271
>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>> --
>>>> Sergio Fernández
>>>> Partner Technology Manager
>>>> Redlink GmbH
>>>> m: +43 6602747925
>>>> e: sergio.fernan...@redlink.co
>>>> w: http://redlink.co
>>
>> --
>> Jan Eike von Seggern
>> Data Scientist
>> Sevenval Technologies GmbH
>> Köpenicker Straße 154 | 10997 Berlin
>> office +49 30 707 190 - 229
>> mail eike.segg...@sevenval.com
>> www.sevenval.com
>> Registered office: Cologne, HRB 79823
>> Management: Jan Webering (CEO), Thorsten May, Sascha Langfus, Joern-Carlos Kuntze
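PS: building on Eike's PYTHONPATH check above, here is a sketch that confirms numpy is importable, with one consistent version, on every executor (assuming an active SparkContext named sc):

# import numpy inside each task and collect the distinct versions seen
versions = (sc.range(8, numSlices=8)
            .map(lambda _: __import__("numpy").__version__)
            .distinct()
            .collect())
print(versions)  # a single entry means every executor sees the same numpy

If a node is still missing numpy, this job fails with the same ImportError, which confirms the problem is on the executors rather than the driver.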