Re: numpy + pyspark

2014-06-27 Thread Avishek Saha
Thanks for the great links, guys -- let me check out/try both!

On 27 June 2014 08:15, Shannon Quinn wrote:
> I suppose along those lines, there's also Anaconda:
> https://store.continuum.io/cshop/anaconda/
>
> On 6/27/14, 11:13 AM, Nick Pentreath wrote:
> > Hadoopy uses http://www.pyinstall

Re: numpy + pyspark

2014-06-27 Thread Shannon Quinn
I suppose along those lines, there's also Anaconda: https://store.continuum.io/cshop/anaconda/

On 6/27/14, 11:13 AM, Nick Pentreath wrote:
> Hadoopy uses http://www.pyinstaller.org/ to package things up into an executable that should be runnable without root privileges. It says it supports numpy

Re: numpy + pyspark

2014-06-27 Thread Nick Pentreath
Hadoopy uses http://www.pyinstaller.org/ to package things up into an executable that should be runnable without root privileges. It says it supports numpy.

On Fri, Jun 27, 2014 at 5:08 PM, Shannon Quinn wrote:
> Would deploying virtualenv on each directory on the cluster be viable?
> The depend

Re: numpy + pyspark

2014-06-27 Thread Shannon Quinn
Would deploying virtualenv on each directory on the cluster be viable? The dependencies would get tricky, but I think this is the sort of situation it's built for.

On 6/27/14, 11:06 AM, Avishek Saha wrote:
> I too felt the same, Nick, but I don't have root privileges on the cluster, unfortunately.
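
A minimal sketch of the no-root idea above, in Python. It uses the standard-library venv module as a stand-in for the virtualenv tool named in the message (a substitution, not what the poster ran), and create_user_env and its path argument are hypothetical names chosen for illustration.

```python
import os
import venv


def create_user_env(env_dir):
    """Create an isolated Python environment under a user-writable path.

    No root privileges are needed as long as env_dir is writable.
    with_pip=False keeps the sketch independent of ensurepip; a real
    setup would want pip available in order to install numpy.
    """
    venv.create(env_dir, with_pip=False)
    # pyvenv.cfg is the marker file written into every venv.
    return os.path.join(env_dir, "pyvenv.cfg")
```

On a cluster, the same directory would have to exist on every worker node, and the workers would have to be pointed at that environment's interpreter, for example through Spark's PYSPARK_PYTHON environment variable. Whether this is practical depends on the cluster setup, which the thread does not describe.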

Re: numpy + pyspark

2014-06-27 Thread Avishek Saha
I too felt the same, Nick, but I don't have root privileges on the cluster, unfortunately. Are there any alternatives?

On 27 June 2014 08:04, Nick Pentreath wrote:
> I've not tried this - but numpy is a tricky and complex package with many
> dependencies on Fortran/C libraries etc. I'd say by the

Re: numpy + pyspark

2014-06-27 Thread Nick Pentreath
I've not tried this - but numpy is a tricky and complex package with many dependencies on Fortran/C libraries etc. I'd say by the time you figure out how to deploy numpy correctly in this manner, you may as well have just built it into your cluster bootstrap process, or installed it on each node with PSSH...
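
One way to see why numpy is harder to ship than a pure-Python package: Python's zip importer cannot load compiled extension modules, so a package that contains them will not import from a zip archive. The sketch below lists the compiled files inside an unpacked package directory. find_compiled_extensions is a hypothetical helper, not part of numpy or Spark.

```python
import os
from importlib.machinery import EXTENSION_SUFFIXES


def find_compiled_extensions(package_dir):
    """Return sorted paths of compiled extension modules under package_dir.

    EXTENSION_SUFFIXES holds the file suffixes this interpreter accepts
    for extension modules (for example '.so' on Linux, '.pyd' on Windows).
    """
    suffixes = tuple(EXTENSION_SUFFIXES)
    found = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            if name.endswith(suffixes):
                found.append(os.path.join(root, name))
    return sorted(found)
```

An empty result suggests the package is pure Python and a reasonable candidate for zipping. A non-empty result, which is what one would expect for an installed numpy, suggests it needs a real per-node install.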

Re: numpy + pyspark

2014-06-27 Thread Avishek Saha
To clarify, I tried it and it almost worked -- but I am getting some problems from the random module in numpy. If anyone has successfully passed a numpy module (via the --py-files option) to spark-submit, then please let me know.

Thanks!
Avishek

On 26 June 2014 17:45, Avishek Saha wrote:
> Hi
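
For narrowing down a failure like the one described above, a small import probe can help. This is only a sketch: probe_module is a hypothetical helper, and the thread does not say what the actual error was.

```python
import importlib


def probe_module(name):
    """Try to import a module by name and report what happened.

    Returns a dict with the module name, an 'ok' flag, and either the
    file the module was loaded from or the repr of the import error.
    """
    try:
        mod = importlib.import_module(name)
    except Exception as exc:  # extension modules can fail with more than ImportError
        return {"module": name, "ok": False, "detail": repr(exc)}
    return {"module": name, "ok": True,
            "detail": getattr(mod, "__file__", "built-in")}
```

Assuming an existing SparkContext named sc, something like sc.parallelize(range(8), 8).map(lambda _: probe_module("numpy.random")).collect() would run the probe on the executors and show whether the submodule loads there and from which path. The partition count 8 is arbitrary.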