pickling error with PySpark and Elasticsearch-py analyzer

2015-08-22 Thread pkphlam
The job fails with a PicklingError. I'm not sure what the error means. Am I doing something wrong? Is there a way to map the ES analyze function onto records of an RDD?
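The usual cause of this PicklingError is an Elasticsearch client captured in a map closure: the client object cannot be pickled, so Spark cannot ship the function to the workers. A minimal sketch of the common workaround, creating the client inside mapPartitions so it is built on each worker instead of being pickled (the host, analyzer, and sample data below are illustrative, not from the thread):

    # Build the Elasticsearch client per partition, on the worker,
    # so it never has to be pickled by the driver.
    from elasticsearch import Elasticsearch
    from pyspark import SparkContext

    sc = SparkContext(appName="es-analyze")

    def analyze_partition(texts):
        es = Elasticsearch(["localhost:9200"])  # illustrative host
        for text in texts:
            # indices.analyze runs the given ES analyzer over the text
            yield es.indices.analyze(body={"analyzer": "standard",
                                           "text": text})

    rdd = sc.parallelize(["quick brown fox", "lazy dog"])
    print(rdd.mapPartitions(analyze_partition).collect())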

Re: error with pyspark

2014-08-11 Thread Baoqiang Cao
Thanks Davies and Ron! It indeed was due to the ulimit issue. Thanks a lot! Best, Baoqiang Cao Blog: http://baoqiang.org Email: bqcaom...@gmail.com On Aug 11, 2014, at 3:08 AM, Ron Gonzalez wrote: > If you're running on Ubuntu, do ulimit -n, which gives the max number of allowed open files.

Re: error with pyspark

2014-08-11 Thread Ron Gonzalez
If you're running on Ubuntu, do ulimit -n, which gives the max number of allowed open files. You will have to change the value in /etc/security/limits.conf to a larger number, then log out and log back in. Thanks, Ron Sent from my iPad > On Aug 10, 2014, at 10:19 PM, Davies Liu wrote:
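For reference, a sketch of the limits.conf change being described; the username and limit here are illustrative, not values from the thread:

    # /etc/security/limits.conf
    # Raise the max open files (nofile) for the user running Spark.
    sparkuser  soft  nofile  65536
    sparkuser  hard  nofile  65536

After logging back in, ulimit -n should report the new limit.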

Re: error with pyspark

2014-08-10 Thread Davies Liu
On Fri, Aug 8, 2014 at 9:12 AM, Baoqiang Cao wrote:
> Hi There
> I ran into a problem and can’t find a solution.
> I was running bin/pyspark < ../python/wordcount.py

You could use bin/spark-submit ../python/wordcount.py instead.

> The wordcount.py is here:
> ===

error with pyspark

2014-08-08 Thread Baoqiang Cao
Hi There

I ran into a problem and can’t find a solution. I was running:

    bin/pyspark < ../python/wordcount.py

The wordcount.py is here:

    import sys
    from operator import add
    from pyspark import SparkContext

    datafile = '/mnt/data/m1.txt'
    sc = SparkContext(
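The script is cut off at sc = SparkContext(. For context, a minimal sketch of how a word-count script of this shape typically continues; the app name and the printing at the end are illustrative, not the poster's actual code:

    import sys
    from operator import add
    from pyspark import SparkContext

    datafile = '/mnt/data/m1.txt'
    sc = SparkContext(appName="WordCount")

    # Split each line into words, pair each word with 1, and sum the counts.
    counts = (sc.textFile(datafile)
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(add))

    for word, count in counts.collect():
        print("%s: %d" % (word, count))

    sc.stop()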