Hi,
I solved the issue by extracting the method into another class.
Failure:
Class extract.py contains the whole implementation. Because of this, the
driver tries to serialize the spaCy (English) object and send it to the
executors, and that is where I hit the pickling exception.
Success:
Class extract
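For reference, a minimal sketch of what the working layout looks like. The
module name phrase_extractor.py and the lazy-load helper are illustrative,
not my exact code; the point is just that the model gets loaded on the
executors instead of being captured and pickled by the driver.

# phrase_extractor.py -- kept separate from the driver script and shipped
# to the executors (e.g. via --py-files).
import spacy

_nlp = None

def _get_nlp():
    # Load the model lazily, once per Python worker process, so the driver
    # never has to pickle the spaCy object itself.
    global _nlp
    if _nlp is None:
        _nlp = spacy.load('en')
    return _nlp

def getPhrases(content):
    doc = _get_nlp()(str(content))
    return [chunk.text for chunk in doc.noun_chunks]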
So you left out the exception. I'm also not sure how well spaCy serializes,
so to debug this I would start off by moving the nlp = spacy.load('en') line
inside of the function and see if it still fails.
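Something like the sketch below is what I mean. Reloading the model on every
call is slow, so this is only a debugging step to check whether the pickling
error goes away once the model is no longer part of the closure that gets
serialized (the function body here is my guess at your code, not a tested
fix):

import spacy

def getPhrases(content):
    # Loading inside the function means only the function is pickled and
    # shipped to the executors, not the spaCy model object.
    nlp = spacy.load('en')
    doc = nlp(str(content))
    return [chunk.text for chunk in doc.noun_chunks]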
On Thu, Feb 15, 2018 at 9:08 PM Selvam Raman wrote:
import spacy
nlp = spacy.load('en')
def getPhrases(content):
    phrases = []
    doc = nlp(str(content))
    for chunks in doc.noun_chunks:
        phrases.append(chunks.text)
    return phrases
The above function retrieves the noun phrases from the content and returns
them as a list.
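For context, I call it from PySpark roughly like this (the SparkSession and
DataFrame setup below are placeholders, not my exact code); this is the kind
of call where the pickling error below shows up, since nlp lives at module
level and gets pulled into the function's closure:

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StringType

spark = SparkSession.builder.appName("noun-phrases").getOrCreate()
df = spark.createDataFrame(
    [("The quick brown fox jumps over the lazy dog.",)], ["content"])

# Wrap getPhrases as a UDF that returns an array of strings.
getPhrases_udf = udf(getPhrases, ArrayType(StringType()))
df.withColumn("phrases", getPhrases_udf("content")).show(truncate=False)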
pyspark - 2.2.1
spacy - 2.0.7
python - 3.6
Placing the full logs here:
Traceback (most recent call last):
File
"/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyspark/cloudpickle.py",
line 148, in dump
return Pickler.dump(self, obj)
File
"/Library/Frameworks/Pyt