Also sometimes hitting this error when spark-shell is used:

Caused by: edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
  at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:770)
  at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:298)
  at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:263)
  at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:97)
  at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:77)
  at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:59)
  at edu.stanford.nlp.pipeline.AnnotatorFactories$4.create(AnnotatorFactories.java:290)
  ... 114 more
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
  at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:485)
  at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:765)
On Sun, Sep 18, 2016 at 12:27 PM, janardhan shetty <[email protected]> wrote:
> Using: spark-shell --packages databricks:spark-corenlp:0.2.0-s_2.11
>
> On Sun, Sep 18, 2016 at 12:26 PM, janardhan shetty <[email protected]> wrote:
>
>> Hi Jacek,
>>
>> Thanks for your response. This is the code I am trying to execute:
>>
>> import org.apache.spark.sql.functions._
>> import com.databricks.spark.corenlp.functions._
>>
>> val inputd = Seq(
>>   (1, "<xml>Stanford University is located in California. </xml>")
>> ).toDF("id", "text")
>>
>> val output = inputd.select(cleanxml(col("text"))).withColumnRenamed("UDF(text)", "text")
>>
>> val out = output.select(lemma(col("text"))).withColumnRenamed("UDF(text)", "text")
>>
>> output.show() works.
>>
>> The error happens when I execute out.show()
>>
>> On Sun, Sep 18, 2016 at 11:58 AM, Jacek Laskowski <[email protected]> wrote:
>>
>>> Hi Janardhan,
>>>
>>> Can you share the code that you execute? What's the command? Mind
>>> sharing the complete project on github?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> ----
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>> On Sun, Sep 18, 2016 at 8:01 PM, janardhan shetty <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > I am trying to use lemmatization as a transformer and added the below to the build.sbt:
>>> >
>>> >   "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0",
>>> >   "com.google.protobuf" % "protobuf-java" % "2.6.1",
>>> >   "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0" % "test" classifier "models",
>>> >   "org.scalatest" %% "scalatest" % "2.2.6" % "test"
>>> >
>>> > Error:
>>> > Exception in thread "main" java.lang.NoClassDefFoundError:
>>> > edu/stanford/nlp/pipeline/StanfordCoreNLP
>>> >
>>> > I have tried other versions of this spark package.
>>> >
>>> > Any help is appreciated.
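[Editor's note] Both errors in this thread point at the runtime classpath rather than the code: the NoClassDefFoundError for edu/stanford/nlp/pipeline/StanfordCoreNLP suggests the CoreNLP jar itself is missing at run time, and the "Unable to open .../english-left3words-distsim.tagger" IOException is what CoreNLP raises when the separate models jar is absent. One plausible cause in the build.sbt quoted above is that the models classifier artifact is scoped to "test", so it never reaches the main runtime classpath. A sketch of a build.sbt fragment with that scope removed (an assumption, not a confirmed fix from the thread):

```scala
// build.sbt (sketch): keep CoreNLP *and* its models artifact on the
// compile/runtime classpath. The original thread scoped the models jar
// to "test", which would leave it unavailable when the app actually runs.
libraryDependencies ++= Seq(
  "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0",
  // no "% test" here: the models jar must be visible at run time
  "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0" classifier "models",
  "com.google.protobuf" % "protobuf-java" % "2.6.1",
  "org.scalatest" %% "scalatest" % "2.2.6" % "test"
)
```

For the spark-shell case, `--packages` takes plain groupId:artifactId:version coordinates and, as far as I know, cannot request a classifier artifact such as the models jar; a common workaround is to download stanford-corenlp-3.6.0-models.jar manually and pass it explicitly via `--jars` alongside the spark-corenlp package.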
