Apologies for being late to this thread.
On 02/12/10 17:39, Sascha Wolfer wrote:
I seem to have a problem with the openNLP package. I'm stuck at the
very beginning. Here's what I did:
> install.packages("openNLP")
> install.packages("openNLPmodels.de", repos =
"http://datacube.wu.ac.at/", type = "source")
> library(openNLPmodels.de)
> library(openNLP)
So I installed the main package as well as the supplementary German
model. Now I try to use the "sentDetect" function:
> s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit
Gedankenstrich. Ist das nicht toll?")
> sentDetect(s, language = "de", model = "openNLPmodels.de")
I get the following error message which I can't make any sense of:
Error in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader",
.jnew("java.io.File", :
java.io.FileNotFoundException: openNLPmodels.de (No such file or
directory)
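Judging by the FileNotFoundException, the model argument is treated as a
path to a model file rather than a package name, so the correct syntax
seems to be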
sentDetect(s, model = system.file("models", "de-sent.bin", package =
"openNLPmodels.de"))
but unfortunately I get
Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", :
java.io.UTFDataFormatException: malformed input around byte 48
YMMV, but you get the idea of the syntax for the model= argument. This
"works", although it applies the English model to the German text:
sentDetect(s, model = system.file("models", "sentdetect", "EnglishSD.bin.gz", package =
"openNLPmodels.en"))
# [1] "Das hier ist ein Satz. "
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich. "
# [3] "Ist das nicht toll?"
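
In case it is useful, here is a minimal sketch of how one might check the
model path before handing it to sentDetect. The helper name resolve_model
is hypothetical (it is not part of openNLP); it only relies on the fact
that system.file() returns an empty string when the file cannot be found.
Whether a given model file then loads correctly is a separate question,
as the UTFDataFormatException above shows.

```r
# Hypothetical helper (not part of openNLP): resolve a file shipped with an
# installed package and fail with a readable message if it is missing.
resolve_model <- function(package, ...) {
  path <- system.file(..., package = package)
  # system.file() returns "" when the package or the file is not found
  if (!nzchar(path)) {
    stop("no such file in package '", package, "': ", file.path(...))
  }
  path
}

# Only run the openNLP part if the model package is actually installed.
if (requireNamespace("openNLPmodels.de", quietly = TRUE)) {
  # List what the model package ships, to find the right file name
  print(list.files(system.file("models", package = "openNLPmodels.de"),
                   recursive = TRUE))
  # File name taken from the call above; adjust it to what the listing shows
  model <- resolve_model("openNLPmodels.de", "models", "de-sent.bin")
  print(model)
}
```

The resulting path can then be passed as sentDetect(s, model = model).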
Hope this helps you a little.
Allan
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help