This is an automated email from the ASF dual-hosted git repository.
mawiesne pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opennlp.git
The following commit(s) were added to refs/heads/main by this push:
new 32fa977b OPENNLP-855: Add SentimentDetector to derive sentiment from
text (#579)
32fa977b is described below
commit 32fa977b274dc1fb74304bb76fe154ae92677426
Author: Martin Wiesner <[email protected]>
AuthorDate: Sun Mar 22 15:53:39 2026 +0100
OPENNLP-855: Add SentimentDetector to derive sentiment from text (#579)
* OPENNLP-855: Add SentimentDetector to derive sentiment from text
- adapts existing Sentiment code to OpenNLP 3.x module structures
- introduces SentimentDetector API interface
- cleans up Sentiment Analysis implementation and adds tests
- fixes broken sequence labeling code (find/predict2) from SentimentME and
SentimentDetector; sentiment is a classification task, not sequence labeling
- removes getSentimentModel() from SentimentModel (wrapped MaxentModel in
unused BeamSearch)
- adds toString/equals/hashCode to SentimentSample, removes unused id field
- fixes SentimentSampleTypeFilter to actually filter by sentiment type
- fixes SentimentDetailedFMeasureListener.asSpanArray() returning null
- removes dead detailedFListener code in CLI tools
- adds 44 unit tests covering all runtime sentiment classes
- enhances dev manual in docs
- fine-tunes test classes
---------
Co-authored-by: amensiko <[email protected]>
Co-authored-by: Richard Zowalla <[email protected]>
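
Illustration only, not part of the commit: the SentimentDetector interface added below exposes two predict methods, and the commit message stresses that sentiment is a classification task (one label per input) rather than sequence labeling. A minimal sketch of that contract follows. The nested SentimentDetector here is a local stand-in that mirrors the two method signatures from the diff so the example compiles on its own. KeywordSentimentDetector, its word lists, the whitespace tokenization, and the label names "positive", "negative", and "neutral" are all hypothetical. The implementation shipped in the commit is SentimentME, which is trained from data and is not reproduced here.

```java
import java.util.Locale;
import java.util.Set;

final class SentimentDetectorSketch {

  // Local stand-in mirroring the two methods of
  // opennlp.tools.sentiment.SentimentDetector shown in the diff.
  interface SentimentDetector {
    String predict(String sentence);

    String predict(String[] tokens);
  }

  // Hypothetical keyword-based detector, for illustration only.
  static final class KeywordSentimentDetector implements SentimentDetector {

    // Assumed word lists; a trained model would learn such cues from data.
    private static final Set<String> POSITIVE = Set.of("good", "great", "love");
    private static final Set<String> NEGATIVE = Set.of("bad", "awful", "hate");

    @Override
    public String predict(String sentence) {
      // Assumption: plain whitespace tokenization is sufficient for the sketch.
      return predict(sentence.trim().split("\\s+"));
    }

    @Override
    public String predict(String[] tokens) {
      int score = 0;
      for (String token : tokens) {
        String normalized = token.toLowerCase(Locale.ROOT);
        if (POSITIVE.contains(normalized)) {
          score++;
        } else if (NEGATIVE.contains(normalized)) {
          score--;
        }
      }
      // Classification: exactly one label for the whole input.
      if (score > 0) {
        return "positive";
      }
      if (score < 0) {
        return "negative";
      }
      return "neutral";
    }
  }

  public static void main(String[] args) {
    SentimentDetector detector = new KeywordSentimentDetector();
    System.out.println(detector.predict("I love this great library"));
    System.out.println(detector.predict(new String[] {"What", "an", "awful", "day"}));
  }
}
```

Callers that already hold tokenized text can use the String[] overload directly; the String overload simply delegates to it in this sketch, so both paths return the same single label.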
---
.../opennlp/tools/sentiment/SentimentDetector.java | 37 ++++
.../sentiment/SentimentEvaluationMonitor.java | 29 +++
.../opennlp/tools/sentiment/SentimentSample.java | 108 ++++++++++
.../src/main/java/opennlp/tools/cmdline/CLI.java | 8 +
.../sentiment/SentimentCrossValidatorTool.java | 112 ++++++++++
.../SentimentDetailedFMeasureListener.java | 43 ++++
.../SentimentEvaluationErrorListener.java | 62 ++++++
.../cmdline/sentiment/SentimentEvaluatorTool.java | 146 +++++++++++++
.../cmdline/sentiment/SentimentModelLoader.java | 49 +++++
.../cmdline/sentiment/SentimentTrainerTool.java | 110 ++++++++++
.../tools/cmdline/StreamFactoryRegistry.java | 3 +
.../formats/SentimentSampleStreamFactory.java | 80 +++++++
.../tools/sentiment/SentimentContextGenerator.java | 83 ++++++++
.../tools/sentiment/SentimentCrossValidator.java | 236 +++++++++++++++++++++
.../tools/sentiment/SentimentEvaluator.java | 66 ++++++
.../tools/sentiment/SentimentEventStream.java | 85 ++++++++
.../opennlp/tools/sentiment/SentimentFactory.java | 72 +++++++
.../java/opennlp/tools/sentiment/SentimentME.java | 118 +++++++++++
.../opennlp/tools/sentiment/SentimentModel.java | 106 +++++++++
.../tools/sentiment/SentimentSampleStream.java | 76 +++++++
.../tools/sentiment/SentimentSampleTypeFilter.java | 76 +++++++
.../sentiment/SentimentContextGeneratorTest.java | 69 ++++++
.../sentiment/SentimentCrossValidatorTest.java | 87 ++++++++
.../tools/sentiment/SentimentEvaluatorTest.java | 182 ++++++++++++++++
.../tools/sentiment/SentimentEventStreamTest.java | 81 +++++++
.../tools/sentiment/SentimentFactoryTest.java | 54 +++++
.../opennlp/tools/sentiment/SentimentMETest.java | 210 ++++++++++++++++++
.../tools/sentiment/SentimentSampleStreamTest.java | 76 +++++++
.../tools/sentiment/SentimentSampleTest.java | 99 +++++++++
.../sentiment/SentimentSampleTypeFilterTest.java | 85 ++++++++
.../resources/opennlp/tools/sentiment/train.txt | 32 +++
opennlp-docs/src/docbkx/introduction.xml | 7 +-
opennlp-docs/src/docbkx/opennlp.xml | 1 +
opennlp-docs/src/docbkx/sentiment.xml | 186 ++++++++++++++++
34 files changed, 2871 insertions(+), 3 deletions(-)
diff --git a/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentDetector.java b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentDetector.java
new file mode 100644
index 00000000..3c0db433
--- /dev/null
+++ b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentDetector.java
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+public interface SentimentDetector {
+
+ /**
+ * Conducts a sentiment prediction for the specified sentence.
+ *
+ * @param sentence The text to be analysed for its sentiment.
+ * @return The predicted sentiment.
+ */
+ String predict(String sentence);
+
+ /**
+ * Conducts a sentiment prediction for the specified tokens.
+ *
+ * @param tokens The tokens of the text to be analysed for its sentiment.
+ * @return The predicted sentiment.
+ */
+ String predict(String[] tokens);
+}
diff --git a/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentEvaluationMonitor.java b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentEvaluationMonitor.java
new file mode 100644
index 00000000..932dcf4c
--- /dev/null
+++ b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentEvaluationMonitor.java
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import opennlp.tools.util.eval.EvaluationMonitor;
+
+/**
+ * A sentiment-specific {@link EvaluationMonitor} to be used by the evaluator.
+ *
+ * @see SentimentSample
+ */
+public interface SentimentEvaluationMonitor extends EvaluationMonitor<SentimentSample> {
+
+}
diff --git a/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentSample.java b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentSample.java
new file mode 100644
index 00000000..9937be82
--- /dev/null
+++ b/opennlp-api/src/main/java/opennlp/tools/sentiment/SentimentSample.java
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.Serial;
+import java.util.List;
+import java.util.Objects;
+
+import opennlp.tools.commons.Sample;
+
+/**
+ * Class for holding text used for sentiment analysis.
+ */
+public class SentimentSample implements Sample {
+
+ @Serial
+ private static final long serialVersionUID = 2477213313738337539L;
+
+ private final String sentiment;
+ private final List<String> sentence;
+ private final boolean isClearAdaptiveData;
+
+ /**
+ * Instantiates a {@link SentimentSample} object.
+ *
+ * @param sentiment
+ * training sentiment
+ * @param sentence
+ * training sentence
+ */
+ public SentimentSample(String sentiment, String[] sentence) {
+ this(sentiment, sentence, true);
+ }
+
+ public SentimentSample(String sentiment, String[] sentence,
+ boolean clearAdaptiveData) {
+ if (sentiment == null) {
+ throw new IllegalArgumentException("sentiment must not be null");
+ }
+ if (sentence == null) {
+ throw new IllegalArgumentException("sentence must not be null");
+ }
+
+ this.sentiment = sentiment;
+ this.sentence = List.of(sentence);
+ this.isClearAdaptiveData = clearAdaptiveData;
+ }
+
+ /**
+ * @return Returns the sentiment.
+ */
+ public String getSentiment() {
+ return sentiment;
+ }
+
+ /**
+ * @return Returns the sentence.
+ */
+ public String[] getSentence() {
+ return sentence.toArray(new String[0]);
+ }
+
+ /**
+ * @return Returns the value of isClearAdaptiveData, {@code true} or {@code false}.
+ */
+ public boolean isClearAdaptiveDataSet() {
+ return isClearAdaptiveData;
+ }
+
+ @Override
+ public String toString() {
+ return sentiment + " " + String.join(" ", sentence);
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (this == obj) {
+ return true;
+ }
+ if (obj == null || getClass() != obj.getClass()) {
+ return false;
+ }
+ SentimentSample that = (SentimentSample) obj;
+ return Objects.equals(sentiment, that.sentiment)
+ && Objects.equals(sentence, that.sentence);
+ }
+
+ @Override
+ public int hashCode() {
+ return Objects.hash(sentiment, sentence);
+ }
+
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/CLI.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/CLI.java
index 7904561c..58921218 100644
--- a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/CLI.java
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/CLI.java
@@ -74,6 +74,9 @@ import opennlp.tools.cmdline.sentdetect.SentenceDetectorCrossValidatorTool;
import opennlp.tools.cmdline.sentdetect.SentenceDetectorEvaluatorTool;
import opennlp.tools.cmdline.sentdetect.SentenceDetectorTool;
import opennlp.tools.cmdline.sentdetect.SentenceDetectorTrainerTool;
+import opennlp.tools.cmdline.sentiment.SentimentCrossValidatorTool;
+import opennlp.tools.cmdline.sentiment.SentimentEvaluatorTool;
+import opennlp.tools.cmdline.sentiment.SentimentTrainerTool;
import opennlp.tools.cmdline.tokenizer.DictionaryDetokenizerTool;
import opennlp.tools.cmdline.tokenizer.SimpleTokenizerTool;
import opennlp.tools.cmdline.tokenizer.TokenizerConverterTool;
@@ -173,6 +176,11 @@ public final class CLI {
// Entity Linker
tools.add(new EntityLinkerTool());
+
+ // Sentiment Analysis Parser
+ tools.add(new SentimentTrainerTool());
+ tools.add(new SentimentEvaluatorTool());
+ tools.add(new SentimentCrossValidatorTool());
// Language Model
tools.add(new NGramLanguageModelTool());
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentCrossValidatorTool.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentCrossValidatorTool.java
new file mode 100644
index 00000000..e138f217
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentCrossValidatorTool.java
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.List;
+
+import opennlp.tools.cmdline.AbstractCrossValidatorTool;
+import opennlp.tools.cmdline.CmdLineUtil;
+import opennlp.tools.cmdline.TerminateToolException;
+import opennlp.tools.cmdline.params.BasicTrainingParams;
+import opennlp.tools.cmdline.params.CVParams;
+import opennlp.tools.cmdline.sentiment.SentimentCrossValidatorTool.CVToolParams;
+import opennlp.tools.sentiment.SentimentCrossValidator;
+import opennlp.tools.sentiment.SentimentEvaluationMonitor;
+import opennlp.tools.sentiment.SentimentFactory;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.util.eval.EvaluationMonitor;
+import opennlp.tools.util.model.ModelUtil;
+
+/**
+ * Class for helping perform cross validation on the Sentiment Analysis Parser.
+ */
+public class SentimentCrossValidatorTool
+ extends AbstractCrossValidatorTool<SentimentSample, CVToolParams> {
+
+ /**
+ * Interface for parameters
+ */
+ interface CVToolParams extends BasicTrainingParams, CVParams {
+
+ }
+
+ /**
+ * Constructor
+ */
+ public SentimentCrossValidatorTool() {
+ super(SentimentSample.class, CVToolParams.class);
+ }
+
+ /**
+ * Returns the short description of the tool
+ *
+ * @return short description
+ */
+ public String getShortDescription() {
+ return "K-fold cross validator for the learnable Sentiment Analysis Parser";
+ }
+
+ /**
+ * Runs the tool
+ *
+ * @param format
+ * the format to be used
+ * @param args
+ * the arguments
+ */
+ public void run(String format, String[] args) {
+ super.run(format, args);
+
+ mlParams = CmdLineUtil.loadTrainingParameters(params.getParams(), true);
+ if (mlParams == null) {
+ mlParams = ModelUtil.createDefaultTrainingParameters();
+ }
+
+ List<EvaluationMonitor<SentimentSample>> listeners = new LinkedList<>();
+ if (params.getMisclassified()) {
+ listeners.add(new SentimentEvaluationErrorListener());
+ }
+ SentimentFactory sentimentFactory = new SentimentFactory();
+
+ SentimentCrossValidator validator;
+ try {
+ validator = new SentimentCrossValidator(params.getLang(), mlParams, sentimentFactory,
+ listeners.toArray(new SentimentEvaluationMonitor[listeners.size()]));
+ validator.evaluate(sampleStream, params.getFolds());
+ } catch (IOException e) {
+ throw new TerminateToolException(-1,
+ "IO error while reading training data or indexing data: "
+ + e.getMessage(),
+ e);
+ } finally {
+ try {
+ sampleStream.close();
+ } catch (IOException e) {
+ // sorry that this can fail
+ }
+ }
+
+ System.out.println("done");
+
+ System.out.println();
+ System.out.println(validator.getFMeasure());
+ }
+
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentDetailedFMeasureListener.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentDetailedFMeasureListener.java
new file mode 100644
index 00000000..fda23f35
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentDetailedFMeasureListener.java
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import opennlp.tools.cmdline.DetailedFMeasureListener;
+import opennlp.tools.sentiment.SentimentEvaluationMonitor;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.util.Span;
+
+/**
+ * Class for creating a detailed F-Measure listener
+ */
+public class SentimentDetailedFMeasureListener
+ extends DetailedFMeasureListener<SentimentSample>
+ implements SentimentEvaluationMonitor {
+
+ /**
+ * Returns the sentiment sample as a span array
+ *
+ * @param sample
+ * the sentiment sample to be returned
+ * @return span array of the sample
+ */
+ @Override
+ protected Span[] asSpanArray(SentimentSample sample) {
+ return new Span[] { new Span(0, 0, sample.getSentiment()) };
+ }
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluationErrorListener.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluationErrorListener.java
new file mode 100644
index 00000000..dabe2f70
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluationErrorListener.java
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import java.io.OutputStream;
+
+import opennlp.tools.cmdline.EvaluationErrorPrinter;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.util.eval.EvaluationMonitor;
+
+/**
+ * Class for creating an evaluation error listener.
+ */
+public class SentimentEvaluationErrorListener
+ extends EvaluationErrorPrinter<SentimentSample>
+ implements EvaluationMonitor<SentimentSample> {
+
+ /**
+ * Constructor
+ */
+ public SentimentEvaluationErrorListener() {
+ super(System.err);
+ }
+
+ /**
+ * Constructor
+ */
+ protected SentimentEvaluationErrorListener(OutputStream outputStream) {
+ super(outputStream);
+ }
+
+ /**
+ * Prints the error in case of a misclassification in the evaluator
+ *
+ * @param reference
+ * the sentiment sample reference to be used
+ * @param prediction
+ * the sentiment sample prediction
+ */
+ @Override
+ public void misclassified(SentimentSample reference, SentimentSample prediction) {
+ printError(new String[] { reference.getSentiment() },
+ new String[] { prediction.getSentiment() }, reference, prediction,
+ reference.getSentence());
+ }
+
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluatorTool.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluatorTool.java
new file mode 100644
index 00000000..b743af48
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentEvaluatorTool.java
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.List;
+
+import opennlp.tools.cmdline.AbstractEvaluatorTool;
+import opennlp.tools.cmdline.ArgumentParser.OptionalParameter;
+import opennlp.tools.cmdline.ArgumentParser.ParameterDescription;
+import opennlp.tools.cmdline.PerformanceMonitor;
+import opennlp.tools.cmdline.TerminateToolException;
+import opennlp.tools.cmdline.params.EvaluatorParams;
+import opennlp.tools.cmdline.sentiment.SentimentEvaluatorTool.EvalToolParams;
+import opennlp.tools.sentiment.SentimentEvaluationMonitor;
+import opennlp.tools.sentiment.SentimentEvaluator;
+import opennlp.tools.sentiment.SentimentME;
+import opennlp.tools.sentiment.SentimentModel;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.sentiment.SentimentSampleTypeFilter;
+import opennlp.tools.util.ObjectStream;
+import opennlp.tools.util.eval.EvaluationMonitor;
+
+/**
+ * Class for creating an evaluation tool for sentiment analysis.
+ *
+ * @see EvalToolParams
+ * @see SentimentSample
+ */
+public class SentimentEvaluatorTool
+ extends AbstractEvaluatorTool<SentimentSample, EvalToolParams> {
+
+ /**
+ * Interface for parameters to be used in evaluation
+ */
+ interface EvalToolParams extends EvaluatorParams {
+ @OptionalParameter
+ @ParameterDescription(valueName = "types", description = "name types to use for evaluation")
+ String getNameTypes();
+ }
+
+ /**
+ * Constructor
+ */
+ public SentimentEvaluatorTool() {
+ super(SentimentSample.class, EvalToolParams.class);
+ }
+
+ /**
+ * Returns the short description of the tool
+ *
+ * @return short description
+ */
+ public String getShortDescription() {
+ return "Measures the performance of the Sentiment model with the reference data";
+ }
+
+ /**
+ * Runs the tool
+ *
+ * @param format
+ * the format to be used
+ * @param args
+ * the arguments
+ */
+ public void run(String format, String[] args) {
+ super.run(format, args);
+
+ SentimentModel model = new SentimentModelLoader().load(params.getModel());
+ // TODO: check EvalToolParams --> getNameTypes()
+
+ List<EvaluationMonitor<SentimentSample>> listeners = new LinkedList<>();
+ if (params.getMisclassified()) {
+ listeners.add(new SentimentEvaluationErrorListener());
+ }
+ if (params.getNameTypes() != null) {
+ String[] nameTypes = params.getNameTypes().split(",");
+ sampleStream = new SentimentSampleTypeFilter(nameTypes, sampleStream);
+ }
+
+ SentimentEvaluator evaluator = new SentimentEvaluator(new SentimentME(model),
+ listeners.toArray(new SentimentEvaluationMonitor[listeners.size()]));
+
+ final PerformanceMonitor monitor = new PerformanceMonitor("sent");
+
+ ObjectStream<SentimentSample> measuredSampleStream = new ObjectStream<SentimentSample>() {
+
+ @Override
+ public SentimentSample read() throws IOException {
+ SentimentSample sample = sampleStream.read();
+ if (sample != null) {
+ monitor.incrementCounter();
+ }
+ return sample;
+ }
+
+ @Override
+ public void reset() throws IOException {
+ sampleStream.reset();
+ }
+
+ @Override
+ public void close() throws IOException {
+ sampleStream.close();
+ }
+ };
+
+ monitor.startAndPrintThroughput();
+
+ try {
+ evaluator.evaluate(measuredSampleStream);
+ } catch (IOException e) {
+ System.err.println("failed");
+ throw new TerminateToolException(-1,
+ "IO error while reading test data: " + e.getMessage(), e);
+ } finally {
+ try {
+ measuredSampleStream.close();
+ } catch (IOException e) {
+ // sorry that this can fail
+ }
+ }
+
+ monitor.stopAndPrintFinalResult();
+
+ System.out.println();
+ System.out.println(evaluator.getFMeasure());
+ }
+
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentModelLoader.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentModelLoader.java
new file mode 100644
index 00000000..e5e3e20c
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentModelLoader.java
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import java.io.IOException;
+import java.io.InputStream;
+
+import opennlp.tools.cmdline.ModelLoader;
+import opennlp.tools.sentiment.SentimentModel;
+
+/**
+ * Class for loading a sentiment model.
+ */
+public class SentimentModelLoader extends ModelLoader<SentimentModel> {
+
+ /**
+ * Constructor
+ */
+ public SentimentModelLoader() {
+ super("Sentiment");
+ }
+
+ /**
+ * Loads the sentiment model
+ *
+ * @param modelIn
+ * the input stream model
+ * @return the model
+ */
+ @Override
+ protected SentimentModel loadModel(InputStream modelIn) throws IOException {
+ return new SentimentModel(modelIn);
+ }
+}
diff --git a/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentTrainerTool.java b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentTrainerTool.java
new file mode 100644
index 00000000..a690f4cf
--- /dev/null
+++ b/opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/sentiment/SentimentTrainerTool.java
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.cmdline.sentiment;
+
+import java.io.File;
+import java.io.IOException;
+
+import opennlp.tools.cmdline.AbstractTrainerTool;
+import opennlp.tools.cmdline.CLI;
+import opennlp.tools.cmdline.CmdLineUtil;
+import opennlp.tools.cmdline.TerminateToolException;
+import opennlp.tools.cmdline.params.TrainingToolParams;
+import opennlp.tools.sentiment.SentimentFactory;
+import opennlp.tools.sentiment.SentimentME;
+import opennlp.tools.sentiment.SentimentModel;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.util.model.ModelUtil;
+
+/**
+ * Class for helping train a sentiment analysis model.
+ */
+public class SentimentTrainerTool
+ extends AbstractTrainerTool<SentimentSample, TrainingToolParams> {
+
+ /**
+ * Constructor
+ */
+ public SentimentTrainerTool() {
+ super(SentimentSample.class, TrainingToolParams.class);
+ }
+
+ /**
+ * Runs the trainer
+ *
+ * @param format
+ * the format to be used
+ * @param args
+ * the arguments
+ */
+ @Override
+ public void run(String format, String[] args) {
+ super.run(format, args);
+ if (0 == args.length) {
+ System.out.println(getHelp());
+ } else {
+
+ mlParams = CmdLineUtil.loadTrainingParameters(params.getParams(), false);
+ if (mlParams == null) {
+ mlParams = ModelUtil.createDefaultTrainingParameters();
+ }
+
+ File modelOutFile = params.getModel();
+
+ CmdLineUtil.checkOutputFile("sentiment analysis model", modelOutFile);
+
+ SentimentModel model;
+ try {
+ SentimentFactory factory = new SentimentFactory();
+ model = SentimentME.train(params.getLang(), sampleStream, mlParams, factory);
+ } catch (IOException e) {
+ throw new TerminateToolException(-1,
+ "IO error while reading training data or indexing data: " + e.getMessage(), e);
+ } finally {
+ try {
+ sampleStream.close();
+ } catch (IOException e) {
+ // sorry that this can fail
+ }
+ }
+
+ CmdLineUtil.writeModel("sentiment analysis", modelOutFile, model);
+ }
+ }
+
+ /**
+ * Returns the help message
+ *
+ * @return the message
+ */
+ @Override
+ public String getHelp() {
+ return "Usage: " + CLI.CMD + " " + getName() + " model < documents";
+ }
+
+ /**
+ * Returns the short description of the programme
+ *
+ * @return the description
+ */
+ @Override
+ public String getShortDescription() {
+ return "learnable sentiment analysis";
+ }
+
+}
diff --git a/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/cmdline/StreamFactoryRegistry.java b/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/cmdline/StreamFactoryRegistry.java
index cfa46f1f..09cfef02 100644
--- a/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/cmdline/StreamFactoryRegistry.java
+++ b/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/cmdline/StreamFactoryRegistry.java
@@ -36,6 +36,7 @@ import opennlp.tools.formats.LemmatizerSampleStreamFactory;
import opennlp.tools.formats.NameSampleDataStreamFactory;
import opennlp.tools.formats.ParseSampleStreamFactory;
import opennlp.tools.formats.SentenceSampleStreamFactory;
+import opennlp.tools.formats.SentimentSampleStreamFactory;
import opennlp.tools.formats.TokenSampleStreamFactory;
import opennlp.tools.formats.TwentyNewsgroupSampleStreamFactory;
import opennlp.tools.formats.WordTagSampleStreamFactory;
@@ -142,6 +143,8 @@ public final class StreamFactoryRegistry {
MascPOSSampleStreamFactory.registerFactory();
MascSentenceSampleStreamFactory.registerFactory();
MascTokenSampleStreamFactory.registerFactory();
+
+ SentimentSampleStreamFactory.registerFactory();
}
public static final String DEFAULT_FORMAT = "opennlp";
diff --git a/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/SentimentSampleStreamFactory.java b/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/SentimentSampleStreamFactory.java
new file mode 100644
index 00000000..e39c66c9
--- /dev/null
+++ b/opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/SentimentSampleStreamFactory.java
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.formats;
+
+import java.io.IOException;
+
+import opennlp.tools.cmdline.ArgumentParser;
+import opennlp.tools.cmdline.StreamFactoryRegistry;
+import opennlp.tools.cmdline.TerminateToolException;
+import opennlp.tools.cmdline.params.BasicFormatParams;
+import opennlp.tools.sentiment.SentimentSample;
+import opennlp.tools.sentiment.SentimentSampleStream;
+import opennlp.tools.util.InputStreamFactory;
+import opennlp.tools.util.ObjectStream;
+import opennlp.tools.util.PlainTextByLineStream;
+
+/**
+ * Factory for creating {@link SentimentSample} streams for sentiment analysis.
+ *
+ * @see SentimentSample
+ */
+public class SentimentSampleStreamFactory<P> extends AbstractSampleStreamFactory<SentimentSample, P> {
+
+ /**
+ * Instantiates a {@link SentimentSampleStreamFactory} object.
+ *
+ * @param params
+ * any given parameters
+ */
+ protected SentimentSampleStreamFactory(Class<P> params) {
+ super(params);
+ }
+
+ /**
+ * Creates a sentiment sample stream.
+ *
+ * @param args
+ * the necessary arguments
+ * @return A {@link SentimentSample} stream.
+ */
+ @Override
+ public ObjectStream<SentimentSample> create(String[] args) {
+ BasicFormatParams params = ArgumentParser.parse(args, BasicFormatParams.class);
+
+ FormatUtil.checkInputFile("Data", params.getData());
+ ObjectStream<String> lineStream;
+ try {
+ InputStreamFactory sampleDataIn = FormatUtil.createInputStreamFactory(params.getData());
+ lineStream = new PlainTextByLineStream(sampleDataIn, params.getEncoding());
+ } catch (IOException ex) {
+ throw new TerminateToolException(-1,
+ "IO Error while creating an Input Stream: " + ex.getMessage(), ex);
+ }
+ return new SentimentSampleStream(lineStream);
+ }
+
+ /**
+ * Registers a {@link SentimentSample} stream factory for the default format.
+ */
+ public static void registerFactory() {
+ StreamFactoryRegistry.registerFactory(SentimentSample.class, StreamFactoryRegistry.DEFAULT_FORMAT,
+ new SentimentSampleStreamFactory<>(BasicFormatParams.class));
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentContextGenerator.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentContextGenerator.java
new file mode 100644
index 00000000..a8a41972
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentContextGenerator.java
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import opennlp.tools.util.BeamSearchContextGenerator;
+import opennlp.tools.util.featuregen.AdaptiveFeatureGenerator;
+
+/**
+ * Class for using a Context Generator for Sentiment Analysis.
+ */
+public class SentimentContextGenerator
+ implements BeamSearchContextGenerator<String> {
+
+ private final AdaptiveFeatureGenerator[] featureGenerators;
+
+ public SentimentContextGenerator() {
+ this(new AdaptiveFeatureGenerator[0]);
+ }
+
+ public SentimentContextGenerator(
+ AdaptiveFeatureGenerator[] featureGenerators) {
+ this.featureGenerators = featureGenerators;
+ }
+
+ /**
+ * Returns the context
+ *
+ * @param text
+ * the given text to be returned as context
+ * @return the text (the context)
+ */
+ public String[] getContext(String[] text) {
+ return text;
+ }
+
+ /**
+ * Returns the context.
+ *
+ * @param index
+ * the index of the context
+ * @param sequence
+ * String sequence given
+ * @param priorDecisions
+ * decisions given earlier
+ * @param additionalContext
+ * any additional context
+ * @return the context
+ */
+ @Override
+ public String[] getContext(int index, String[] sequence,
+ String[] priorDecisions, Object[] additionalContext) {
+ return new String[] {};
+ }
+
+ public void updateAdaptiveData(String[] tokens, String[] outcomes) {
+
+ if (tokens != null && outcomes != null
+ && tokens.length != outcomes.length) {
+ throw new IllegalArgumentException(
+ "The tokens and outcome arrays MUST have the same size!");
+ }
+
+ for (AdaptiveFeatureGenerator featureGenerator : featureGenerators) {
+ featureGenerator.updateAdaptiveData(tokens, outcomes);
+ }
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentCrossValidator.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentCrossValidator.java
new file mode 100644
index 00000000..1329412b
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentCrossValidator.java
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+
+import opennlp.tools.util.FilterObjectStream;
+import opennlp.tools.util.ObjectStream;
+import opennlp.tools.util.TrainingParameters;
+import opennlp.tools.util.eval.CrossValidationPartitioner;
+import opennlp.tools.util.eval.FMeasure;
+
+/**
+ * Class for performing cross validation on the Sentiment Analysis Parser.
+ */
+public class SentimentCrossValidator {
+
+ /**
+ * Class for creating a document sample
+ */
+ private static class DocumentSample {
+
+ private final SentimentSample[] samples;
+
+ /**
+ * Constructor
+ */
+ DocumentSample(SentimentSample[] samples) {
+ this.samples = samples;
+ }
+
+ /**
+ * Returns the samples that make up this document.
+ *
+ * @return the samples
+ */
+ private SentimentSample[] getSamples() {
+ return samples;
+ }
+ }
+
+ /**
+ * Reads Sentiment Samples to group them as a document based on the clear
+ * adaptive data flag.
+ */
+ private static class SentimentToDocumentSampleStream
+ extends FilterObjectStream<SentimentSample, DocumentSample> {
+
+ private SentimentSample beginSample;
+
+ /**
+ * Constructor
+ */
+ protected SentimentToDocumentSampleStream(ObjectStream<SentimentSample> samples) {
+ super(samples);
+ }
+
+ /**
+ * Reads Sentiment Samples to group them as a document
+ *
+ * @return the resulting DocumentSample
+ */
+ @Override
+ public DocumentSample read() throws IOException {
+
+ List<SentimentSample> document = new ArrayList<>();
+
+ if (beginSample == null) {
+ // Assume that the clear flag is set
+ beginSample = samples.read();
+ }
+
+ // Underlying stream is exhausted!
+ if (beginSample == null) {
+ return null;
+ }
+
+ document.add(beginSample);
+
+ SentimentSample sample;
+ while ((sample = samples.read()) != null) {
+
+ if (sample.isClearAdaptiveDataSet()) {
+ beginSample = sample;
+ break;
+ }
+
+ document.add(sample);
+ }
+
+ // Underlying stream is exhausted,
+ // next call must return null
+ if (sample == null) {
+ beginSample = null;
+ }
+
+ return new DocumentSample(document.toArray(new SentimentSample[0]));
+ }
+
+ /**
+ * Performs a reset
+ */
+ @Override
+ public void reset() throws IOException, UnsupportedOperationException {
+ super.reset();
+ beginSample = null;
+ }
+ }
+
+ /**
+ * Splits DocumentSample into SentimentSamples.
+ */
+ private static class DocumentToSentimentSampleStream
+ extends FilterObjectStream<DocumentSample, SentimentSample> {
+
+ /**
+ * Constructor
+ */
+ protected DocumentToSentimentSampleStream(
+ ObjectStream<DocumentSample> samples) {
+ super(samples);
+ }
+
+ private Iterator<SentimentSample> documentSamples = Collections.emptyIterator();
+
+ /**
+ * Reads the next {@link SentimentSample} from the current document.
+ *
+ * @return the next SentimentSample, or {@code null} if the stream is exhausted
+ */
+ @Override
+ public SentimentSample read() throws IOException {
+
+ // Note: Empty document samples should be skipped
+
+ if (documentSamples.hasNext()) {
+ return documentSamples.next();
+ } else {
+ DocumentSample docSample = samples.read();
+
+ if (docSample != null) {
+ documentSamples = Arrays.asList(docSample.getSamples()).iterator();
+
+ return read();
+ } else {
+ return null;
+ }
+ }
+ }
+ }
+
+ private final String languageCode;
+ private final TrainingParameters params;
+ private final SentimentEvaluationMonitor[] listeners;
+
+ private final SentimentFactory factory;
+ private final FMeasure fmeasure = new FMeasure();
+
+ /**
+ * Constructor
+ */
+ public SentimentCrossValidator(String lang, TrainingParameters params,
+ SentimentFactory factory, SentimentEvaluationMonitor[] monitors) {
+
+ this.languageCode = lang;
+ this.factory = factory;
+ this.params = params;
+ this.listeners = monitors;
+ }
+
+ /**
+ * Performs evaluation
+ *
+ * @param samples
+ * stream of SentimentSamples
+ * @param nFolds
+ * the number of folds to be used in cross validation
+ */
+ public void evaluate(ObjectStream<SentimentSample> samples, int nFolds)
+ throws IOException {
+
+ // Note: The sentiment samples need to be grouped on a document basis.
+
+ CrossValidationPartitioner<DocumentSample> partitioner = new CrossValidationPartitioner<>(
+ new SentimentToDocumentSampleStream(samples), nFolds);
+
+ while (partitioner.hasNext()) {
+
+ CrossValidationPartitioner.TrainingSampleStream<DocumentSample> trainingSampleStream = partitioner
+ .next();
+
+ SentimentModel model = SentimentME.train(languageCode,
+ new DocumentToSentimentSampleStream(trainingSampleStream), params,
+ factory);
+
+ // do testing
+ SentimentEvaluator evaluator = new SentimentEvaluator(
+ new SentimentME(model), listeners);
+
+ evaluator.evaluate(new DocumentToSentimentSampleStream(
+ trainingSampleStream.getTestSampleStream()));
+
+ fmeasure.mergeInto(evaluator.getFMeasure());
+ }
+ }
+
+ /**
+ * Returns the F-Measure
+ *
+ * @return the F-Measure
+ */
+ public FMeasure getFMeasure() {
+ return fmeasure;
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEvaluator.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEvaluator.java
new file mode 100644
index 00000000..ecdb4747
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEvaluator.java
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import opennlp.tools.util.eval.Evaluator;
+import opennlp.tools.util.eval.FMeasure;
+
+/**
+ * The {@link SentimentEvaluator} measures the performance of
+ * the given {@link SentimentME} with the provided reference
+ * {@link SentimentSample}s.
+ *
+ * @see Evaluator
+ * @see SentimentME
+ * @see SentimentSample
+ */
+public class SentimentEvaluator extends Evaluator<SentimentSample> {
+
+ private final FMeasure fmeasure = new FMeasure();
+
+ private final SentimentME sentiment;
+
+ /**
+ * Initializes the current instance.
+ *
+ * @param sentiment The {@link SentimentME} to be used for predicting sentiment.
+ * @param listeners The {@link SentimentEvaluationMonitor evaluation sample listeners}.
+ */
+ public SentimentEvaluator(SentimentME sentiment, SentimentEvaluationMonitor... listeners) {
+ super(listeners);
+ this.sentiment = sentiment;
+ }
+
+ /**
+ * {@inheritDoc}
+ */
+ @Override
+ protected SentimentSample processSample(SentimentSample reference) {
+ String prediction = sentiment.predict(reference.getSentence());
+ String label = reference.getSentiment();
+
+ fmeasure.updateScores(new String[] { label }, new String[] { prediction });
+
+ return new SentimentSample(prediction, reference.getSentence());
+ }
+
+ public FMeasure getFMeasure() {
+ return fmeasure;
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEventStream.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEventStream.java
new file mode 100644
index 00000000..fe1078d0
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentEventStream.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.util.Iterator;
+
+import opennlp.tools.ml.model.Event;
+import opennlp.tools.util.AbstractEventStream;
+import opennlp.tools.util.ObjectStream;
+
+/**
+ * Class for creating events for Sentiment Analysis that are later sent to
+ * MaxEnt.
+ *
+ * @see SentimentSample
+ */
+public class SentimentEventStream extends AbstractEventStream<SentimentSample> {
+
+ private final SentimentContextGenerator contextGenerator;
+
+ /**
+ * Instantiates a {@link SentimentEventStream event stream}.
+ *
+ * @param samples
+ * the sentiment samples to be used
+ * @param createContextGenerator
+ * the context generator to be used
+ */
+ public SentimentEventStream(ObjectStream<SentimentSample> samples,
+ SentimentContextGenerator createContextGenerator) {
+ super(samples);
+ contextGenerator = createContextGenerator;
+ }
+
+ /**
+ * Creates events from {@link SentimentSample sentiment samples}.
+ *
+ * @param sample
+ * the sentiment sample to be used
+ * @return event iterator
+ */
+ @Override
+ protected Iterator<Event> createEvents(final SentimentSample sample) {
+
+ return new Iterator<>() {
+
+ private boolean isVirgin = true;
+
+ @Override
+ public boolean hasNext() {
+ return isVirgin;
+ }
+
+ @Override
+ public Event next() {
+
+ isVirgin = false;
+
+ return new Event(sample.getSentiment(),
+ contextGenerator.getContext(sample.getSentence()));
+ }
+
+ @Override
+ public void remove() {
+ throw new UnsupportedOperationException();
+ }
+ };
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentFactory.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentFactory.java
new file mode 100644
index 00000000..0d0c1c31
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentFactory.java
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import opennlp.tools.tokenize.Tokenizer;
+import opennlp.tools.tokenize.WhitespaceTokenizer;
+import opennlp.tools.util.BaseToolFactory;
+import opennlp.tools.util.InvalidFormatException;
+import opennlp.tools.util.ext.ExtensionLoader;
+
+/**
+ * Factory providing the components used for sentiment analysis training and prediction.
+ */
+public class SentimentFactory extends BaseToolFactory {
+
+ private static final String TOKENIZER_NAME = "sentiment.tokenizer";
+
+ private Tokenizer tokenizer;
+
+ /**
+ * Validates the artifact map --> nothing to validate.
+ */
+ @Override
+ public void validateArtifactMap() throws InvalidFormatException {
+ // nothing to validate
+ }
+
+ /**
+ * Creates a new {@link SentimentContextGenerator context generator}.
+ *
+ * @return a context generator for Sentiment Analysis
+ */
+ public SentimentContextGenerator createContextGenerator() {
+ return new SentimentContextGenerator();
+ }
+
+ /**
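+ * Retrieves the {@link Tokenizer} named in the model manifest, falling back
+ * to {@link WhitespaceTokenizer} if none is configured.
+ * @return The {@link Tokenizer} to use.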
+ */
+ public Tokenizer getTokenizer() {
+ if (this.tokenizer == null) {
+ if (artifactProvider != null) {
+ String className = artifactProvider.getManifestProperty(TOKENIZER_NAME);
+ if (className != null) {
+ this.tokenizer = ExtensionLoader.instantiateExtension(Tokenizer.class, className);
+ }
+ }
+ if (this.tokenizer == null) { // could not load using artifact provider
+ this.tokenizer = WhitespaceTokenizer.INSTANCE;
+ }
+ }
+ return tokenizer;
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentME.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentME.java
new file mode 100644
index 00000000..e159ce26
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentME.java
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import opennlp.tools.ml.EventTrainer;
+import opennlp.tools.ml.TrainerFactory;
+import opennlp.tools.ml.model.Event;
+import opennlp.tools.ml.model.MaxentModel;
+import opennlp.tools.util.ObjectStream;
+import opennlp.tools.util.TrainingParameters;
+
+/**
+ * A {@link SentimentDetector} implementation for creating and using
+ * maximum-entropy-based Sentiment Analysis models.
+ *
+ * @see SentimentModel
+ */
+public class SentimentME implements SentimentDetector {
+
+ private final SentimentContextGenerator contextGenerator;
+ private final SentimentFactory factory;
+ private final MaxentModel maxentModel;
+
+ /**
+ * Instantiates a {@link SentimentME} with the specified model.
+ *
+ * @param sentModel The {@link SentimentModel sentiment analysis model} to use.
+ * It must not be {@code null}.
+ * @throws IllegalArgumentException Thrown if parameters are invalid.
+ */
+ public SentimentME(SentimentModel sentModel) {
+ if (sentModel == null) {
+ throw new IllegalArgumentException("SentimentModel must not be null!");
+ }
+ maxentModel = sentModel.getMaxentModel();
+ factory = sentModel.getFactory();
+ contextGenerator = factory.createContextGenerator();
+ }
+
+ /**
+ * Trains a {@link SentimentModel Sentiment Analysis model}.
+ *
+ * @param languageCode
+ * the code for the language of the text, e.g. "en"
+ * @param samples
+ * the sentiment samples to be used
+ * @param trainParams
+ * parameters for training
+ * @param factory
+ * a Sentiment Analysis factory
+ * @return A valid {@link SentimentModel}.
+ * @throws IOException Thrown if IO errors occurred during training.
+ */
+ public static SentimentModel train(String languageCode, ObjectStream<SentimentSample> samples,
+ TrainingParameters trainParams, SentimentFactory factory)
+ throws IOException {
+
+ Map<String, String> entries = new HashMap<>();
+
+ ObjectStream<Event> eventStream = new SentimentEventStream(samples, factory.createContextGenerator());
+
+ EventTrainer<TrainingParameters> trainer = TrainerFactory.getEventTrainer(trainParams, entries);
+ MaxentModel sentimentModel = trainer.train(eventStream);
+
+ return new SentimentModel(languageCode, sentimentModel, entries, factory);
+ }
+
+ @Override
+ public String predict(String sentence) {
+ String[] tokens = factory.getTokenizer().tokenize(sentence);
+ return predict(tokens);
+ }
+
+ @Override
+ public String predict(String[] tokens) {
+ double[] prob = probabilities(tokens);
+ return getBestSentiment(prob);
+ }
+
+ /**
+ * Returns the best chosen sentiment for the given probability distribution.
+ *
+ * @param outcome the probability distribution over outcomes.
+ * @return the best sentiment label.
+ */
+ public String getBestSentiment(double[] outcome) {
+ return maxentModel.getBestOutcome(outcome);
+ }
+
+ /**
+ * Returns the probability distribution over sentiment labels for the given tokens.
+ *
+ * @param text the tokens to classify.
+ * @return the probability distribution over sentiment labels.
+ */
+ public double[] probabilities(String[] text) {
+ return maxentModel.eval(contextGenerator.getContext(text));
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentModel.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentModel.java
new file mode 100644
index 00000000..641939d5
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentModel.java
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URL;
+import java.util.Map;
+
+import opennlp.tools.ml.model.MaxentModel;
+import opennlp.tools.util.model.BaseModel;
+
+/**
+ * Class for the basis of the Sentiment Analysis model.
+ */
+public class SentimentModel extends BaseModel {
+
+ private static final String COMPONENT_NAME = "SentimentME";
+ private static final String SENTIMENT_MODEL_ENTRY_NAME = "sentiment.model";
+
+ /**
+ * Instantiates a {@link SentimentModel} model.
+ *
+ * @param languageCode
+ * The code for the language of the text, e.g. "en"
+ * @param sentimentModel
+ * A {@link MaxentModel} sentiment model
+ * @param manifestInfoEntries
+ * Additional information in the manifest
+ * @param factory
+ * A {@link SentimentFactory} instance
+ */
+ public SentimentModel(String languageCode, MaxentModel sentimentModel,
+ Map<String, String> manifestInfoEntries, SentimentFactory factory) {
+ super(COMPONENT_NAME, languageCode, manifestInfoEntries, factory);
+ artifactMap.put(SENTIMENT_MODEL_ENTRY_NAME, sentimentModel);
+ checkArtifactMap();
+ }
+
+ /**
+ * Instantiates a {@link SentimentModel} model via a {@link URL} reference.
+ *
+ * @param modelURL
+ * The {@link URL} to a file required to load the model.
+ *
+ * @throws IOException Thrown if IO errors occurred.
+ */
+ public SentimentModel(URL modelURL) throws IOException {
+ super(COMPONENT_NAME, modelURL);
+ }
+
+ /**
+ * Instantiates a {@link SentimentModel} model via a {@link File} reference.
+ *
+ * @param file
+ * The {@link File} required to load the model.
+ *
+ * @throws IOException Thrown if IO errors occurred.
+ */
+ public SentimentModel(File file) throws IOException {
+ super(COMPONENT_NAME, file);
+ }
+
+ /**
+ * Instantiates a {@link SentimentModel} model via an {@link InputStream} reference.
+ *
+ * @param modelIn
+ * The {@link InputStream} required to load the model.
+ *
+ * @throws IOException Thrown if IO errors occurred.
+ */
+ public SentimentModel(InputStream modelIn) throws IOException {
+ super(COMPONENT_NAME, modelIn);
+ }
+
+ /**
+ * @return Retrieves the {@link SentimentFactory} for the model.
+ */
+ public SentimentFactory getFactory() {
+ return (SentimentFactory) this.toolFactory;
+ }
+
+ /**
+ * @return Retrieves the {@link MaxentModel}.
+ */
+ public MaxentModel getMaxentModel() {
+ return (MaxentModel) artifactMap.get(SENTIMENT_MODEL_ENTRY_NAME);
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleStream.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleStream.java
new file mode 100644
index 00000000..e2ca6a3f
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleStream.java
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+
+import opennlp.tools.tokenize.WhitespaceTokenizer;
+import opennlp.tools.util.FilterObjectStream;
+import opennlp.tools.util.ObjectStream;
+
+/**
+ * Converts a stream of text lines into {@link SentimentSample}s, using
+ * whitespace-tokenized text in which the first token is the sentiment label.
+ */
+public class SentimentSampleStream
+ extends FilterObjectStream<String, SentimentSample> {
+
+ /**
+ * Instantiates a {@link SentimentSampleStream} object.
+ *
+ * @param samples
+ * the stream of text lines to convert, one sample per line
+ */
+ public SentimentSampleStream(ObjectStream<String> samples) {
+ super(samples);
+ }
+
+ /**
+ * Reads the text.
+ *
+ * @return A ready-to-be-trained {@link SentimentSample} object.
+ */
+ @Override
+ public SentimentSample read() throws IOException {
+ String sentence = samples.read();
+
+ if (sentence != null) {
+
+ // Whitespace tokenize entire string
+ String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(sentence);
+
+ SentimentSample sample;
+
+ if (tokens.length > 1) {
+ String sentiment = tokens[0];
+ String[] sentTokens = new String[tokens.length - 1];
+ System.arraycopy(tokens, 1, sentTokens, 0, tokens.length - 1);
+
+ sample = new SentimentSample(sentiment, sentTokens);
+ } else {
+ throw new IOException(
+ "Empty lines, or lines with only a category string are not allowed!");
+ }
+
+ return sample;
+ }
+
+ return null;
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleTypeFilter.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleTypeFilter.java
new file mode 100644
index 00000000..984d146e
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentiment/SentimentSampleTypeFilter.java
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+
+import opennlp.tools.util.FilterObjectStream;
+import opennlp.tools.util.ObjectStream;
+
+/**
+ * Filters a stream of samples, keeping only those whose sentiment matches the configured types.
+ *
+ * @see FilterObjectStream
+ * @see SentimentSample
+ */
+public class SentimentSampleTypeFilter
+ extends FilterObjectStream<SentimentSample, SentimentSample> {
+
+ private final Set<String> types;
+
+ /**
+ * Constructor
+ */
+ public SentimentSampleTypeFilter(String[] types, ObjectStream<SentimentSample> samples) {
+ super(samples);
+ this.types = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(types)));
+ }
+
+ /**
+ * Instantiates a {@link SentimentSampleTypeFilter} object.
+ *
+ * @param types
+ * the types to filter.
+ * @param samples
+ * the sentiment samples to be used.
+ */
+ public SentimentSampleTypeFilter(Set<String> types, ObjectStream<SentimentSample> samples) {
+ super(samples);
+ this.types = Set.copyOf(types);
+ }
+
+ /**
+ * @return Reads and returns the next {@link SentimentSample} whose sentiment
+ * matches the configured types, or {@code null} if the stream is exhausted.
+ */
+ @Override
+ public SentimentSample read() throws IOException {
+ SentimentSample sample;
+ while ((sample = samples.read()) != null) {
+ if (types.contains(sample.getSentiment())) {
+ return sample;
+ }
+ }
+ return null;
+ }
+
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentContextGeneratorTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentContextGeneratorTest.java
new file mode 100644
index 00000000..ce14e78f
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentContextGeneratorTest.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Tests for the {@link SentimentContextGenerator} class.
+ */
+public class SentimentContextGeneratorTest {
+
+ @Test
+ void testGetContextReturnsInput() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ String[] tokens = {"I", "love", "this"};
+ Assertions.assertArrayEquals(tokens, cg.getContext(tokens));
+ }
+
+ @Test
+ void testGetContextEmptyArray() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ String[] tokens = {};
+ Assertions.assertArrayEquals(tokens, cg.getContext(tokens));
+ }
+
+ @Test
+ void testGetContextWithIndexReturnsEmpty() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ String[] result = cg.getContext(0, new String[] {"a"}, new String[] {"b"}, null);
+ Assertions.assertNotNull(result);
+ Assertions.assertEquals(0, result.length);
+ }
+
+ @Test
+ void testUpdateAdaptiveDataMismatchThrows() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ Assertions.assertThrows(IllegalArgumentException.class,
+ () -> cg.updateAdaptiveData(new String[] {"a", "b"}, new String[] {"x"}));
+ }
+
+ @Test
+ void testUpdateAdaptiveDataMatchingLengths() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ Assertions.assertDoesNotThrow(
+ () -> cg.updateAdaptiveData(new String[] {"a"}, new String[] {"x"}));
+ }
+
+ @Test
+ void testUpdateAdaptiveDataWithNulls() {
+ SentimentContextGenerator cg = new SentimentContextGenerator();
+ Assertions.assertDoesNotThrow(() -> cg.updateAdaptiveData(null, null));
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentCrossValidatorTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentCrossValidatorTest.java
new file mode 100644
index 00000000..2aaa9c57
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentCrossValidatorTest.java
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.formats.ResourceAsStreamFactory;
+import opennlp.tools.util.InputStreamFactory;
+import opennlp.tools.util.PlainTextByLineStream;
+import opennlp.tools.util.TrainingParameters;
+
+/**
+ * Tests for the {@link SentimentCrossValidator} class.
+ */
+public class SentimentCrossValidatorTest {
+
+ private SentimentSampleStream createSampleStream() throws IOException {
+ InputStreamFactory in = new ResourceAsStreamFactory(
+ SentimentCrossValidatorTest.class, "/opennlp/tools/sentiment/train.txt");
+
+ return new SentimentSampleStream(
+ new PlainTextByLineStream(in, StandardCharsets.UTF_8));
+ }
+
+ @Test
+ void testCrossValidation() throws IOException {
+ SentimentCrossValidator cv = new SentimentCrossValidator("eng",
+ TrainingParameters.defaultParams(), new SentimentFactory(),
+ new SentimentEvaluationMonitor[] {});
+
+ cv.evaluate(createSampleStream(), 2);
+
+ Assertions.assertNotNull(cv.getFMeasure());
+ Assertions.assertTrue(cv.getFMeasure().getRecallScore() >= 0);
+ Assertions.assertTrue(cv.getFMeasure().getPrecisionScore() >= 0);
+ }
+
+ @Test
+ void testCrossValidationWithMonitor() throws IOException {
+ AtomicInteger correctCount = new AtomicInteger();
+ AtomicInteger incorrectCount = new AtomicInteger();
+
+ SentimentEvaluationMonitor monitor = new SentimentEvaluationMonitor() {
+ @Override
+ public void correctlyClassified(SentimentSample reference,
+ SentimentSample prediction) {
+ correctCount.incrementAndGet();
+ }
+
+ @Override
+ public void misclassified(SentimentSample reference,
+ SentimentSample prediction) {
+ incorrectCount.incrementAndGet();
+ }
+ };
+
+ SentimentCrossValidator cv = new SentimentCrossValidator("eng",
+ TrainingParameters.defaultParams(), new SentimentFactory(),
+ new SentimentEvaluationMonitor[] {monitor});
+
+ cv.evaluate(createSampleStream(), 2);
+
+ Assertions.assertTrue(correctCount.get() + incorrectCount.get() > 0,
+ "Monitor should have been called at least once");
+ Assertions.assertNotNull(cv.getFMeasure());
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEvaluatorTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEvaluatorTest.java
new file mode 100644
index 00000000..4b32518a
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEvaluatorTest.java
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.formats.ResourceAsStreamFactory;
+import opennlp.tools.util.InputStreamFactory;
+import opennlp.tools.util.PlainTextByLineStream;
+import opennlp.tools.util.TrainingParameters;
+
+/**
+ * Tests for the {@link SentimentEvaluator} class.
+ */
+public class SentimentEvaluatorTest {
+
+ private static SentimentModel model;
+
+ @BeforeAll
+ static void trainModel() throws IOException {
+ InputStreamFactory in = new ResourceAsStreamFactory(
+ SentimentEvaluatorTest.class, "/opennlp/tools/sentiment/train.txt");
+
+ SentimentSampleStream sampleStream = new SentimentSampleStream(
+ new PlainTextByLineStream(in, StandardCharsets.UTF_8));
+
+ model = SentimentME.train("eng", sampleStream,
+ TrainingParameters.defaultParams(), new SentimentFactory());
+ }
+
+ @Test
+ void testEvaluateSampleCorrectPrediction() {
+ SentimentME me = new SentimentME(model);
+
+ List<SentimentSample> references = new ArrayList<>();
+ List<SentimentSample> predictions = new ArrayList<>();
+ SentimentEvaluationMonitor monitor = new SentimentEvaluationMonitor() {
+ @Override
+ public void correctlyClassified(SentimentSample reference, SentimentSample prediction) {
+ references.add(reference);
+ predictions.add(prediction);
+ }
+
+ @Override
+ public void misclassified(SentimentSample reference, SentimentSample prediction) {
+ references.add(reference);
+ predictions.add(prediction);
+ }
+ };
+
+ SentimentEvaluator evaluator = new SentimentEvaluator(me, monitor);
+
+ SentimentSample sample = new SentimentSample("positive",
+ new String[] {"I", "love", "this", "great", "product"});
+
+ evaluator.evaluateSample(sample);
+
+ Assertions.assertEquals(1, references.size());
+ Assertions.assertEquals(1, predictions.size());
+
+ // Verify the reference sample is the original
+ Assertions.assertEquals("positive", references.getFirst().getSentiment());
+ Assertions.assertArrayEquals(sample.getSentence(), references.getFirst().getSentence());
+
+ // Verify the predicted sample has a valid sentiment and the same sentence
+ SentimentSample predicted = predictions.getFirst();
+ Assertions.assertNotNull(predicted.getSentiment());
+ Assertions.assertArrayEquals(sample.getSentence(), predicted.getSentence());
+
+ Assertions.assertNotNull(evaluator.getFMeasure());
+ Assertions.assertTrue(evaluator.getFMeasure().getRecallScore() > 0);
+ Assertions.assertTrue(evaluator.getFMeasure().getPrecisionScore() > 0);
+ }
+
+ @Test
+ void testGetFMeasureBeforeEvaluation() {
+ SentimentME me = new SentimentME(model);
+ SentimentEvaluator evaluator = new SentimentEvaluator(me);
+
+ Assertions.assertNotNull(evaluator.getFMeasure());
+ // FMeasure with no data is expected to be -1
+ Assertions.assertEquals(-1.0, evaluator.getFMeasure().getFMeasure());
+ }
+
+ @Test
+ void testEvaluateMultipleSamples() {
+ SentimentME me = new SentimentME(model);
+
+ List<SentimentSample> predictions = new ArrayList<>();
+ SentimentEvaluationMonitor monitor = new SentimentEvaluationMonitor() {
+ @Override
+ public void correctlyClassified(SentimentSample reference, SentimentSample prediction) {
+ predictions.add(prediction);
+ }
+
+ @Override
+ public void misclassified(SentimentSample reference, SentimentSample prediction) {
+ predictions.add(prediction);
+ }
+ };
+
+ SentimentEvaluator evaluator = new SentimentEvaluator(me, monitor);
+
+ SentimentSample posSample = new SentimentSample("positive",
+ new String[] {"wonderful", "amazing", "great"});
+ SentimentSample negSample = new SentimentSample("negative",
+ new String[] {"terrible", "horrible", "awful"});
+
+ evaluator.evaluateSample(posSample);
+ evaluator.evaluateSample(negSample);
+
+ Assertions.assertEquals(2, predictions.size());
+
+ // Each predicted sample should have a valid sentiment and the original sentence
+ Assertions.assertNotNull(predictions.get(0).getSentiment());
+ Assertions.assertArrayEquals(posSample.getSentence(), predictions.get(0).getSentence());
+
+ Assertions.assertNotNull(predictions.get(1).getSentiment());
+ Assertions.assertArrayEquals(negSample.getSentence(), predictions.get(1).getSentence());
+
+ Assertions.assertNotNull(evaluator.getFMeasure());
+ Assertions.assertTrue(evaluator.getFMeasure().getRecallScore() > 0);
+ Assertions.assertTrue(evaluator.getFMeasure().getPrecisionScore() > 0);
+ }
+
+ @Test
+ void testProcessSampleReturnsPrediction() {
+ SentimentME me = new SentimentME(model);
+ SentimentEvaluator evaluator = new SentimentEvaluator(me);
+
+ SentimentSample reference = new SentimentSample("positive",
+ new String[] {"I", "love", "this", "great", "product"});
+
+ SentimentSample result = evaluator.processSample(reference);
+
+ Assertions.assertNotNull(result);
+ // The returned sample should contain the model's prediction as sentiment
+ Assertions.assertNotNull(result.getSentiment());
+ Assertions.assertTrue("positive".equals(result.getSentiment())
+ || "negative".equals(result.getSentiment()));
+ // The sentence should be preserved from the reference
+ Assertions.assertArrayEquals(reference.getSentence(), result.getSentence());
+ }
+
+ @Test
+ void testProcessSampleUpdatesScores() {
+ SentimentME me = new SentimentME(model);
+ SentimentEvaluator evaluator = new SentimentEvaluator(me);
+
+ // FMeasure should have no data initially
+ Assertions.assertEquals(-1.0, evaluator.getFMeasure().getFMeasure());
+
+ evaluator.processSample(new SentimentSample("positive",
+ new String[] {"wonderful", "amazing", "great"}));
+
+ // After processing, FMeasure should have been updated
+ Assertions.assertTrue(evaluator.getFMeasure().getRecallScore() > 0);
+ Assertions.assertTrue(evaluator.getFMeasure().getPrecisionScore() > 0);
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEventStreamTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEventStreamTest.java
new file mode 100644
index 00000000..e1f19066
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentEventStreamTest.java
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.ml.model.Event;
+import opennlp.tools.util.ObjectStream;
+import opennlp.tools.util.ObjectStreamUtils;
+
+/**
+ * Tests for the {@link SentimentEventStream} class.
+ */
+public class SentimentEventStreamTest {
+
+ @Test
+ void testEventOutcome() throws IOException {
+ SentimentSample sample = new SentimentSample("positive",
+ new String[] {"I", "love", "this"});
+
+ ObjectStream<Event> eventStream = new SentimentEventStream(
+ ObjectStreamUtils.createObjectStream(sample),
+ new SentimentContextGenerator());
+
+ Event event = eventStream.read();
+ Assertions.assertNotNull(event);
+ Assertions.assertEquals("positive", event.getOutcome());
+ Assertions.assertArrayEquals(new String[] {"I", "love", "this"}, event.getContext());
+ }
+
+ @Test
+ void testMultipleEvents() throws IOException {
+ SentimentSample pos = new SentimentSample("positive",
+ new String[] {"great", "product"});
+ SentimentSample neg = new SentimentSample("negative",
+ new String[] {"terrible", "service"});
+
+ ObjectStream<Event> eventStream = new SentimentEventStream(
+ ObjectStreamUtils.createObjectStream(pos, neg),
+ new SentimentContextGenerator());
+
+ Event first = eventStream.read();
+ Assertions.assertEquals("positive", first.getOutcome());
+
+ Event second = eventStream.read();
+ Assertions.assertEquals("negative", second.getOutcome());
+
+ Assertions.assertNull(eventStream.read());
+ }
+
+ @Test
+ void testOneEventPerSample() throws IOException {
+ SentimentSample sample = new SentimentSample("positive",
+ new String[] {"a", "b", "c", "d", "e"});
+
+ ObjectStream<Event> eventStream = new SentimentEventStream(
+ ObjectStreamUtils.createObjectStream(sample),
+ new SentimentContextGenerator());
+
+ Assertions.assertNotNull(eventStream.read());
+ Assertions.assertNull(eventStream.read());
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentFactoryTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentFactoryTest.java
new file mode 100644
index 00000000..f9890069
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentFactoryTest.java
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.tokenize.WhitespaceTokenizer;
+
+/**
+ * Tests for the {@link SentimentFactory} class.
+ */
+public class SentimentFactoryTest {
+
+ @Test
+ void testCreateContextGenerator() {
+ SentimentFactory factory = new SentimentFactory();
+ SentimentContextGenerator cg = factory.createContextGenerator();
+ Assertions.assertNotNull(cg);
+ }
+
+ @Test
+ void testGetTokenizerDefaultsToWhitespace() {
+ SentimentFactory factory = new SentimentFactory();
+ Assertions.assertSame(WhitespaceTokenizer.INSTANCE, factory.getTokenizer());
+ }
+
+ @Test
+ void testGetTokenizerIsCached() {
+ SentimentFactory factory = new SentimentFactory();
+ Assertions.assertSame(factory.getTokenizer(), factory.getTokenizer());
+ }
+
+ @Test
+ void testValidateArtifactMapDoesNotThrow() {
+ SentimentFactory factory = new SentimentFactory();
+ Assertions.assertDoesNotThrow(factory::validateArtifactMap);
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentMETest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentMETest.java
new file mode 100644
index 00000000..81ba1e97
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentMETest.java
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Path;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.formats.ResourceAsStreamFactory;
+import opennlp.tools.util.InputStreamFactory;
+import opennlp.tools.util.PlainTextByLineStream;
+import opennlp.tools.util.TrainingParameters;
+
+/**
+ * Tests for the {@link SentimentME} class.
+ */
+public class SentimentMETest {
+
+ private static SentimentModel model;
+
+ // SUT
+ private SentimentME sentiment;
+
+ @BeforeAll
+ static void trainModel() throws IOException {
+ InputStreamFactory in = new ResourceAsStreamFactory(
+ SentimentMETest.class, "/opennlp/tools/sentiment/train.txt");
+
+ SentimentSampleStream sampleStream = new SentimentSampleStream(
+ new PlainTextByLineStream(in, StandardCharsets.UTF_8));
+
+ model = SentimentME.train("eng", sampleStream,
+ TrainingParameters.defaultParams(), new SentimentFactory());
+
+ Assertions.assertNotNull(model);
+ }
+
+ @BeforeEach
+ void setup() {
+ sentiment = new SentimentME(model);
+ }
+
+ @Test
+ void testPredictWithTokens() {
+
+ String prediction = sentiment.predict(new String[] {"I", "love", "this", "product"});
+ Assertions.assertNotNull(prediction);
+ Assertions.assertEquals("positive", prediction);
+ }
+
+ @Test
+ void testPredictWithString() {
+ String prediction = sentiment.predict("I love this product");
+ Assertions.assertNotNull(prediction);
+ }
+
+ @Test
+ void testProbabilities() {
+ double[] probs = sentiment.probabilities(new String[] {"great", "amazing", "wonderful"});
+ Assertions.assertNotNull(probs);
+ Assertions.assertTrue(probs.length > 0);
+
+ double sum = 0;
+ for (double p : probs) {
+ Assertions.assertTrue(p >= 0 && p <= 1);
+ sum += p;
+ }
+ Assertions.assertEquals(1.0, sum, 0.01);
+ }
+
+ @Test
+ void testGetBestSentiment() {
+ double[] probs = sentiment.probabilities(new String[] {"terrible", "awful", "bad"});
+ String best = sentiment.getBestSentiment(probs);
+ Assertions.assertNotNull(best);
+ }
+
+ @Test
+ void testModelSerialization() throws IOException {
+ try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
+ model.serialize(out);
+
+ byte[] bytes = out.toByteArray();
+ Assertions.assertTrue(bytes.length > 0);
+
+ final SentimentModel deserialized = new SentimentModel(new ByteArrayInputStream(bytes));
+ Assertions.assertNotNull(deserialized);
+ Assertions.assertEquals("eng", deserialized.getLanguage());
+ Assertions.assertNotNull(deserialized.getMaxentModel());
+ Assertions.assertNotNull(deserialized.getFactory());
+
+ SentimentME me = new SentimentME(deserialized);
+ String prediction = me.predict(new String[] {"love", "great", "happy"});
+ Assertions.assertNotNull(prediction);
+ Assertions.assertEquals("positive", prediction);
+ }
+ }
+
+ @Test
+ void testNullModelThrows() {
+ Assertions.assertThrows(IllegalArgumentException.class, () -> new SentimentME(null));
+ }
+
+ @Test
+ void testModelLanguage() {
+ Assertions.assertEquals("eng", model.getLanguage());
+ }
+
+ @Test
+ void testModelFactory() {
+ Assertions.assertNotNull(model.getFactory());
+ Assertions.assertInstanceOf(SentimentFactory.class, model.getFactory());
+ }
+
+ @Test
+ void testModelMaxentModel() {
+ Assertions.assertNotNull(model.getMaxentModel());
+ }
+
+ @Test
+ void testPredictPositiveSentence() {
+ String prediction = sentiment.predict("I love this wonderful amazing great product");
+ Assertions.assertEquals("positive", prediction);
+ }
+
+ @Test
+ void testPredictReturnsValidLabel() {
+ String prediction = sentiment.predict("I hate this terrible awful horrible product");
+ Assertions.assertTrue("positive".equals(prediction) || "negative".equals(prediction),
+ "Prediction should be a valid sentiment label but was: " + prediction);
+ }
+
+ @Test
+ void testModelFileSerializationRoundTrip() throws IOException {
+ File tempFile = File.createTempFile("sentiment-model", ".bin");
+ tempFile.deleteOnExit();
+
+ model.serialize(tempFile);
+ Assertions.assertTrue(tempFile.length() > 0);
+
+ final SentimentModel deserialized = new SentimentModel(tempFile);
+ Assertions.assertNotNull(deserialized);
+ Assertions.assertEquals("eng", deserialized.getLanguage());
+ Assertions.assertNotNull(deserialized.getMaxentModel());
+ Assertions.assertNotNull(deserialized.getFactory());
+ Assertions.assertTrue(deserialized.isLoadedFromSerialized());
+
+ SentimentME me = new SentimentME(deserialized);
+ String prediction = me.predict(new String[] {"love", "great", "happy"});
+ Assertions.assertNotNull(prediction);
+ }
+
+ @Test
+ void testModelPathSerializationRoundTrip() throws IOException {
+ Path tempPath = File.createTempFile("sentiment-model", ".bin").toPath();
+ tempPath.toFile().deleteOnExit();
+
+ model.serialize(tempPath);
+ Assertions.assertTrue(tempPath.toFile().length() > 0);
+
+ SentimentModel deserialized = new SentimentModel(tempPath.toUri().toURL());
+ Assertions.assertNotNull(deserialized);
+ Assertions.assertEquals("eng", deserialized.getLanguage());
+ Assertions.assertNotNull(deserialized.getMaxentModel());
+ Assertions.assertNotNull(deserialized.getFactory());
+
+ SentimentME me = new SentimentME(deserialized);
+ String prediction = me.predict(new String[] {"terrible", "bad"});
+ Assertions.assertNotNull(prediction);
+ }
+
+ @Test
+ void testModelNotLoadedFromSerialized() {
+ Assertions.assertFalse(model.isLoadedFromSerialized());
+ }
+
+ @Test
+ void testModelIsLoadedFromSerialized() throws IOException {
+ try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
+ model.serialize(out);
+
+ final SentimentModel deserialized = new SentimentModel(
+ new ByteArrayInputStream(out.toByteArray()));
+ Assertions.assertTrue(deserialized.isLoadedFromSerialized());
+ }
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleStreamTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleStreamTest.java
new file mode 100644
index 00000000..fdbb37f8
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleStreamTest.java
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.util.ObjectStreamUtils;
+
+/**
+ * Tests for the {@link SentimentSampleStream} class.
+ */
+public class SentimentSampleStreamTest {
+
+ @Test
+ void testReadSample() throws IOException {
+ SentimentSampleStream stream = new SentimentSampleStream(
+ ObjectStreamUtils.createObjectStream("positive I love this"));
+
+ SentimentSample sample = stream.read();
+ Assertions.assertNotNull(sample);
+ Assertions.assertEquals("positive", sample.getSentiment());
+ Assertions.assertArrayEquals(new String[] {"I", "love", "this"}, sample.getSentence());
+ }
+
+ @Test
+ void testReadMultipleSamples() throws IOException {
+ SentimentSampleStream stream = new SentimentSampleStream(
+ ObjectStreamUtils.createObjectStream(
+ "positive I love this",
+ "negative I hate this"));
+
+ SentimentSample first = stream.read();
+ Assertions.assertNotNull(first);
+ Assertions.assertEquals("positive", first.getSentiment());
+
+ SentimentSample second = stream.read();
+ Assertions.assertNotNull(second);
+ Assertions.assertEquals("negative", second.getSentiment());
+
+ Assertions.assertNull(stream.read());
+ }
+
+ @Test
+ void testReadEmptyStreamReturnsNull() throws IOException {
+ SentimentSampleStream stream = new SentimentSampleStream(
+ ObjectStreamUtils.createObjectStream());
+
+ Assertions.assertNull(stream.read());
+ }
+
+ @Test
+ void testSingleTokenLineThrows() {
+ SentimentSampleStream stream = new SentimentSampleStream(
+ ObjectStreamUtils.createObjectStream("onlysentiment"));
+
+ Assertions.assertThrows(IOException.class, stream::read);
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTest.java
new file mode 100644
index 00000000..9cd693b1
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTest.java
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Tests for the {@link SentimentSample} class.
+ */
+public class SentimentSampleTest {
+
+ @Test
+ void testCreation() {
+ SentimentSample sample = new SentimentSample("positive",
+ new String[] {"I", "love", "this"});
+
+ Assertions.assertEquals("positive", sample.getSentiment());
+ Assertions.assertArrayEquals(new String[] {"I", "love", "this"}, sample.getSentence());
+ Assertions.assertTrue(sample.isClearAdaptiveDataSet());
+ }
+
+ @Test
+ void testCreationWithClearAdaptiveDataFalse() {
+ SentimentSample sample = new SentimentSample("negative",
+ new String[] {"I", "hate", "this"}, false);
+
+ Assertions.assertEquals("negative", sample.getSentiment());
+ Assertions.assertArrayEquals(new String[] {"I", "hate", "this"}, sample.getSentence());
+ Assertions.assertFalse(sample.isClearAdaptiveDataSet());
+ }
+
+ @Test
+ void testNullSentimentThrows() {
+ Assertions.assertThrows(IllegalArgumentException.class,
+ () -> new SentimentSample(null, new String[] {"test"}));
+ }
+
+ @Test
+ void testNullSentenceThrows() {
+ Assertions.assertThrows(IllegalArgumentException.class,
+ () -> new SentimentSample("positive", null));
+ }
+
+ @Test
+ void testSentenceIsDefensiveCopy() {
+ String[] original = {"a", "b", "c"};
+ SentimentSample sample = new SentimentSample("positive", original);
+ String[] returned = sample.getSentence();
+ returned[0] = "modified";
+ Assertions.assertEquals("a", sample.getSentence()[0]);
+ }
+
+ @Test
+ void testEmptySentence() {
+ SentimentSample sample = new SentimentSample("neutral", new String[] {});
+ Assertions.assertEquals(0, sample.getSentence().length);
+ }
+
+ @Test
+ void testToString() {
+ SentimentSample sample = new SentimentSample("positive",
+ new String[] {"I", "love", "this"});
+ Assertions.assertEquals("positive I love this", sample.toString());
+ }
+
+ @Test
+ void testEquals() {
+ SentimentSample a = new SentimentSample("positive", new String[] {"a", "b"});
+ SentimentSample b = new SentimentSample("positive", new String[] {"a", "b"});
+ SentimentSample c = new SentimentSample("negative", new String[] {"a", "b"});
+
+ Assertions.assertEquals(a, b);
+ Assertions.assertNotEquals(a, c);
+ Assertions.assertNotEquals(null, a);
+ }
+
+ @Test
+ void testHashCode() {
+ SentimentSample a = new SentimentSample("positive", new String[] {"a", "b"});
+ SentimentSample b = new SentimentSample("positive", new String[] {"a", "b"});
+ Assertions.assertEquals(a.hashCode(), b.hashCode());
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTypeFilterTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTypeFilterTest.java
new file mode 100644
index 00000000..04bc57c9
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/sentiment/SentimentSampleTypeFilterTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.tools.sentiment;
+
+import java.io.IOException;
+import java.util.Set;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import opennlp.tools.util.ObjectStreamUtils;
+
+/**
+ * Tests for the {@link SentimentSampleTypeFilter} class.
+ */
+public class SentimentSampleTypeFilterTest {
+
+ @Test
+ void testFilterMatchingType() throws IOException {
+ SentimentSample pos = new SentimentSample("positive", new String[] {"great"});
+ SentimentSample neg = new SentimentSample("negative", new String[] {"bad"});
+
+ SentimentSampleTypeFilter filter = new SentimentSampleTypeFilter(
+ new String[] {"positive"},
+ ObjectStreamUtils.createObjectStream(pos, neg));
+
+ SentimentSample result = filter.read();
+ Assertions.assertNotNull(result);
+ Assertions.assertEquals("positive", result.getSentiment());
+
+ Assertions.assertNull(filter.read());
+ }
+
+ @Test
+ void testFilterMultipleTypes() throws IOException {
+ SentimentSample pos = new SentimentSample("positive", new String[] {"great"});
+ SentimentSample neu = new SentimentSample("neutral", new String[] {"ok"});
+ SentimentSample neg = new SentimentSample("negative", new String[] {"bad"});
+
+ SentimentSampleTypeFilter filter = new SentimentSampleTypeFilter(
+ Set.of("positive", "negative"),
+ ObjectStreamUtils.createObjectStream(pos, neu, neg));
+
+ SentimentSample first = filter.read();
+ Assertions.assertEquals("positive", first.getSentiment());
+
+ SentimentSample second = filter.read();
+ Assertions.assertEquals("negative", second.getSentiment());
+
+ Assertions.assertNull(filter.read());
+ }
+
+ @Test
+ void testFilterNoMatch() throws IOException {
+ SentimentSample neg = new SentimentSample("negative", new String[] {"bad"});
+
+ SentimentSampleTypeFilter filter = new SentimentSampleTypeFilter(
+ new String[] {"positive"}, ObjectStreamUtils.createObjectStream(neg));
+
+ Assertions.assertNull(filter.read());
+ }
+
+ @Test
+ void testEmptyStreamReturnsNull() throws IOException {
+ SentimentSampleTypeFilter filter = new SentimentSampleTypeFilter(
+ new String[] {"positive"}, ObjectStreamUtils.createObjectStream());
+
+ Assertions.assertNull(filter.read());
+ }
+}
diff --git a/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/sentiment/train.txt b/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/sentiment/train.txt
new file mode 100644
index 00000000..9bae16d3
--- /dev/null
+++ b/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/sentiment/train.txt
@@ -0,0 +1,32 @@
+positive I love this movie it is absolutely wonderful and amazing
+positive This product is great and I am very happy with it
+positive What a beautiful day to enjoy the sunshine and fresh air
+positive The food was delicious and the service was excellent
+positive I had a fantastic experience at this restaurant highly recommend
+positive The staff was friendly and helpful made my visit enjoyable
+positive Amazing quality and fast delivery very satisfied customer
+positive Best purchase I have made this year totally worth the price
+positive The concert was incredible and the performers were outstanding
+positive I am thrilled with the results exceeded all my expectations
+positive Wonderful atmosphere and great company made for a perfect evening
+positive The book was captivating and I could not put it down
+positive Exceptional service and attention to detail truly impressive
+positive The hotel room was clean comfortable and had a great view
+positive I love the new design it looks modern and sleek
+positive The team did an outstanding job on this project well done
+negative I hate this product it broke after one day of use
+negative Terrible experience the worst customer service I have ever had
+negative The food was cold and tasteless completely disappointed
+negative What a waste of money this product does not work at all
+negative I am very unhappy with the quality it fell apart quickly
+negative The movie was boring and had no interesting plot whatsoever
+negative Awful service the waiter was rude and ignored us completely
+negative This is the worst hotel I have ever stayed in dirty
+negative The delivery was late and the package arrived damaged
+negative I regret buying this product it is cheaply made garbage
+negative Horrible experience would never recommend this to anyone ever
+negative The book was poorly written and had no coherent storyline
+negative I am extremely frustrated with the lack of support terrible
+negative The restaurant was dirty and the food gave me stomach problems
+negative Worst purchase ever the item does not match the description
+negative The flight was delayed and the airline lost my luggage
diff --git a/opennlp-docs/src/docbkx/introduction.xml b/opennlp-docs/src/docbkx/introduction.xml
index e188baad..e7ac5c7c 100644
--- a/opennlp-docs/src/docbkx/introduction.xml
+++ b/opennlp-docs/src/docbkx/introduction.xml
@@ -28,7 +28,8 @@ under the License.
<para>
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.
It supports the most common NLP tasks, such as tokenization, sentence segmentation,
- part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
+ part-of-speech tagging, named entity extraction, chunking, parsing, coreference resolution,
+ and sentiment analysis.
These tasks are usually required to build more advanced text processing services.
OpenNLP also includes maximum entropy and perceptron based machine learning.
</para>
@@ -45,8 +46,8 @@ under the License.
<para>The Apache OpenNLP library contains several components, enabling one to build
a full natural language processing pipeline. These components include: sentence detector, tokenizer,
- name finder, document categorizer, part-of-speech tagger, chunker, parser,
- coreference resolution. Components contain parts which enable one to execute the
+ name finder, document categorizer, sentiment analyzer, part-of-speech tagger,
+ chunker, parser, coreference resolution. Components contain parts which enable one to execute the
respective natural language processing task, to train a model and often also to evaluate a
model. Each of these facilities is accessible via its application program
interface (API). In addition, a command line interface (CLI) is provided for convenience
diff --git a/opennlp-docs/src/docbkx/opennlp.xml b/opennlp-docs/src/docbkx/opennlp.xml
index fea7437d..c40692f1 100644
--- a/opennlp-docs/src/docbkx/opennlp.xml
+++ b/opennlp-docs/src/docbkx/opennlp.xml
@@ -103,6 +103,7 @@ under the License.
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./tokenizer.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./namefinder.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./doccat.xml" />
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./sentiment.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./postagger.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./lemmatizer.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./chunker.xml" />
diff --git a/opennlp-docs/src/docbkx/sentiment.xml b/opennlp-docs/src/docbkx/sentiment.xml
new file mode 100644
index 00000000..8ebd8319
--- /dev/null
+++ b/opennlp-docs/src/docbkx/sentiment.xml
@@ -0,0 +1,186 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V5.0//EN"
+"https://cdn.docbook.org/schema/5.0/dtd/docbook.dtd"[
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+<chapter xml:id="tools.sentiment" xmlns:xlink="http://www.w3.org/1999/xlink">
+<title>Sentiment Analysis</title>
+
+ <section xml:id="tools.sentiment.classifying">
+ <title>Classifying</title>
+ <para>
+ The OpenNLP Sentiment Analyzer can classify text into sentiment categories such as
+ "positive" or "negative". It is based on the maximum entropy framework.
+ For example, the text below could be classified as <emphasis>positive</emphasis>:
+ <screen>
+<![CDATA[I love this product it is absolutely wonderful and amazing]]>
+ </screen>
+ and the text below could be classified as <emphasis>negative</emphasis>:
+ <screen>
+<![CDATA[Terrible experience the worst customer service I have ever had]]>
+ </screen>
+ To be able to classify text, the sentiment analyzer needs a model. The sentiment
+ categories are requirements-specific and defined by the training data. There are no
+ pre-built models for sentiment analysis under the OpenNLP project.
+ </para>
+
+ <section xml:id="tools.sentiment.classifying.cmdline">
+ <title>Sentiment Analysis Tool</title>
+ <para>
+ The easiest way to try out the sentiment analyzer is the command line tool.
+ The tool is only intended for demonstration and testing. The following command
+ shows how to use the sentiment analysis tool:
+ <screen>
+<![CDATA[$ opennlp Sentiment model]]>
+ </screen>
+ The input is read from standard input and the predicted sentiment is written
+ to standard output.
+ </para>
+ </section>
+
+ <section xml:id="tools.sentiment.classifying.api">
+ <title>Sentiment Analysis API</title>
+ <para>
+ To perform sentiment classification you will need a model encapsulated in the
+ <code>SentimentModel</code> class. First, load the model from an
+ <code>InputStream</code>:
+ <programlisting language="java">
+<![CDATA[InputStream is = ...
+SentimentModel model = new SentimentModel(is);]]>
+ </programlisting>
+ With the <code>SentimentModel</code> in hand, create a
+ <code>SentimentME</code> instance and predict sentiments:
+ <programlisting language="java">
+<![CDATA[SentimentME sentiment = new SentimentME(model);
+
+// Predict from a raw sentence string (tokenized internally)
+String result = sentiment.predict("I love this product");
+
+// Or predict from pre-tokenized input
+String[] tokens = new String[]{"I", "love", "this", "product"};
+String result2 = sentiment.predict(tokens);
+
+// Access the probability distribution over sentiment categories
+double[] probabilities = sentiment.probabilities(tokens);
+String bestSentiment = sentiment.getBestSentiment(probabilities);]]>
+ </programlisting>
+ </para>
+ </section>
+ </section>
+
+ <section xml:id="tools.sentiment.training">
+ <title>Training</title>
+ <para>
+ The Sentiment Analyzer can be trained on annotated training material. The data
+ format is one sample per line, containing the sentiment category followed by the
+ text tokens, all separated by whitespace. The following sample shows the required format:
+ <screen>
+<![CDATA[positive I love this movie it is absolutely wonderful and amazing
+positive This product is great and I am very happy with it
+negative I hate this product it broke after one day of use
+negative Terrible experience the worst customer service I have ever had]]>
+ </screen>
+ </para>
+
+ <section xml:id="tools.sentiment.training.tool">
+ <title>Training Tool</title>
+ <para>
+ The following command will train the sentiment analyzer and write the model
+ to <code>en-sentiment.bin</code>:
+ <screen>
+<![CDATA[$ opennlp SentimentTrainer -model en-sentiment.bin -lang en -data en-sentiment.train -encoding UTF-8]]>
+ </screen>
+ </para>
+ </section>
+
+ <section xml:id="tools.sentiment.training.api">
+ <title>Training API</title>
+ <para>
+ To train a sentiment model programmatically, prepare an
+ <code>ObjectStream</code> of <code>SentimentSample</code> objects and
+ call the <code>SentimentME.train()</code> method:
+ <programlisting language="java">
+<![CDATA[SentimentModel model;
+
+InputStreamFactory dataIn = new MarkableFileInputStreamFactory(
+ new File("en-sentiment.train"));
+
+ObjectStream<String> lineStream =
+ new PlainTextByLineStream(dataIn, StandardCharsets.UTF_8);
+ObjectStream<SentimentSample> sampleStream =
+ new SentimentSampleStream(lineStream);
+
+model = SentimentME.train("eng", sampleStream,
+ TrainingParameters.defaultParams(), new SentimentFactory());]]>
+ </programlisting>
+ Once trained, the model can be serialized for later use:
+ <programlisting language="java">
+<![CDATA[try (OutputStream modelOut = new BufferedOutputStream(
+ new FileOutputStream("en-sentiment.bin"))) {
+ model.serialize(modelOut);
+}]]>
+ </programlisting>
+ </para>
+ </section>
+ </section>
+
+ <section xml:id="tools.sentiment.evaluation">
+ <title>Evaluation</title>
+
+ <section xml:id="tools.sentiment.evaluation.tool">
+ <title>Evaluation Tool</title>
+ <para>
+ The sentiment analyzer can be evaluated against test data using the command line tool:
+ <screen>
+<![CDATA[$ opennlp SentimentEvaluator -model en-sentiment.bin -data en-sentiment.test -encoding UTF-8]]>
+ </screen>
+ This will output precision, recall, and F-measure statistics.
+ </para>
+ </section>
+
+ <section xml:id="tools.sentiment.evaluation.crossvalidation">
+ <title>Cross Validation</title>
+ <para>
+ K-fold cross validation can be performed to evaluate the model without a
+ separate test set:
+ <screen>
+<![CDATA[$ opennlp SentimentCrossValidator -lang en -data en-sentiment.train -encoding UTF-8 -folds 10]]>
+ </screen>
+ </para>
+ </section>
+
+ <section xml:id="tools.sentiment.evaluation.api">
+ <title>Evaluation API</title>
+ <para>
+ The evaluation API allows programmatic evaluation against a set of
+ <code>SentimentSample</code> references:
+ <programlisting language="java">
+<![CDATA[SentimentME sentiment = new SentimentME(model);
+SentimentEvaluator evaluator = new SentimentEvaluator(sentiment);
+evaluator.evaluate(testSampleStream);
+
+System.out.println(evaluator.getFMeasure());]]>
+ </programlisting>
+ </para>
+ </section>
+ </section>
+
+</chapter>
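
Note (illustration only, not part of the commit): the training data format described in sentiment.xml above, one sample per line with the sentiment category first and whitespace-separated tokens after it, can be sketched in plain Java as follows. The class SentimentLineSketch and its ParsedLine holder are hypothetical names used only for this illustration; they are not the SentimentSampleStream or SentimentSample implementations from this change, whose parsing details may differ.

```java
import java.util.Arrays;

// Hypothetical sketch of the "category token token ..." line format.
// Not the parser shipped with the commit.
public class SentimentLineSketch {

  /** Hypothetical holder for one parsed line (not opennlp's SentimentSample). */
  static final class ParsedLine {
    final String sentiment;
    final String[] tokens;

    ParsedLine(String sentiment, String[] tokens) {
      this.sentiment = sentiment;
      this.tokens = tokens;
    }
  }

  /** Splits a line on whitespace: first field is the category, the rest are tokens. */
  static ParsedLine parse(String line) {
    String[] parts = line.trim().split("\\s+");
    if (parts[0].isEmpty()) {
      throw new IllegalArgumentException("Line contains no sentiment category");
    }
    return new ParsedLine(parts[0], Arrays.copyOfRange(parts, 1, parts.length));
  }

  public static void main(String[] args) {
    ParsedLine p = parse("positive I love this movie");
    // prints "positive -> I love this movie"
    System.out.println(p.sentiment + " -> " + String.join(" ", p.tokens));
  }
}
```

A line holding only a category yields an empty token array in this sketch, which mirrors the empty-sentence case exercised in SentimentSampleTest.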