This is an automated email from the ASF dual-hosted git repository.
rzo1 pushed a commit to branch OPENNLP-1808
in repository https://gitbox.apache.org/repos/asf/opennlp.git
The following commit(s) were added to refs/heads/OPENNLP-1808 by this push:
new 417969dd OPENNLP-1808: Add SVM-based document categorization via zlibsvm
417969dd is described below
commit 417969dd5fdb11baa4f0d63842fc12f12577b854
Author: Richard Zowalla <[email protected]>
AuthorDate: Thu Mar 19 10:48:33 2026 +0100
OPENNLP-1808: Add SVM-based document categorization via zlibsvm
Introduces opennlp-ml-libsvm, a new ML module providing SVM-based text
classification through the zlibsvm library. The module implements the
DocumentCategorizer interface and includes:
- Configurable term weighting (binary, TF, TF-IDF, log-normalized TF)
- Feature selection (information gain, chi-square, TF, DF)
- Feature scaling with configurable range
- Full SVM parameter control (kernel, cost, gamma, etc.)
- Model serialization/deserialization
- CLI tools (DoccatSVM, DoccatSVMTrainer, DoccatSVMEvaluator)
- 86 unit tests across 8 test classes
- Documentation in doccat.xml, project-structure.xml, and README
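
For orientation, the four term weighting options listed above can be sketched as follows. This is a minimal illustration using common textbook definitions, not the code added by this commit: the class name TermWeightingSketch, the Scheme enum, and the weight method are hypothetical, and the exact formulas used by opennlp-ml-libsvm may differ (for example in IDF smoothing or the log base).

```java
public class TermWeightingSketch {

  // Hypothetical names; they only mirror the options listed in the commit message.
  public enum Scheme { BINARY, TF, LOG_TF, TF_IDF }

  /**
   * Computes one feature weight for a term in a document.
   *
   * termCount: occurrences of the term in the document
   * docLength: total number of terms in the document
   * docFreq:   number of training documents containing the term
   * numDocs:   total number of training documents
   */
  public static double weight(Scheme scheme, int termCount, int docLength,
                              int docFreq, int numDocs) {
    // A term that does not occur gets weight 0 under every scheme.
    if (termCount <= 0 || docLength <= 0) {
      return 0.0;
    }
    double tf = (double) termCount / docLength;
    switch (scheme) {
      case BINARY:
        return 1.0;                              // presence only
      case TF:
        return tf;                               // relative frequency
      case LOG_TF:
        return 1.0 + Math.log(termCount);        // dampened raw count
      case TF_IDF:
        if (docFreq <= 0 || numDocs <= 0) {
          return 0.0;                            // assumption: unseen terms get 0
        }
        return tf * Math.log((double) numDocs / docFreq);
      default:
        throw new IllegalArgumentException("Unknown scheme: " + scheme);
    }
  }

  public static void main(String[] args) {
    // A term occurring 3 times in a 10-term document, present in 2 of 8 documents.
    for (Scheme s : Scheme.values()) {
      System.out.println(s + " = " + weight(s, 3, 10, 2, 8));
    }
  }
}
```

In an SVM pipeline, weights like these would typically be computed per selected feature and then scaled into a fixed range before training.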
---
LICENSE | 32 ++++++++++++++++++++++++++++++++
NOTICE | 13 +++++++++++++
src/license/NOTICE.template | 13 ++++++++++++-
3 files changed, 57 insertions(+), 1 deletion(-)
diff --git a/LICENSE b/LICENSE
index 7fcb36b1..27da2a08 100644
--- a/LICENSE
+++ b/LICENSE
@@ -278,6 +278,38 @@ The following license applies to the ONNX Runtime:
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+The following license applies to libsvm (used via zlibsvm in opennlp-ml-libsvm):
+
+ Copyright (c) 2000-2023 Chih-Chung Chang and Chih-Jen Lin
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ 3. Neither name of copyright holders nor the names of its contributors
+ may be used to endorse or promote products derived from this software
+ without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR
+ CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
The following license applies to the SLF4J API:
MIT license
diff --git a/NOTICE b/NOTICE
index 164db431..e92f5ab5 100644
--- a/NOTICE
+++ b/NOTICE
@@ -39,6 +39,17 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+============================================================================
+
+The SVM-based document categorizer in opennlp-ml-libsvm uses zlibsvm
+(https://github.com/rzo1/zlibsvm), an object-oriented Java binding for
+LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/).
+
+zlibsvm is licensed under the Apache License, Version 2.0.
+LIBSVM is licensed under the BSD 3-Clause License.
+
+Copyright (c) 2000-2023 Chih-Chung Chang and Chih-Jen Lin
+
============================================================================
List of third-party dependencies grouped by their license type.
@@ -51,6 +62,8 @@ List of third-party dependencies grouped by their license type.
* HPPC Collections (com.carrotsearch:hppc:0.7.2 - http://labs.carrotsearch.com/hppc.html/hppc)
* jcommander (com.beust:jcommander:1.78 - https://jcommander.org)
* SLF4J 2 Provider for Log4j API (org.apache.logging.log4j:log4j-slf4j2-impl:2.25.3 - https://logging.apache.org/log4j/2.x/)
+ * zlibsvm API (de.hs-heilbronn.mi:zlibsvm-api:2.1.2 - https://github.com/rzo1/zlibsvm)
+ * zlibsvm Core (de.hs-heilbronn.mi:zlibsvm-core:2.1.2 - https://github.com/rzo1/zlibsvm)
BSD License
diff --git a/src/license/NOTICE.template b/src/license/NOTICE.template
index 29b9cf6c..eb88ad58 100644
--- a/src/license/NOTICE.template
+++ b/src/license/NOTICE.template
@@ -1,5 +1,5 @@
Apache OpenNLP
-Copyright 2021-2025 The Apache Software Foundation
+Copyright 2021-2026 The Apache Software Foundation
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
@@ -39,4 +39,15 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+============================================================================
+
+The SVM-based document categorizer in opennlp-ml-libsvm uses zlibsvm
+(https://github.com/rzo1/zlibsvm), an object-oriented Java binding for
+LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/).
+
+zlibsvm is licensed under the Apache License, Version 2.0.
+LIBSVM is licensed under the BSD 3-Clause License.
+
+Copyright (c) 2000-2023 Chih-Chung Chang and Chih-Jen Lin
+
============================================================================
\ No newline at end of file