This is an automated email from the ASF dual-hosted git repository.
krickert pushed a change to branch OPENNLP-1850-4-docs
in repository https://gitbox.apache.org/repos/asf/opennlp.git
discard 213ab50ef OPENNLP-1850 Document the supplementary-dash offset shift in
the DL fold options
discard 62afd994f OPENNLP-1850 Document the offset-aware substitution folds
(quotes, digits, ellipsis, bullets, umlaut)
discard 9490c141f OPENNLP-1850 Name the OffsetMappingNameFinder capability
interface in the manual
discard 5be4025b2 OPENNLP-1850 Document the offset-aware normalization
pipeline (buildAligned)
discard 8685f233e OPENNLP-1850 Document Unicode normalization, the UAX #29
tokenizer, and DL handling
discard 2006d1d85 OPENNLP-1850 Harden fail-loud paths in the DL components
discard a7488a883 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop
dead label constants
discard a14dcf98d OPENNLP-1850 Resolve overlapping chunk spans and compose the
input alignment
discard 22256c160 OPENNLP-1850 Add OffsetMappingNameFinder capability
interface and a findInOriginal end-to-end test
discard d5319ccaf OPENNLP-1850 Offset-safe, Unicode-aware input normalization
in the DL components
discard 0ec5a3651 OPENNLP-1850 Clarify that Extended_Pictographic symbols are
kept as emoji
discard 3c57a7456 OPENNLP-1850 Address tokenizer review comments
discard 4fda04577 OPENNLP-1850 Address Copilot review on the UAX #29 tokenizer
discard 396573a57 OPENNLP-1850 UAX #29 word tokenizer and the layered Term
model
discard 090593fca OPENNLP-1850 Document the NFC precondition of the German
umlaut fold
discard 1ee5a710f OPENNLP-1850 Apply low-priority review polish to the offset
model
discard bb20b101a OPENNLP-1850 Harden the offset-aware folds from review
feedback
discard fb5edf31f OPENNLP-1850 Make the per-code-point substitution folds
offset-aware
discard cec6989a3 OPENNLP-1850 Offset-aware normalization pipeline
(buildAligned)
discard 9f02a0c63 OPENNLP-1850 Address review nits: generated
serialVersionUIDs, nested TextNormalizer.Builder
discard d55353c13 OPENNLP-1850 Add edge-case tests for the aligned offset API
discard 463f95129 OPENNLP-1850 Report the offending line on malformed
confusables data
discard 7c58c0c7d OPENNLP-1850 Add Alignment offset model; move normalizer
engine to opennlp-api
discard 7b5dfff77 OPENNLP-1850 Address Copilot review on the normalization
foundation
discard 0096292ca OPENNLP-1850 Unicode normalization foundation (CharClass
engine, rungs, Dimension)
add 6963b1cb2 OPENNLP-1851: Fix
DocumentCategorizerDLEval.categorizeFailsLoudlyOnFailure (#1102)
add 48120c7aa OPENNLP-1850 Unicode normalization foundation (CharClass
engine, rungs, Dimension)
add 629a375a4 OPENNLP-1850 Address Copilot review on the normalization
foundation
add 52e0c9061 OPENNLP-1850 Add Alignment offset model; move normalizer
engine to opennlp-api
add 9f23fe8cf OPENNLP-1850 Report the offending line on malformed
confusables data
add 688e50f3b OPENNLP-1850 Add edge-case tests for the aligned offset API
add f2ce18340 OPENNLP-1850 Address review nits: generated
serialVersionUIDs, nested TextNormalizer.Builder
add 85b10b080 OPENNLP-1850 Offset-aware normalization pipeline
(buildAligned)
add 28846d479 OPENNLP-1850 Make the per-code-point substitution folds
offset-aware
add b7cde2c50 OPENNLP-1850 Harden the offset-aware folds from review
feedback
add ac7d7d354 OPENNLP-1850 Apply low-priority review polish to the offset
model
add 9f2622ed9 OPENNLP-1850 Document the NFC precondition of the German
umlaut fold
add fe1e77c7c OPENNLP-1850 UAX #29 word tokenizer and the layered Term
model
add bf37d092f OPENNLP-1850 Address Copilot review on the UAX #29 tokenizer
add 2860117dc OPENNLP-1850 Address tokenizer review comments
add b15005612 OPENNLP-1850 Clarify that Extended_Pictographic symbols are
kept as emoji
add 64630e992 OPENNLP-1850 Offset-safe, Unicode-aware input normalization
in the DL components
add 8fdb00bf1 OPENNLP-1850 Add OffsetMappingNameFinder capability
interface and a findInOriginal end-to-end test
add fd1b4addb OPENNLP-1850 Resolve overlapping chunk spans and compose the
input alignment
add c16f3c227 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop
dead label constants
add fdd329f7d OPENNLP-1850 Harden fail-loud paths in the DL components
add 225c6db41 OPENNLP-1850 Document Unicode normalization, the UAX #29
tokenizer, and DL handling
add b7cd3e669 OPENNLP-1850 Document the offset-aware normalization
pipeline (buildAligned)
add fce6da402 OPENNLP-1850 Name the OffsetMappingNameFinder capability
interface in the manual
add 2798db144 OPENNLP-1850 Document the offset-aware substitution folds
(quotes, digits, ellipsis, bullets, umlaut)
add 8475b41ba OPENNLP-1850 Document the supplementary-dash offset shift in
the DL fold options
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (213ab50ef)
            \
             N -- N -- N   refs/heads/OPENNLP-1850-4-docs (8475b41ba)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
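The relationship between the O and N revisions can be illustrated with a toy model. The sketch below is not how git or the notification hook computes these lists; it assumes a purely linear history (no merges) and uses the hypothetical commit labels A, B, O1..O3 and N1..N3 in place of real hashes. It also does not distinguish "omit" from "discard", since that depends on whether other references still reach a revision.

```python
# Toy commit graph: each commit maps to its single parent (None for the root).
# Labels are hypothetical and mirror the diagram: B is the common base,
# O* are revisions only on the old branch tip, N* only on the new one.
parents = {
    "A": None,
    "B": "A",
    "O1": "B", "O2": "O1", "O3": "O2",
    "N1": "B", "N2": "N1", "N3": "N2",
}


def ancestors(tip):
    """Return the tip and every commit reachable from it, as a set."""
    seen = set()
    while tip is not None:
        seen.add(tip)
        tip = parents[tip]
    return seen


def classify_force_push(old_tip, new_tip):
    """Split revisions into those dropped and those introduced by the push."""
    old, new = ancestors(old_tip), ancestors(new_tip)
    dropped = sorted(old - new)   # reachable only from the old tip
    added = sorted(new - old)     # reachable only from the new tip
    return dropped, added


dropped, added = classify_force_push("O3", "N3")
print("dropped:", dropped)
print("added:", added)
```

Everything reachable from both tips (here A and B) appears in neither list, which is why the follow-up emails only describe the N revisions from the common base onward.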
No new revisions were added by this update.
Summary of changes:
 .../opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java |  2 +-
 .../opennlp/dl/doccat/DocumentCategorizerDLEval.java    | 16 +++++++++++-----
 2 files changed, 12 insertions(+), 6 deletions(-)