This is an automated email from the ASF dual-hosted git repository.

krickert pushed a change to branch OPENNLP-1850-3-dl
in repository https://gitbox.apache.org/repos/asf/opennlp.git


 discard 1ea12ea28 OPENNLP-1850 Fully-qualify TokenNameFinder javadoc links in NameFinderDL
 discard 143cdb72d OPENNLP-1850 Fail loud on corrupt document-classification model output
 discard 6558e8bc8 OPENNLP-1850 Fail fast on null finder input; fix the GPU eval test options
 discard 706bd2dd9 OPENNLP-1850 Harden fail-loud paths in the DL components
 discard 280966c73 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop dead label constants
 discard 7127f0650 OPENNLP-1850 Resolve overlapping chunk spans and compose the input alignment
 discard b933a2d97 OPENNLP-1850 Add OffsetMappingNameFinder capability interface and a findInOriginal end-to-end test
 discard bfcbeb5a1 OPENNLP-1850 Offset-safe, Unicode-aware input normalization in the DL components
    omit 7a3c25ac7 OPENNLP-1850 Address review: fail-loud TermAnalyzer default; harden WordBreakProperty
    omit a75f272f9 OPENNLP-1850 Fail fast on null public-entry arguments (review nits)
    omit f70c1956a OPENNLP-1850 Clarify that Extended_Pictographic symbols are kept as emoji
    omit cc89abf52 OPENNLP-1850 Address tokenizer review comments
    omit f48f50f1f OPENNLP-1850 Address Copilot review on the UAX #29 tokenizer
    omit 59043dfea OPENNLP-1850 UAX #29 word tokenizer and the layered Term model
    omit 8f1d947dc OPENNLP-1850 Harden andThen insertion mapping docs/tests; label rung index
    omit 9f2622ed9 OPENNLP-1850 Document the NFC precondition of the German umlaut fold
    omit ac7d7d354 OPENNLP-1850 Apply low-priority review polish to the offset model
    omit b7cde2c50 OPENNLP-1850 Harden the offset-aware folds from review feedback
    omit 28846d479 OPENNLP-1850 Make the per-code-point substitution folds offset-aware
    omit 85b10b080 OPENNLP-1850 Offset-aware normalization pipeline (buildAligned)
    omit f2ce18340 OPENNLP-1850 Address review nits: generated serialVersionUIDs, nested TextNormalizer.Builder
    omit 688e50f3b OPENNLP-1850 Add edge-case tests for the aligned offset API
    omit 9f23fe8cf OPENNLP-1850 Report the offending line on malformed confusables data
    omit 52e0c9061 OPENNLP-1850 Add Alignment offset model; move normalizer engine to opennlp-api
    omit 629a375a4 OPENNLP-1850 Address Copilot review on the normalization foundation
    omit 48120c7aa OPENNLP-1850 Unicode normalization foundation (CharClass engine, rungs, Dimension)
     add bc9c73ff8 OPENNLP-1850 Unicode normalization engine: CharClass, rungs, Dimension, confusables (1a)
     add 08de0d35c OPENNLP-1850 Offset/alignment layer: Alignment, AlignedText, buildAligned, *Aligned (1b)
     add 064d36345 OPENNLP-1850 UAX #29 word tokenizer and the layered Term model
     add 2ecf63796 OPENNLP-1850 Address Copilot review on the UAX #29 tokenizer
     add 724a2544e OPENNLP-1850 Address tokenizer review comments
     add d06489714 OPENNLP-1850 Clarify that Extended_Pictographic symbols are kept as emoji
     add 2da2949c8 OPENNLP-1850 Fail fast on null public-entry arguments (review nits)
     add ab038b41e OPENNLP-1850 Address review: fail-loud TermAnalyzer default; harden WordBreakProperty
     add 0bf7f6c03 OPENNLP-1850 Lazy, recoverable loading for WordBreakProperty and ExtendedPictographic
     add b876e5506 OPENNLP-1850 Offset-safe, Unicode-aware input normalization in the DL components
     add 539780738 OPENNLP-1850 Add OffsetMappingNameFinder capability interface and a findInOriginal end-to-end test
     add e9c0334eb OPENNLP-1850 Resolve overlapping chunk spans and compose the input alignment
     add 1cc2a9789 OPENNLP-1850 Add real-model chunk-boundary eval tests; drop dead label constants
     add 43aa7255f OPENNLP-1850 Harden fail-loud paths in the DL components
     add 318352921 OPENNLP-1850 Fail fast on null finder input; fix the GPU eval test options
     add 2c76b083d OPENNLP-1850 Fail loud on corrupt document-classification model output
     add e7f3c5978 OPENNLP-1850 Fully-qualify TokenNameFinder javadoc links in NameFinderDL

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (1ea12ea28)
            \
             N -- N -- N   refs/heads/OPENNLP-1850-3-dl (e7f3c5978)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.
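
For readers who want to see the B / O / N relationship concretely, the
following is a minimal sketch that reproduces the shape of the diagram in a
throwaway local repository. It is an illustration only, not how this
notification is generated. The commit messages "B", "O1" and "N1", the
temporary directory, and the demo identity are all made up for the example;
only standard git commands are used.

```shell
#!/bin/sh
# Sketch: rebuild the "B -- O" / "B -- N" shape in a scratch repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # hypothetical identity
git config user.name "Demo"

git commit -q --allow-empty -m "B"    # common base
base=$(git rev-parse HEAD)
git commit -q --allow-empty -m "O1"   # old branch tip, about to be replaced
old=$(git rev-parse HEAD)

git reset -q --hard "$base"           # rewind the branch to B
git commit -q --allow-empty -m "N1"   # new tip, as it would look after a --force push
new=$(git rev-parse HEAD)

# B is the merge base of the old and new tips.
test "$(git merge-base "$old" "$new")" = "$base"

# O revisions: reachable from the old tip but not from the new one.
git log --format=%s "$new..$old"
# N revisions: reachable from the new tip but not from the old one.
git log --format=%s "$old..$new"
```

In a real clone the same two range queries can be run with the old and new
tip hashes from this email, provided the old revisions are still present
locally; revisions marked "discard" may no longer be fetchable from the
server.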

No new revisions were added by this update.

Summary of changes:
 .../tools/tokenize/uax29/ExtendedPictographic.java | 41 ++++++++---
 .../tools/tokenize/uax29/WordBreakProperty.java    | 84 +++++++++++++++-------
 .../opennlp/tools/util/normalizer/Confusables.java | 25 ++++++-
 3 files changed, 112 insertions(+), 38 deletions(-)
