This is an automated email from the ASF dual-hosted git repository.
krickert pushed a change to branch OPENNLP-1850-1-foundation
in repository https://gitbox.apache.org/repos/asf/opennlp.git
from 9f02a0c63 OPENNLP-1850 Address review nits: generated serialVersionUIDs, nested TextNormalizer.Builder
add cec6989a3 OPENNLP-1850 Offset-aware normalization pipeline (buildAligned)
No new revisions were added by this update.
Summary of changes:
 .../util/normalizer/OffsetAwareNormalizer.java     |  49 +++++
 .../AlignedAggregateCharSequenceNormalizer.java    |  62 ++++++
 .../normalizer/DashCharSequenceNormalizer.java     |   7 +-
 .../InvisibleCharSequenceNormalizer.java           |   7 +-
 ...PreservingWhitespaceCharSequenceNormalizer.java |  72 +++++++
 .../tools/util/normalizer/TextNormalizer.java      |  36 ++++
 .../WhitespaceCharSequenceNormalizer.java          |  10 +-
 .../normalizer/AlignedNormalizerPipelineTest.java  | 239 +++++++++++++++++++++
 8 files changed, 479 insertions(+), 3 deletions(-)
 create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/OffsetAwareNormalizer.java
 create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/AlignedAggregateCharSequenceNormalizer.java
 create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/LineBreakPreservingWhitespaceCharSequenceNormalizer.java
 create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AlignedNormalizerPipelineTest.java