Hello Gary,
I updated the unit test and, I think, removed the guessing part.
This page shows nicely how the family grapheme is composed:
https://utf-8-visualizer.ardis.lu/?q=%F0%9F%91%A8%F0%9F%8F%BB%E2%80%8D%F0%9F%91%A9%F0%9F%8F%BB%E2%80%8D%F0%9F%91%A6%F0%9F%8F%BB%E2%80%8D%F0%9F%91%A6%F0%9F%8F%BB
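For readers who would rather check this locally than follow the link, here is a minimal Java sketch that rebuilds the same sequence from the code points encoded in the URL above (man, woman, boy, boy, each followed by the U+1F3FB skin tone modifier and joined by U+200D). The class name FamilyEmojiComposition is made up for illustration and is not part of the unit test.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the code points are taken from the URL's percent-encoded UTF-8.
class FamilyEmojiComposition {
    static final int[] CODE_POINTS = {
        0x1F468, 0x1F3FB, 0x200D, // man + light skin tone + zero width joiner
        0x1F469, 0x1F3FB, 0x200D, // woman + light skin tone + zero width joiner
        0x1F466, 0x1F3FB, 0x200D, // boy + light skin tone + zero width joiner
        0x1F466, 0x1F3FB          // boy + light skin tone
    };

    static final String FAMILY = new String(CODE_POINTS, 0, CODE_POINTS.length);

    public static void main(String[] args) {
        // One visible glyph, but many units depending on how you count.
        System.out.println("UTF-16 chars: " + FAMILY.length());
        System.out.println("code points:  " + FAMILY.codePointCount(0, FAMILY.length()));
        System.out.println("UTF-8 bytes:  " + FAMILY.getBytes(StandardCharsets.UTF_8).length);
        // List each code point in the order it appears.
        FAMILY.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

The single family glyph occupies 19 UTF-16 chars, 11 code points, and 41 UTF-8 bytes, so any cut made at a char index can land inside it.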
Hi Carsten,
Could you provide a unit test with the expected behavior? The example
you gave has console output and assertions commented out, both of
which are undesirable. Instead of me guessing, I'd rather you manage
expectations and provide a failing/passing set of assertions.
TY!
Gary
On Sun,
I created https://issues.apache.org/jira/browse/LANG-1770 to track this report.
Gary
On Fri, Apr 11, 2025 at 10:15 AM Carsten Kirschner
wrote:
>
> Hello,
>
> The current commons lang3 StringUtils.abbreviate (3.17.0) implementation will
> destroy 4 byte emoji characters and larger grapheme clust
Hello,
The current Commons Lang 3 StringUtils.abbreviate (3.17.0) implementation will
destroy 4-byte emoji characters and larger grapheme clusters. I know that
handling graphemes correctly before Java 20 is not possible, but at least a
code-point-aware solution with String.offsetByCodePoints could
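To illustrate the code-point-aware idea, here is a rough sketch built on String.offsetByCodePoints. It is not the Commons Lang implementation or a proposed patch: the class CodePointAbbreviate and its abbreviate method are hypothetical, and the sketch assumes the maximum width is counted in code points, marker included.

```java
// Hypothetical helper for illustration only; not part of Commons Lang.
class CodePointAbbreviate {

    /**
     * Shortens str to at most maxCodePoints code points (marker included),
     * cutting only on code point boundaries so surrogate pairs stay intact.
     */
    static String abbreviate(String str, String marker, int maxCodePoints) {
        if (str == null) {
            return null;
        }
        int markerLen = marker.codePointCount(0, marker.length());
        if (maxCodePoints <= markerLen) {
            throw new IllegalArgumentException("maxCodePoints must exceed the marker length");
        }
        if (str.codePointCount(0, str.length()) <= maxCodePoints) {
            return str; // already short enough
        }
        // Translate "number of code points to keep" into a char index.
        int end = str.offsetByCodePoints(0, maxCodePoints - markerLen);
        return str.substring(0, end) + marker;
    }

    public static void main(String[] args) {
        String grinning = new String(Character.toChars(0x1F600)); // one code point, two chars
        String input = "ab" + grinning + "cdefgh";

        // A plain char-indexed cut at index 3 ends on a lone high surrogate.
        String broken = input.substring(0, 3);
        System.out.println(Character.isHighSurrogate(broken.charAt(broken.length() - 1)));

        // The code-point-aware cut keeps the emoji whole.
        System.out.println(abbreviate(input, "...", 6));
    }
}
```

This only protects surrogate pairs. A ZWJ sequence such as the family emoji can still be split between its code points, so it is a partial fix rather than grapheme-aware handling.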