Dear fellow hackers,

I'm pleased to announce version 1 of libgrapheme[0][1], a library for Unicode string handling. At this point it allows you to segment char-strings into user-perceived characters (which can be made up of multiple codepoints), e.g. "👨‍👩‍👦 🇺🇸 नी" into "👨‍👩‍👦" (18 bytes), "🇺🇸" (8 bytes) and "नी" (6 bytes).

This allows you to handle text properly in your programs (instead of counting each codepoint as one user-perceived character, which is wrong) without having to rely on bloated libraries like ICU and libunistring.

As could be seen on hackers@, there has been a lot of activity in the last few weeks, but with version 1 there is now a stable release whose API you can rely on not to change.

Take a look at the README and libgrapheme(7) for an overview. Every function manual comes with an example, and the usage should be more or less obvious.

With best regards

Laslo Hunhold

[0]: https://libs.suckless.org/libgrapheme
[1]: https://dl.suckless.org/libgrapheme/libgrapheme-1.tar.gz