On Fri, Sep 02, 2022 at 02:08:03PM -0300, atrtar...@cock.li wrote:
> Quite inefficient really, but I guess it's fine since my usage would be
> only user input (left arrow)

If efficiency is not a concern, then you can easily use something like
this (just a quick prototype, I haven't verified that it's correct):

	#include <assert.h>
	#include <stddef.h>

	#include <grapheme.h>

	/* returns an offset into `s` */
	static size_t
	prev_char_offset(const char *s, size_t slen, size_t off)
	{
		assert(s != NULL);
		assert(slen > 0);
		assert(off <= slen);

		size_t ret = 0; /* start of the character being examined */
		const char *const end = s + slen;

		while (s < end) {
			size_t n = grapheme_next_character_break_utf8(s, end - s);
			/* this character reaches or covers `off` */
			if (ret + n >= off)
				return ret;
			ret += n;
			s += n;
		}
		return 0; /* unreachable (?) */
	}

If I were expecting a decent amount of non-ASCII input, I would use the
bitvector approach described by Thomas Oltmann. An overhead of 1 bit per
byte should be fine for most use-cases.

- NRK
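
For reference, here is a rough sketch of how the bitvector idea
mentioned above might look. It is only my reading of the approach, not
a copy of Thomas's code, and like the prototype it is not battle-tested.
All names (next_break_fn, next_codepoint_break, build_break_bits,
prev_char_offset_bits) are made up for illustration. The break function
is passed in as a pointer with the same shape as
grapheme_next_character_break_utf8(), so the sketch compiles without
libgrapheme. The stand-in next_codepoint_break() splits on UTF-8 code
points only, NOT on grapheme clusters, and exists purely for testing.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as grapheme_next_character_break_utf8(): returns the
 * length in bytes of the first segment of the buffer. */
typedef size_t (*next_break_fn)(const char *s, size_t len);

/* Hypothetical stand-in, for testing only: splits on UTF-8 code
 * points, NOT on grapheme clusters. */
static size_t
next_codepoint_break(const char *s, size_t len)
{
	size_t n = 1;

	if (len == 0)
		return 0;
	/* swallow continuation bytes (10xxxxxx) */
	while (n < len && ((unsigned char)s[n] & 0xC0) == 0x80)
		n++;
	return n;
}

/* Build a bit vector with one bit per byte of `s`; bit i is set if a
 * character starts at byte offset i. The caller frees the result. */
static unsigned char *
build_break_bits(const char *s, size_t slen, next_break_fn next)
{
	unsigned char *bits = calloc(slen / CHAR_BIT + 1, 1);
	size_t off = 0;

	if (bits == NULL)
		return NULL;
	while (off < slen) {
		size_t n = next(s + off, slen - off);
		if (n == 0)
			break; /* defensive: never loop forever */
		bits[off / CHAR_BIT] |= (unsigned char)(1u << (off % CHAR_BIT));
		off += n;
	}
	return bits;
}

/* Walk backwards from `off` to the nearest earlier set bit. Returns 0
 * if there is no earlier character start. */
static size_t
prev_char_offset_bits(const unsigned char *bits, size_t off)
{
	while (off > 0) {
		off--;
		if (bits[off / CHAR_BIT] & (1u << (off % CHAR_BIT)))
			return off;
	}
	return 0;
}
```

With libgrapheme you would pass grapheme_next_character_break_utf8 as
the `next` argument. The vector is built once per edit of the buffer,
and a left-arrow press then costs only a short backwards scan over bits
instead of a rescan from the start of the string.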