Re: [dev] [libgrapheme] Some questions about libgrapheme

NRK Fri, 02 Sep 2022 13:22:37 -0700

On Fri, Sep 02, 2022 at 02:08:03PM -0300, [email protected] wrote:
> Quite inefficient really, but I guess it's fine since my usage would be
> only user input (left arrow)


If efficiency is not a concern, then you can easily use something like
this (just a quick prototype, didn't verify if it's correct or not):

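        #include <assert.h>
        #include <stddef.h>
        #include <grapheme.h>

        /* returns the offset into `s` of the last character break
         * strictly before `off` (0 if `off` is 0) */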
        static size_t
        prev_char_offset(const char *s, size_t slen, size_t off)
        {
                assert(s != NULL);
                assert(slen > 0);
                assert(off <= slen);
        
                size_t ret = 0;
                const char *const end = s + slen;
                while (s < end) {
                        size_t n = grapheme_next_character_break_utf8(s, end - s);
                        if (ret + n >= off)
                                return ret;
                        ret += n;
                        s += n;
                }
                return 0; /* unreachable (?) */
        }

If I were expecting a decent amount of non-ASCII input, I would use the
bitvector approach described by Thomas Oltmann. An overhead of 1 bit per
byte should be fine for most use cases.
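
A rough sketch of how I read the bitvector idea: keep one bit per byte of
the string, set when a character starts at that byte, so that moving left
is just a backward scan for the previous set bit. The names `mark_breaks`,
`prev_break` and `next_break_fn` are made up for illustration, and
`utf8_step` is only a stand-in that steps by code point so the snippet
builds without the library. With libgrapheme you would pass
grapheme_next_character_break_utf8 instead, which has the same shape as
the call in the prototype above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical: any function returning the byte length of the next character */
typedef size_t (*next_break_fn)(const char *s, size_t len);

/* stand-in segmenter: one UTF-8 code point, NOT a full grapheme cluster */
static size_t
utf8_step(const char *s, size_t len)
{
        size_t n = (len > 0);
        while (n < len && ((unsigned char)s[n] & 0xC0) == 0x80)
                n++;
        return n;
}

/* set bit i of `bits` iff a character starts at byte i of `s`;
 * `bits` must hold at least (slen + 7) / 8 bytes */
static void
mark_breaks(unsigned char *bits, const char *s, size_t slen, next_break_fn next)
{
        memset(bits, 0, (slen + 7) / 8);
        for (size_t off = 0; off < slen; ) {
                bits[off / 8] |= (unsigned char)(1u << (off % 8));
                size_t n = next(s + off, slen - off);
                if (n == 0)
                        break; /* defensive: avoid looping forever */
                off += n;
        }
}

/* offset of the last marked character start strictly before `off`, or 0 */
static size_t
prev_break(const unsigned char *bits, size_t off)
{
        while (off > 0) {
                --off;
                if (bits[off / 8] & (1u << (off % 8)))
                        return off;
        }
        return 0;
}
```

The bit array has to be rebuilt (or at least patched from the edit point
onwards) whenever the string changes, so this trades a rescan per keypress
for a rescan per edit.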

- NRK
