Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Vijay Marupudi

Maxime Devos writes:

> Never mind, it seems I misinterpreted a comment and #vu8(97 0 98) is
> valid UTF-8 after all; it's just not possible to encode it as a
> zero-terminated string.

Thanks for the catch on the typo in the docstrings. I've attached the updated versions of the patches that fix

Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Maxime Devos

Maxime Devos wrote on Wed 09-03-2022 at 14:27 [+0100]:

> That's not quite correct, seems like Guile uses another encoding, but
> still.

Never mind, it seems I misinterpreted a comment and #vu8(97 0 98) is valid UTF-8 after all; it's just not possible to encode it as a zero-terminated string.
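
The point about #vu8(97 0 98) can be illustrated in plain C. This is only a sketch, not code from the patch or from libguile; the array and function names are made up for the example. It shows that the same three bytes have length 3 when the length is carried explicitly, as a bytevector does, but length 1 to any API that treats them as a zero-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* The three bytes of #vu8(97 0 98): 'a', NUL, 'b'.
   (Hypothetical name, used only for this illustration.) */
const unsigned char bytes_with_nul[] = { 97, 0, 98 };

/* Length when it is tracked explicitly alongside the data. */
size_t explicit_length (void)
{
  return sizeof bytes_with_nul;
}

/* Length as seen by a zero-terminated-string API: strlen stops at the
   first 0 byte, so everything after the embedded NUL is invisible. */
size_t c_string_length (void)
{
  return strlen ((const char *) bytes_with_nul);
}
```

The bytes themselves are well-formed UTF-8; the information is lost only when the length is inferred from a terminator.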

Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Maxime Devos

Maxime Devos wrote on Wed 09-03-2022 at 14:24 [+0100]:

> This is incorrect, since the nul character is encoded even though
> UTF-8 proper does not allow encoding the nul character -- UTF-8 with an
> encoding of the nul character is sometimes called ‘modified UTF-8’.

That's not quite correct, see

Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Maxime Devos

Vijay Marupudi wrote on Fri 21-01-2022 at 20:21 [-0500]:

> +SCM_DEFINE (scm_utf8_range_to_string, "utf8->string",
> +            1, 2, 0,
> +            (SCM utf, SCM start, SCM end),
> +            "Return a newly allocate string that contains from the UTF-8-"
> +            "encoded contents o

Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Maxime Devos

Vijay Marupudi wrote on Fri 21-01-2022 at 20:21 [-0500]:

> +SCM_DEFINE (scm_utf8_range_to_string, "utf8->string",
> +            1, 2, 0,
> +            (SCM utf, SCM start, SCM end),
> +            "Return a newly allocate string that contains from the UTF-8-"
> +            "encoded contents o

Re: [PATCH] Enable utf8->string to take a range

2022-03-09 Thread Maxime Devos

Vijay Marupudi wrote on Fri 21-01-2022 at 20:21 [-0500]:

> +SCM_DEFINE (scm_utf16_range_to_string, "utf16->string",
> +            1, 3, 0,
> +            (SCM utf, SCM endianness, SCM start, SCM end),
> +            "Return a newly allocate string that contains from the UTF-8-"
> +            "

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Vijay Marupudi

> It would be nice to check multibyte characters as well,
> to verify that byte indices and not character indices are used.
>
> E.g., (utf8->string #vu8(195 169) 0 2) should return "é".
>
> Another nice test: (utf8->string #vu8(195 169) 0 1) should raise
> a 'decoding-error', even though #vu8(195 1
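
To make the byte-index point concrete, here is a minimal sketch in plain C of a well-formedness check over a byte range. It is not Guile's decoder and not part of the patch; `utf8_range_is_valid` is a hypothetical helper written for this illustration. It shows why the range [0, 2) of the bytes 195 169 decodes to one character, while [0, 1) cuts the two-byte sequence in half and has to be rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if bytes[start, end) is well-formed UTF-8.
   start and end are BYTE offsets, not character offsets.
   Hypothetical helper for illustration only. */
bool utf8_range_is_valid (const unsigned char *bytes, size_t start, size_t end)
{
  size_t i = start;

  while (i < end)
    {
      unsigned char b = bytes[i];
      size_t extra;        /* number of continuation bytes expected */
      unsigned long min;   /* smallest code point allowed at this length */
      unsigned long cp;    /* code point being assembled */

      if (b < 0x80)                { extra = 0; min = 0;       cp = b; }
      else if ((b & 0xE0) == 0xC0) { extra = 1; min = 0x80;    cp = b & 0x1F; }
      else if ((b & 0xF0) == 0xE0) { extra = 2; min = 0x800;   cp = b & 0x0F; }
      else if ((b & 0xF8) == 0xF0) { extra = 3; min = 0x10000; cp = b & 0x07; }
      else
        return false;      /* stray continuation byte or invalid lead byte */

      if (extra > end - i - 1)
        return false;      /* sequence is cut off by the end of the range */

      for (size_t k = 1; k <= extra; k++)
        {
          unsigned char c = bytes[i + k];
          if ((c & 0xC0) != 0x80)
            return false;  /* not a continuation byte */
          cp = (cp << 6) | (c & 0x3F);
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;      /* overlong form, out of range, or surrogate */

      i += extra + 1;
    }

  return true;
}
```

With the two bytes 195 169, the range [0, 2) passes and [0, 1) fails because the lead byte announces a continuation byte that lies outside the range. A range starting at byte 1 fails too, since 169 on its own is a continuation byte.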

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Maxime Devos

Vijay Marupudi wrote on Fri 21-01-2022 at 15:20 [-0500]:

> +  (pass-if-exception "utf8->string range: end < start"
> +    exception:out-of-range
> +    (let* ((utf8 (string->utf8 "gnu guile")))
> +      (utf8->string utf8 1 0)))
> + [other tests]

It would be nice to check multibyte characters as wel

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Vijay Marupudi

> There seems to be an inconsistency here. Can (c_start >= c_len) be
> relaxed to c_start > c_len?

Done. `substring' was a useful reference.

> It would be nice to document if it's an open, closed or half-open/closed
> range. E.g. see the documentation of 'substring':

Done.

> It seems a bit
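
The relaxed check discussed here can be sketched as a stand-alone C predicate. This is an illustration under assumptions, not the code in the patch: `range_is_valid` is a hypothetical name, and it assumes the range is half-open, [start, end), in the way `substring' treats its indices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if [start, end) is a valid half-open byte range inside a
   buffer of len bytes.  start == len is accepted and denotes an empty
   range at the very end, which is what relaxing `c_start >= c_len' to
   `c_start > c_len' permits.  Hypothetical helper for illustration. */
bool range_is_valid (size_t len, size_t start, size_t end)
{
  return start <= end && end <= len;
}
```

For the 9-byte bytevector of "gnu guile", the range (9, 9) is accepted as empty, while (1, 0) and (0, 10) are rejected, matching the out-of-range cases the tests in this thread exercise.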

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Maxime Devos

Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:

> +  c_start = scm_to_size_t (start);

This seems suboptimal because if start > SIZE_MAX, then this will throw an 'out-of-range' exception without attributing it to 'utf8->string' (untested).

Greetings,
Maxime.

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Maxime Devos

Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:

> +@deffn {Scheme Procedure} utf8->string utf [start [end]]
>  @deffnx {Scheme Procedure} utf16->string utf [endianness]
>  @deffnx {Scheme Procedure} utf32->string utf [endianness]
>  @deffnx {C Function} scm_utf8_to_string (utf)
> +@deffnx

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Maxime Devos

Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:

> +  c_start = scm_to_size_t (start);
> +  if (SCM_UNLIKELY (c_start >= c_len))
> +    {
> +      scm_out_of_range (FUNC_NAME, start);
> +    }
> +
> +  if (!scm_is_eq (end, SCM_UNDEFINED))
> +    {
> +      c_end =

Re: [PATCH] Enable utf8->string to take a range

2022-01-21 Thread Maxime Devos

Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:

> --- a/libguile/bytevectors.c
> +++ b/libguile/bytevectors.c
> [...]

Boundary conditions can be tricky; I would recommend writing some tests.

Greetings,
Maxime.

