Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:
> --- a/libguile/bytevectors.c
> +++ b/libguile/bytevectors.c
> [...]
Boundary conditions can be tricky; I would recommend writing some
tests.
Greetings,
Maxime.
Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:
> + c_start = scm_to_size_t (start);
> + if (SCM_UNLIKELY (c_start >= c_len))
> + {
> + scm_out_of_range (FUNC_NAME, start);
> + }
> +
> + if (!scm_is_eq (end, SCM_UNDEFINED))
> + {
> + c_end =
Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:
> +@deffn {Scheme Procedure} utf8->string utf [start [end]]
> @deffnx {Scheme Procedure} utf16->string utf [endianness]
> @deffnx {Scheme Procedure} utf32->string utf [endianness]
> @deffnx {C Function} scm_utf8_to_string (utf)
> +@deffnx
Vijay Marupudi wrote on Thu 20-01-2022 at 22:23 [-0500]:
> + c_start = scm_to_size_t (start);
This seems suboptimal because if start > SIZE_MAX,
then this will throw an 'out-of-range' exception without attributing
it to 'utf8->string' (untested).
Greetings,
Maxime.
> There seems to be an inconsistency here. Can (c_start >= c_len) be
> relaxed to c_start > c_len?
Done. `substring' was a useful reference.
> It would be nice to document if it's an open, closed or half-
> open/closed range. E.g. see the documentation of 'substring':
Done.
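
To make the boundary cases concrete, here is a minimal sketch in plain
C of the relaxed check, assuming the range follows the same convention
as `substring' (start inclusive, end exclusive). The helper names
range_is_valid and check_range_boundaries are hypothetical and not part
of the patch; the real code works on SCM values and signals errors with
scm_out_of_range instead of returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: validate a half-open
   byte range [start, end) against a buffer of length len.  */
static bool
range_is_valid (size_t start, size_t end, size_t len)
{
  /* start == len is accepted: it denotes the empty range at the very
     end of the buffer, which is why ">" rather than ">=" is the right
     comparison for start.  */
  return start <= end && end <= len;
}

/* Boundary cases for a 9-byte buffer such as the UTF-8 encoding of
   "gnu guile".  */
static void
check_range_boundaries (void)
{
  assert (range_is_valid (0, 9, 9));     /* whole buffer */
  assert (range_is_valid (9, 9, 9));     /* empty range at the end */
  assert (range_is_valid (0, 0, 0));     /* empty buffer */
  assert (!range_is_valid (10, 10, 9));  /* start past the end */
  assert (!range_is_valid (1, 0, 9));    /* end < start */
  assert (!range_is_valid (0, 10, 9));   /* end past the end */
}
```

Under this convention a start equal to the length is not an error; it
just yields the empty string, as it does for `substring'.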
> It seems a bit
Vijay Marupudi wrote on Fri 21-01-2022 at 15:20 [-0500]:
+ (pass-if-exception "utf8->string range: end < start"
+ exception:out-of-range
+ (let* ((utf8 (string->utf8 "gnu guile")))
+ (utf8->string utf8 1 0)))
+ [other tests]
It would be nice to check multibyte characters as well [...]
> It would be nice to check multibyte characters as well,
> to verify that byte indices and not character indices are used.
>
> E.g., (utf8->string #vu8(195 169) 0 2) should return "é".
>
> Another nice test: (utf8->string #vu8(195 169) 0 1) should raise
> a 'decoding-error', even though #vu8(195 1
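
To illustrate why byte indices matter here, the following is a rough
sketch in plain C of a structural UTF-8 check over a half-open byte
range. It is not how Guile decodes UTF-8; the function names are
hypothetical, and the check is deliberately simplified (it looks only
at lead and continuation bytes and does not reject overlong encodings
or surrogates).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence introduced by lead byte b, or
   0 if b cannot start a sequence (for example a continuation byte).  */
static size_t
utf8_sequence_length (unsigned char b)
{
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;
  return 0;
}

/* Hypothetical helper: true if the bytes in [start, end) form only
   complete, structurally well-formed UTF-8 sequences.  The caller is
   assumed to have checked start <= end <= buffer length already.  */
static bool
utf8_range_decodes (const unsigned char *buf, size_t start, size_t end)
{
  size_t i = start;
  while (i < end)
    {
      size_t n = utf8_sequence_length (buf[i]);
      /* Reject invalid lead bytes and sequences cut off by END.  */
      if (n == 0 || n > end - i)
        return false;
      for (size_t k = 1; k < n; k++)
        if ((buf[i + k] & 0xC0) != 0x80)
          return false;
      i += n;
    }
  return true;
}

/* The two cases from the review, using the bytes of "é".  */
static void
check_e_acute (void)
{
  const unsigned char e_acute[] = { 195, 169 };
  assert (utf8_range_decodes (e_acute, 0, 2));   /* both bytes: fine */
  assert (!utf8_range_decodes (e_acute, 0, 1));  /* sequence cut short */
}
```

With character indices, the range 0 to 1 would cover the whole "é";
with byte indices it cuts the two-byte sequence in half, which is what
the suggested 'decoding-error' test exercises.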