Thanks for the input and feedback. Here are the new modules string-desc, xstring-desc, and string-desc-quotearg that I'm adding.
2023-03-28  Bruno Haible  <br...@clisp.org>

	doc: Document string-desc and related modules.
	* doc/string-desc.texi: New file.
	* doc/gnulib.texi (Particular Modules): Include it.

	string-desc-quotearg: Add tests.
	* tests/test-string-desc-quotearg.c: New file.
	* modules/string-desc-quotearg-tests: New file.

	string-desc-quotearg: New module.
	* lib/string-desc-quotearg.h: New file.
	* lib/string-desc-quotearg.c: New file.
	* modules/string-desc-quotearg: New file.

	xstring-desc: Add tests.
	* tests/test-xstring-desc.c: New file.
	* modules/xstring-desc-tests: New file.

	xstring-desc: New module.
	* lib/xstring-desc.h: New file.
	* lib/xstring-desc.c: New file.
	* modules/xstring-desc: New file.

	string-desc: Add tests.
	* tests/test-string-desc.sh: New file.
	* tests/test-string-desc.c: New file.
	* modules/string-desc-tests: New file.

	string-desc: New module.
	* lib/string-desc.h: New file.
	* lib/string-desc.c: New file.
	* lib/string-desc-contains.c: New file.
	* modules/string-desc: New file.
From 93e98eb64e33d1a9d5e562fe61f9eb86a2a4de2e Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:22:17 +0200 Subject: [PATCH 1/7] string-desc: New module. * lib/string-desc.h: New file. * lib/string-desc.c: New file. * lib/string-desc-contains.c: New file. * modules/string-desc: New file. --- ChangeLog | 8 + lib/string-desc-contains.c | 44 +++++ lib/string-desc.c | 358 +++++++++++++++++++++++++++++++++++++ lib/string-desc.h | 229 ++++++++++++++++++++++++ modules/string-desc | 30 ++++ 5 files changed, 669 insertions(+) create mode 100644 lib/string-desc-contains.c create mode 100644 lib/string-desc.c create mode 100644 lib/string-desc.h create mode 100644 modules/string-desc diff --git a/ChangeLog b/ChangeLog index 0688e05bd8..2865fdea7e 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,11 @@ +2023-03-28 Bruno Haible <br...@clisp.org> + + string-desc: New module. + * lib/string-desc.h: New file. + * lib/string-desc.c: New file. + * lib/string-desc-contains.c: New file. + * modules/string-desc: New file. + 2023-03-28 Bruno Haible <br...@clisp.org> doc: Fix placement of memset_explicit node. diff --git a/lib/string-desc-contains.c b/lib/string-desc-contains.c new file mode 100644 index 0000000000..c02617629e --- /dev/null +++ b/lib/string-desc-contains.c @@ -0,0 +1,44 @@ +/* String descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU Lesser General Public License as + published by the Free Software Foundation, either version 3 of the + License, or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#ifdef HAVE_CONFIG_H +# include "config.h" +#endif + +/* Specification. */ +#include "string-desc.h" + +#include <string.h> + + +/* This function is in a separate compilation unit, because not all users + of the 'string-desc' module need this function and it depends on 'memmem' + which — depending on platforms — costs up to 2 KB of binary code. */ + +ptrdiff_t +string_desc_contains (string_desc_t haystack, string_desc_t needle) +{ + if (needle._nbytes == 0) + return 0; + void *found = + memmem (haystack._data, haystack._nbytes, needle._data, needle._nbytes); + if (found != NULL) + return (char *) found - haystack._data; + else + return -1; +} diff --git a/lib/string-desc.c b/lib/string-desc.c new file mode 100644 index 0000000000..2747612bbc --- /dev/null +++ b/lib/string-desc.c @@ -0,0 +1,358 @@ +/* String descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU Lesser General Public License as + published by the Free Software Foundation, either version 3 of the + License, or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#ifdef HAVE_CONFIG_H +# include "config.h" +#endif + +#define GL_STRING_DESC_INLINE _GL_EXTERN_INLINE + +/* Specification and inline definitions. 
*/ +#include "string-desc.h" + +#include <stdarg.h> +#include <stdlib.h> +#include <string.h> + +#include "ialloc.h" +#include "full-write.h" + + +/* ==== Side-effect-free operations on string descriptors ==== */ + +/* Return true if A and B are equal. */ +bool +string_desc_equals (string_desc_t a, string_desc_t b) +{ + return (a._nbytes == b._nbytes + && (a._nbytes == 0 || memcmp (a._data, b._data, a._nbytes) == 0)); +} + +bool +string_desc_startswith (string_desc_t s, string_desc_t prefix) +{ + return (s._nbytes >= prefix._nbytes + && (prefix._nbytes == 0 + || memcmp (s._data, prefix._data, prefix._nbytes) == 0)); +} + +bool +string_desc_endswith (string_desc_t s, string_desc_t suffix) +{ + return (s._nbytes >= suffix._nbytes + && (suffix._nbytes == 0 + || memcmp (s._data + (s._nbytes - suffix._nbytes), suffix._data, + suffix._nbytes) == 0)); +} + +int +string_desc_cmp (string_desc_t a, string_desc_t b) +{ + if (a._nbytes > b._nbytes) + { + if (b._nbytes == 0) + return 1; + return (memcmp (a._data, b._data, b._nbytes) < 0 ? -1 : 1); + } + else if (a._nbytes < b._nbytes) + { + if (a._nbytes == 0) + return -1; + return (memcmp (a._data, b._data, a._nbytes) > 0 ? 
1 : -1); + } + else /* a._nbytes == b._nbytes */ + { + if (a._nbytes == 0) + return 0; + return memcmp (a._data, b._data, a._nbytes); + } +} + +ptrdiff_t +string_desc_index (string_desc_t s, char c) +{ + if (s._nbytes > 0) + { + void *found = memchr (s._data, (unsigned char) c, s._nbytes); + if (found != NULL) + return (char *) found - s._data; + } + return -1; +} + +ptrdiff_t +string_desc_last_index (string_desc_t s, char c) +{ + if (s._nbytes > 0) + { + void *found = memrchr (s._data, (unsigned char) c, s._nbytes); + if (found != NULL) + return (char *) found - s._data; + } + return -1; +} + +string_desc_t +string_desc_new_empty (void) +{ + string_desc_t result; + + result._nbytes = 0; + result._data = NULL; + + return result; + +} + +string_desc_t +string_desc_from_c (const char *s) +{ + string_desc_t result; + + result._nbytes = strlen (s); + result._data = (char *) s; + + return result; +} + +string_desc_t +string_desc_substring (string_desc_t s, idx_t start, idx_t end) +{ + string_desc_t result; + + if (!(start >= 0 && start <= end)) + /* Invalid arguments. */ + abort (); + + result._nbytes = end - start; + result._data = s._data + start; + + return result; +} + +int +string_desc_write (int fd, string_desc_t s) +{ + if (s._nbytes > 0) + if (full_write (fd, s._data, s._nbytes) != s._nbytes) + /* errno is set here. */ + return -1; + return 0; +} + +int +string_desc_fwrite (FILE *fp, string_desc_t s) +{ + if (s._nbytes > 0) + if (fwrite (s._data, 1, s._nbytes, fp) != s._nbytes) + return -1; + return 0; +} + + +/* ==== Memory-allocating operations on string descriptors ==== */ + +int +string_desc_new (string_desc_t *resultp, idx_t n) +{ + string_desc_t result; + + if (!(n >= 0)) + /* Invalid argument. */ + abort (); + + result._nbytes = n; + if (n == 0) + result._data = NULL; + else + { + result._data = (char *) imalloc (n); + if (result._data == NULL) + /* errno is set here. 
*/ + return -1; + } + + *resultp = result; + return 0; +} + +string_desc_t +string_desc_new_addr (idx_t n, char *addr) +{ + string_desc_t result; + + result._nbytes = n; + if (n == 0) + result._data = NULL; + else + result._data = addr; + + return result; +} + +int +string_desc_new_filled (string_desc_t *resultp, idx_t n, char c) +{ + string_desc_t result; + + result._nbytes = n; + if (n == 0) + result._data = NULL; + else + { + result._data = (char *) imalloc (n); + if (result._data == NULL) + /* errno is set here. */ + return -1; + memset (result._data, (unsigned char) c, n); + } + + *resultp = result; + return 0; +} + +int +string_desc_copy (string_desc_t *resultp, string_desc_t s) +{ + string_desc_t result; + idx_t n = s._nbytes; + + result._nbytes = n; + if (n == 0) + result._data = NULL; + else + { + result._data = (char *) imalloc (n); + if (result._data == NULL) + /* errno is set here. */ + return -1; + memcpy (result._data, s._data, n); + } + + *resultp = result; + return 0; +} + +int +string_desc_concat (string_desc_t *resultp, idx_t n, string_desc_t string1, ...) +{ + if (n <= 0) + /* Invalid argument. */ + abort (); + + idx_t total = 0; + total += string1._nbytes; + if (n > 1) + { + va_list other_strings; + idx_t i; + + va_start (other_strings, string1); + for (i = n - 1; i > 0; i--) + { + string_desc_t arg = va_arg (other_strings, string_desc_t); + total += arg._nbytes; + } + va_end (other_strings); + } + + char *combined = (char *) imalloc (total); + if (combined == NULL) + /* errno is set here. 
*/ + return -1; + idx_t pos = 0; + memcpy (combined, string1._data, string1._nbytes); + pos += string1._nbytes; + if (n > 1) + { + va_list other_strings; + idx_t i; + + va_start (other_strings, string1); + for (i = n - 1; i > 0; i--) + { + string_desc_t arg = va_arg (other_strings, string_desc_t); + if (arg._nbytes > 0) + memcpy (combined + pos, arg._data, arg._nbytes); + pos += arg._nbytes; + } + va_end (other_strings); + } + + string_desc_t result; + result._nbytes = total; + result._data = combined; + + *resultp = result; + return 0; +} + +char * +string_desc_c (string_desc_t s) +{ + idx_t n = s._nbytes; + char *result = (char *) imalloc (n + 1); + if (result == NULL) + /* errno is set here. */ + return NULL; + if (n > 0) + memcpy (result, s._data, n); + result[n] = '\0'; + + return result; +} + + +/* ==== Operations with side effects on string descriptors ==== */ + +void +string_desc_set_char_at (string_desc_t s, idx_t i, char c) +{ + if (!(i >= 0 && i < s._nbytes)) + /* Invalid argument. */ + abort (); + s._data[i] = c; +} + +void +string_desc_fill (string_desc_t s, idx_t start, idx_t end, char c) +{ + if (!(start >= 0 && start <= end)) + /* Invalid arguments. */ + abort (); + + if (start < end) + memset (s._data + start, (unsigned char) c, end - start); +} + +void +string_desc_overwrite (string_desc_t s, idx_t start, string_desc_t t) +{ + if (!(start >= 0 && start + t._nbytes <= s._nbytes)) + /* Invalid arguments. */ + abort (); + + if (t._nbytes > 0) + memcpy (s._data + start, t._data, t._nbytes); +} + +void +string_desc_free (string_desc_t s) +{ + free (s._data); +} diff --git a/lib/string-desc.h b/lib/string-desc.h new file mode 100644 index 0000000000..9bd086f689 --- /dev/null +++ b/lib/string-desc.h @@ -0,0 +1,229 @@ +/* String descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. 
+ + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU Lesser General Public License as + published by the Free Software Foundation, either version 3 of the + License, or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#ifndef _STRING_DESC_H +#define _STRING_DESC_H 1 + +/* Get ptrdiff_t. */ +#include <stddef.h> + +/* Get FILE. */ +#include <stdio.h> + +/* Get abort(), free(). */ +#include <stdlib.h> + +/* Get idx_t. */ +#include "idx.h" + + +#ifndef _GL_INLINE_HEADER_BEGIN + #error "Please include config.h first." +#endif +_GL_INLINE_HEADER_BEGIN +#ifndef GL_STRING_DESC_INLINE +# define GL_STRING_DESC_INLINE _GL_INLINE +#endif + +#ifdef __cplusplus +extern "C" { +#endif + + +/* Type describing a string that may contain NUL bytes. + It's merely a descriptor of an array of bytes. */ +typedef struct string_desc_t string_desc_t; +struct string_desc_t +{ + /* The fields of this struct should be considered private. */ + idx_t _nbytes; + char *_data; +}; + +/* String descriptors can be passed and returned by value. + + String descriptors and NUL-terminated 'const char *'/'char *' C strings + cannot be used interchangeably. You will get compilation errors if you + attempt to assign a string descriptor to a C string or vice versa. */ + + +/* ==== Side-effect-free operations on string descriptors ==== */ + +/* Return the length of the string S. */ +#if 0 /* Defined inline below. */ +extern idx_t string_desc_length (string_desc_t s); +#endif + +/* Return the byte at index I of string S. 
+ I must be < length(S). */ +#if 0 /* Defined inline below. */ +extern char string_desc_char_at (string_desc_t s, idx_t i); +#endif + +/* Return a read-only view of the bytes of S. */ +#if 0 /* Defined inline below. */ +extern const char * string_desc_data (string_desc_t s); +#endif + +/* Return true if S is the empty string. */ +#if 0 /* Defined inline below. */ +extern bool string_desc_is_empty (string_desc_t s); +#endif + +/* Return true if A and B are equal. */ +extern bool string_desc_equals (string_desc_t a, string_desc_t b); + +/* Return true if S starts with PREFIX. */ +extern bool string_desc_startswith (string_desc_t s, string_desc_t prefix); + +/* Return true if S ends with SUFFIX. */ +extern bool string_desc_endswith (string_desc_t s, string_desc_t suffix); + +/* Return > 0, == 0, or < 0 if A > B, A == B, A < B. + This uses a lexicographic ordering, where the bytes are compared as + 'unsigned char'. */ +extern int string_desc_cmp (string_desc_t a, string_desc_t b); + +/* Return the index of the first occurrence of C in S, + or -1 if there is none. */ +extern ptrdiff_t string_desc_index (string_desc_t s, char c); + +/* Return the index of the last occurrence of C in S, + or -1 if there is none. */ +extern ptrdiff_t string_desc_last_index (string_desc_t s, char c); + +/* Return the index of the first occurrence of NEEDLE in HAYSTACK, + or -1 if there is none. */ +extern ptrdiff_t string_desc_contains (string_desc_t haystack, string_desc_t needle); + +/* Return an empty string. */ +extern string_desc_t string_desc_new_empty (void); + +/* Return a string that represents the C string S, of length strlen (S). */ +extern string_desc_t string_desc_from_c (const char *s); + +/* Return the substring of S, starting at offset START and ending at offset END. + START must be <= END. + The result is of length END - START. + The result must not be freed (since its storage is part of the storage + of S). 
*/ +extern string_desc_t string_desc_substring (string_desc_t s, idx_t start, idx_t end); + +/* Output S to the file descriptor FD. + Return 0 if successful. + Upon error, return -1 with errno set. */ +extern int string_desc_write (int fd, string_desc_t s); + +/* Output S to the FILE stream FP. + Return 0 if successful. + Upon error, return -1. */ +extern int string_desc_fwrite (FILE *fp, string_desc_t s); + + +/* ==== Memory-allocating operations on string descriptors ==== */ + +/* Construct a string of length N, with uninitialized contents. + Return 0 if successful. + Upon error, return -1 with errno set. */ +_GL_ATTRIBUTE_NODISCARD +extern int string_desc_new (string_desc_t *resultp, idx_t n); + +/* Construct and return a string of length N, at the given memory address. */ +extern string_desc_t string_desc_new_addr (idx_t n, char *addr); + +/* Construct a string of length N, filled with C. + Return 0 if successful. + Upon error, return -1 with errno set. */ +_GL_ATTRIBUTE_NODISCARD +extern int string_desc_new_filled (string_desc_t *resultp, idx_t n, char c); + +/* Construct a copy of string S. + Return 0 if successful. + Upon error, return -1 with errno set. */ +_GL_ATTRIBUTE_NODISCARD +extern int string_desc_copy (string_desc_t *resultp, string_desc_t s); + +/* Construct the concatenation of N strings. N must be > 0. + Return 0 if successful. + Upon error, return -1 with errno set. */ +_GL_ATTRIBUTE_NODISCARD +extern int string_desc_concat (string_desc_t *resultp, idx_t n, string_desc_t string1, ...); + +/* Construct a copy of string S, as a NUL-terminated C string. + Return it if successful. + Upon error, return NULL with errno set. */ +extern char * string_desc_c (string_desc_t s) _GL_ATTRIBUTE_DEALLOC_FREE; + + +/* ==== Operations with side effects on string descriptors ==== */ + +/* Overwrite the byte at index I of string S with C. + I must be < length(S). 
*/ +extern void string_desc_set_char_at (string_desc_t s, idx_t i, char c); + +/* Fill part of S, starting at offset START and ending at offset END, + with copies of C. + START must be <= END. */ +extern void string_desc_fill (string_desc_t s, idx_t start, idx_t end, char c); + +/* Overwrite part of S with T, starting at offset START. + START + length(T) must be <= length (S). */ +extern void string_desc_overwrite (string_desc_t s, idx_t start, string_desc_t t); + +/* Free S. */ +extern void string_desc_free (string_desc_t s); + + +/* ==== Inline function definitions ==== */ + +GL_STRING_DESC_INLINE idx_t +string_desc_length (string_desc_t s) +{ + return s._nbytes; +} + +GL_STRING_DESC_INLINE char +string_desc_char_at (string_desc_t s, idx_t i) +{ + if (!(i >= 0 && i < s._nbytes)) + /* Invalid argument. */ + abort (); + return s._data[i]; +} + +GL_STRING_DESC_INLINE const char * +string_desc_data (string_desc_t s) +{ + return s._data; +} + +GL_STRING_DESC_INLINE bool +string_desc_is_empty (string_desc_t s) +{ + return s._nbytes == 0; +} + + +#ifdef __cplusplus +} +#endif + +_GL_INLINE_HEADER_END + + +#endif /* _STRING_DESC_H */ diff --git a/modules/string-desc b/modules/string-desc new file mode 100644 index 0000000000..044ee266e4 --- /dev/null +++ b/modules/string-desc @@ -0,0 +1,30 @@ +Description: +String descriptors. + +Files: +lib/string-desc.h +lib/string-desc.c +lib/string-desc-contains.c + +Depends-on: +stdbool +idx +ialloc +memchr +memrchr +memmem +full-write + +configure.ac: + +Makefile.am: +lib_SOURCES += string-desc.c string-desc-contains.c + +Include: +"string-desc.h" + +License: +LGPL + +Maintainer: +all -- 2.34.1
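For readers skimming the patch above, the core idea of a string descriptor (a byte count plus a pointer, so that embedded NUL bytes are ordinary data) can be sketched in isolation. The following is a standalone illustration, not the gnulib code: the type sketch_desc and all sketch_* functions are hypothetical names invented here, and the comparison is written differently from string_desc_cmp while aiming for the same ordering (bytewise, with a proper prefix sorting first).

```c
/* Standalone sketch; does NOT use string-desc.h.
   All sketch_* names are hypothetical and exist only for illustration.  */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
  ptrdiff_t nbytes;   /* number of bytes, embedded NULs included */
  const char *data;   /* not required to be NUL-terminated */
} sketch_desc;

/* View a C string; the length stops at the first NUL byte.  */
static sketch_desc
sketch_from_c (const char *s)
{
  sketch_desc d = { (ptrdiff_t) strlen (s), s };
  return d;
}

/* View N bytes at ADDR; NUL bytes inside the range are kept.  */
static sketch_desc
sketch_from_bytes (const char *addr, ptrdiff_t n)
{
  sketch_desc d = { n, n == 0 ? NULL : addr };
  return d;
}

/* Equal lengths and equal bytes; memcmp is skipped for empty strings.  */
static bool
sketch_equals (sketch_desc a, sketch_desc b)
{
  return a.nbytes == b.nbytes
         && (a.nbytes == 0
             || memcmp (a.data, b.data, (size_t) a.nbytes) == 0);
}

/* Lexicographic comparison on bytes; a proper prefix sorts first.  */
static int
sketch_cmp (sketch_desc a, sketch_desc b)
{
  ptrdiff_t common = a.nbytes < b.nbytes ? a.nbytes : b.nbytes;
  int r = common > 0 ? memcmp (a.data, b.data, (size_t) common) : 0;
  if (r != 0)
    return r;
  return (a.nbytes > b.nbytes) - (a.nbytes < b.nbytes);
}

/* Usage: the same literal gives two different strings depending on
   whether the length comes from strlen or is passed explicitly.  */
void
sketch_demo (void)
{
  sketch_desc short_view = sketch_from_c ("a\0b");
  sketch_desc full_view = sketch_from_bytes ("a\0b", 3);
  assert (!sketch_equals (short_view, full_view));
  assert (sketch_cmp (short_view, full_view) < 0);
}
```

The design point this is meant to show: since the length travels with the pointer, taking a substring needs no copy and no terminator, which is presumably why string_desc_substring can return a view into the storage of S.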
>From 7e6e6fc13ae94a8a1449153a6b48e9621dcf45a2 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:23:55 +0200 Subject: [PATCH 2/7] string-desc: Add tests. * tests/test-string-desc.sh: New file. * tests/test-string-desc.c: New file. * modules/string-desc-tests: New file. --- ChangeLog | 5 + modules/string-desc-tests | 12 +++ tests/test-string-desc.c | 186 ++++++++++++++++++++++++++++++++++++++ tests/test-string-desc.sh | 13 +++ 4 files changed, 216 insertions(+) create mode 100644 modules/string-desc-tests create mode 100644 tests/test-string-desc.c create mode 100755 tests/test-string-desc.sh diff --git a/ChangeLog b/ChangeLog index 2865fdea7e..3f621249b8 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,10 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + string-desc: Add tests. + * tests/test-string-desc.sh: New file. + * tests/test-string-desc.c: New file. + * modules/string-desc-tests: New file. + string-desc: New module. * lib/string-desc.h: New file. * lib/string-desc.c: New file. diff --git a/modules/string-desc-tests b/modules/string-desc-tests new file mode 100644 index 0000000000..d7923c3a60 --- /dev/null +++ b/modules/string-desc-tests @@ -0,0 +1,12 @@ +Files: +tests/test-string-desc.sh +tests/test-string-desc.c +tests/macros.h + +Depends-on: + +configure.ac: + +Makefile.am: +TESTS += test-string-desc.sh +check_PROGRAMS += test-string-desc diff --git a/tests/test-string-desc.c b/tests/test-string-desc.c new file mode 100644 index 0000000000..53aeb68743 --- /dev/null +++ b/tests/test-string-desc.c @@ -0,0 +1,186 @@ +/* Test of string descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#include <config.h> + +#include "string-desc.h" + +#include <stdlib.h> +#include <string.h> + +#include "macros.h" + +int +main (void) +{ + string_desc_t s0 = string_desc_new_empty (); + string_desc_t s1 = string_desc_from_c ("Hello world!"); + string_desc_t s2 = string_desc_new_addr (21, "The\0quick\0brown\0\0fox"); + + /* Test string_desc_length. */ + ASSERT (string_desc_length (s0) == 0); + ASSERT (string_desc_length (s1) == 12); + ASSERT (string_desc_length (s2) == 21); + + /* Test string_desc_char_at. */ + ASSERT (string_desc_char_at (s1, 0) == 'H'); + ASSERT (string_desc_char_at (s1, 11) == '!'); + ASSERT (string_desc_char_at (s2, 0) == 'T'); + ASSERT (string_desc_char_at (s2, 1) == 'h'); + ASSERT (string_desc_char_at (s2, 2) == 'e'); + ASSERT (string_desc_char_at (s2, 3) == '\0'); + ASSERT (string_desc_char_at (s2, 4) == 'q'); + ASSERT (string_desc_char_at (s2, 15) == '\0'); + ASSERT (string_desc_char_at (s2, 16) == '\0'); + + /* Test string_desc_data. */ + (void) string_desc_data (s0); + ASSERT (memcmp (string_desc_data (s1), "Hello world!", 12) == 0); + ASSERT (memcmp (string_desc_data (s2), "The\0quick\0brown\0\0fox", 21) == 0); + + /* Test string_desc_is_empty. */ + ASSERT (string_desc_is_empty (s0)); + ASSERT (!string_desc_is_empty (s1)); + ASSERT (!string_desc_is_empty (s2)); + + /* Test string_desc_startswith. 
*/ + ASSERT (string_desc_startswith (s1, s0)); + ASSERT (!string_desc_startswith (s0, s1)); + ASSERT (!string_desc_startswith (s1, s2)); + ASSERT (!string_desc_startswith (s2, s1)); + ASSERT (string_desc_startswith (s2, string_desc_from_c ("The"))); + ASSERT (string_desc_startswith (s2, string_desc_new_addr (9, "The\0quick"))); + ASSERT (!string_desc_startswith (s2, string_desc_new_addr (9, "The\0quirk"))); + + /* Test string_desc_endswith. */ + ASSERT (string_desc_endswith (s1, s0)); + ASSERT (!string_desc_endswith (s0, s1)); + ASSERT (!string_desc_endswith (s1, s2)); + ASSERT (!string_desc_endswith (s2, s1)); + ASSERT (!string_desc_endswith (s2, string_desc_from_c ("fox"))); + ASSERT (string_desc_endswith (s2, string_desc_new_addr (4, "fox"))); + ASSERT (string_desc_endswith (s2, string_desc_new_addr (6, "\0\0fox"))); + ASSERT (!string_desc_endswith (s2, string_desc_new_addr (5, "\0\0ox"))); + + /* Test string_desc_cmp. */ + ASSERT (string_desc_cmp (s0, s0) == 0); + ASSERT (string_desc_cmp (s0, s1) < 0); + ASSERT (string_desc_cmp (s0, s2) < 0); + ASSERT (string_desc_cmp (s1, s0) > 0); + ASSERT (string_desc_cmp (s1, s1) == 0); + ASSERT (string_desc_cmp (s1, s2) < 0); + ASSERT (string_desc_cmp (s2, s0) > 0); + ASSERT (string_desc_cmp (s2, s1) > 0); + ASSERT (string_desc_cmp (s2, s2) == 0); + + /* Test string_desc_index. */ + ASSERT (string_desc_index (s0, 'o') == -1); + ASSERT (string_desc_index (s2, 'o') == 12); + + /* Test string_desc_last_index. */ + ASSERT (string_desc_last_index (s0, 'o') == -1); + ASSERT (string_desc_last_index (s2, 'o') == 18); + + /* Test string_desc_contains. 
*/ + ASSERT (string_desc_contains (s0, string_desc_from_c ("ll")) == -1); + ASSERT (string_desc_contains (s1, string_desc_from_c ("ll")) == 2); + ASSERT (string_desc_contains (s1, string_desc_new_addr (1, "")) == -1); + ASSERT (string_desc_contains (s2, string_desc_new_addr (1, "")) == 3); + ASSERT (string_desc_contains (s1, string_desc_new_addr (2, "\0")) == -1); + ASSERT (string_desc_contains (s2, string_desc_new_addr (2, "\0")) == 15); + + /* Test string_desc_substring. */ + ASSERT (string_desc_cmp (string_desc_substring (s1, 2, 5), + string_desc_from_c ("llo")) == 0); + + /* Test string_desc_write. */ + ASSERT (string_desc_write (3, s0) == 0); + ASSERT (string_desc_write (3, s1) == 0); + ASSERT (string_desc_write (3, s2) == 0); + + /* Test string_desc_fwrite. */ + ASSERT (string_desc_fwrite (stdout, s0) == 0); + ASSERT (string_desc_fwrite (stdout, s1) == 0); + ASSERT (string_desc_fwrite (stdout, s2) == 0); + + /* Test string_desc_new, string_desc_set_char_at, string_desc_fill. */ + string_desc_t s4; + ASSERT (string_desc_new (&s4, 5) == 0); + string_desc_set_char_at (s4, 0, 'H'); + string_desc_set_char_at (s4, 4, 'o'); + string_desc_set_char_at (s4, 1, 'e'); + string_desc_fill (s4, 2, 4, 'l'); + ASSERT (string_desc_length (s4) == 5); + ASSERT (string_desc_startswith (s1, s4)); + + /* Test string_desc_new_filled, string_desc_set_char_at. */ + string_desc_t s5; + ASSERT (string_desc_new_filled (&s5, 5, 'l') == 0); + string_desc_set_char_at (s5, 0, 'H'); + string_desc_set_char_at (s5, 4, 'o'); + string_desc_set_char_at (s5, 1, 'e'); + ASSERT (string_desc_length (s5) == 5); + ASSERT (string_desc_startswith (s1, s5)); + + /* Test string_desc_equals. */ + ASSERT (!string_desc_equals (s1, s5)); + ASSERT (string_desc_equals (s4, s5)); + + /* Test string_desc_copy, string_desc_free. 
*/ + { + string_desc_t s6; + ASSERT (string_desc_copy (&s6, s0) == 0); + ASSERT (string_desc_is_empty (s6)); + string_desc_free (s6); + } + { + string_desc_t s6; + ASSERT (string_desc_copy (&s6, s2) == 0); + ASSERT (string_desc_equals (s6, s2)); + string_desc_free (s6); + } + + /* Test string_desc_overwrite. */ + { + string_desc_t s7; + ASSERT (string_desc_copy (&s7, s2) == 0); + string_desc_overwrite (s7, 4, s1); + ASSERT (string_desc_equals (s7, string_desc_new_addr (21, "The\0Hello world!\0fox"))); + } + + /* Test string_desc_concat. */ + { + string_desc_t s8; + ASSERT (string_desc_concat (&s8, 3, string_desc_new_addr (10, "The\0quick"), + string_desc_new_addr (7, "brown\0"), + string_desc_new_addr (4, "fox"), + string_desc_new_addr (7, "unused")) == 0); + ASSERT (string_desc_equals (s8, s2)); + string_desc_free (s8); + } + + /* Test string_desc_c. */ + { + char *ptr = string_desc_c (s2); + ASSERT (ptr != NULL); + ASSERT (memcmp (ptr, "The\0quick\0brown\0\0fox\0", 22) == 0); + free (ptr); + } + + return 0; +} diff --git a/tests/test-string-desc.sh b/tests/test-string-desc.sh new file mode 100755 index 0000000000..1f52ccf19e --- /dev/null +++ b/tests/test-string-desc.sh @@ -0,0 +1,13 @@ +#!/bin/sh + +./test-string-desc${EXEEXT} > test-string-desc-1.tmp 3> test-string-desc-3.tmp || exit 1 + +printf 'Hello world!The\0quick\0brown\0\0fox\0' > test-string-desc.ok + +: "${DIFF=diff}" +${DIFF} test-string-desc.ok test-string-desc-1.tmp || { echo "string_desc_fwrite KO" 1>&2; exit 1; } +${DIFF} test-string-desc.ok test-string-desc-3.tmp || { echo "string_desc_write KO" 1>&2; exit 1; } + +rm -f test-string-desc-1.tmp test-string-desc-3.tmp + +exit 0 -- 2.34.1
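The lengths and indices asserted in the test above can be checked by hand with plain ISO C, without the module. The sketch below is only such a cross-check under stated assumptions: sketch_bytes, sketch_index and sketch_last_index are hypothetical names, and the backward scan is a hand-written loop standing in for memrchr, which is not part of ISO C.

```c
/* Standalone cross-check of the numbers used in the test;
   does NOT use string-desc.h.  All sketch_* names are hypothetical.  */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bytes as the literal used for s2 in the test: 20 explicit
   bytes plus the terminating NUL that the compiler appends.  */
static const char sketch_bytes[] = "The\0quick\0brown\0\0fox";

/* Index of the first occurrence of C in BUF[0..N-1], or -1.  */
static ptrdiff_t
sketch_index (const char *buf, size_t n, char c)
{
  const char *found = n > 0 ? memchr (buf, (unsigned char) c, n) : NULL;
  return found != NULL ? found - buf : -1;
}

/* Index of the last occurrence of C in BUF[0..N-1], or -1.
   Plain loop instead of memrchr, to stay within ISO C.  */
static ptrdiff_t
sketch_last_index (const char *buf, size_t n, char c)
{
  for (size_t i = n; i > 0; i--)
    if (buf[i - 1] == c)
      return (ptrdiff_t) (i - 1);
  return -1;
}

void
sketch_demo (void)
{
  /* The array holds 21 bytes, but strlen stops at the first NUL.  */
  assert (sizeof sketch_bytes == 21);
  assert (strlen (sketch_bytes) == 3);
  /* 'o' first appears in "brown", last in "fox".  */
  assert (sketch_index (sketch_bytes, sizeof sketch_bytes, 'o') == 12);
  assert (sketch_last_index (sketch_bytes, sizeof sketch_bytes, 'o') == 18);
}
```

This also suggests why the test expects string_desc_endswith (s2, string_desc_from_c ("fox")) to be false: s2 is 21 bytes long and ends with a NUL byte, which a 3-byte "fox" descriptor does not contain.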
>From 40168fbbadc0bde07de8c27612f88640e4cf74e8 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:24:57 +0200 Subject: [PATCH 3/7] xstring-desc: New module. * lib/xstring-desc.h: New file. * lib/xstring-desc.c: New file. * modules/xstring-desc: New file. --- ChangeLog | 5 ++ lib/xstring-desc.c | 74 ++++++++++++++++++++++++++++ lib/xstring-desc.h | 114 +++++++++++++++++++++++++++++++++++++++++++ modules/xstring-desc | 24 +++++++++ 4 files changed, 217 insertions(+) create mode 100644 lib/xstring-desc.c create mode 100644 lib/xstring-desc.h create mode 100644 modules/xstring-desc diff --git a/ChangeLog b/ChangeLog index 3f621249b8..d0e75ebec7 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,10 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + xstring-desc: New module. + * lib/xstring-desc.h: New file. + * lib/xstring-desc.c: New file. + * modules/xstring-desc: New file. + string-desc: Add tests. * tests/test-string-desc.sh: New file. * tests/test-string-desc.c: New file. diff --git a/lib/xstring-desc.c b/lib/xstring-desc.c new file mode 100644 index 0000000000..19c5ab6a00 --- /dev/null +++ b/lib/xstring-desc.c @@ -0,0 +1,74 @@ +/* String descriptors, with out-of-memory checking. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation, either version 3 of the License, + or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. 
*/ + +#include <config.h> + +#define GL_XSTRING_DESC_INLINE _GL_EXTERN_INLINE +#include "xstring-desc.h" + +#include "ialloc.h" + +string_desc_t +xstring_desc_concat (idx_t n, string_desc_t string1, ...) +{ + if (n <= 0) + /* Invalid argument. */ + abort (); + + idx_t total = 0; + total += string1._nbytes; + if (n > 1) + { + va_list other_strings; + idx_t i; + + va_start (other_strings, string1); + for (i = n - 1; i > 0; i--) + { + string_desc_t arg = va_arg (other_strings, string_desc_t); + total += arg._nbytes; + } + va_end (other_strings); + } + + char *combined = (char *) imalloc (total); + if (combined == NULL) + xalloc_die (); + idx_t pos = 0; + memcpy (combined, string1._data, string1._nbytes); + pos += string1._nbytes; + if (n > 1) + { + va_list other_strings; + idx_t i; + + va_start (other_strings, string1); + for (i = n - 1; i > 0; i--) + { + string_desc_t arg = va_arg (other_strings, string_desc_t); + if (arg._nbytes > 0) + memcpy (combined + pos, arg._data, arg._nbytes); + pos += arg._nbytes; + } + va_end (other_strings); + } + + string_desc_t result; + result._nbytes = total; + result._data = combined; + + return result; +} diff --git a/lib/xstring-desc.h b/lib/xstring-desc.h new file mode 100644 index 0000000000..b07831baf4 --- /dev/null +++ b/lib/xstring-desc.h @@ -0,0 +1,114 @@ +/* String descriptors, with out-of-memory checking. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation, either version 3 of the License, + or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#ifndef _XSTRING_DESC_H +#define _XSTRING_DESC_H 1 + +#include <string.h> +#include "string-desc.h" +#include "xalloc.h" + + +#ifndef _GL_INLINE_HEADER_BEGIN + #error "Please include config.h first." +#endif +_GL_INLINE_HEADER_BEGIN +#ifndef GL_XSTRING_DESC_INLINE +# define GL_XSTRING_DESC_INLINE _GL_INLINE +#endif + +#ifdef __cplusplus +extern "C" { +#endif + + +/* ==== Memory-allocating operations on string descriptors ==== */ + +/* Return a string of length N, with uninitialized contents. */ +#if 0 /* Defined inline below. */ +extern string_desc_t xstring_desc_new (idx_t n); +#endif + +/* Return a string of length N, filled with C. */ +#if 0 /* Defined inline below. */ +extern string_desc_t xstring_desc_new_filled (idx_t n, char c); +#endif + +/* Return a copy of string S. */ +#if 0 /* Defined inline below. */ +extern string_desc_t xstring_desc_copy (string_desc_t s); +#endif + +/* Return the concatenation of N strings. N must be > 0. */ +extern string_desc_t xstring_desc_concat (idx_t n, string_desc_t string1, ...); + +/* Construct and return a copy of string S, as a NUL-terminated C string. */ +#if 0 /* Defined inline below. 
*/ +extern char * xstring_desc_c (string_desc_t s) _GL_ATTRIBUTE_DEALLOC_FREE; +#endif + + +/* ==== Inline function definitions ==== */ + +GL_XSTRING_DESC_INLINE string_desc_t +xstring_desc_new (idx_t n) +{ + string_desc_t result; + if (string_desc_new (&result, n) < 0) + xalloc_die (); + return result; +} + +GL_XSTRING_DESC_INLINE string_desc_t +xstring_desc_new_filled (idx_t n, char c) +{ + string_desc_t result; + if (string_desc_new_filled (&result, n, c) < 0) + xalloc_die (); + return result; +} + +GL_XSTRING_DESC_INLINE string_desc_t +xstring_desc_copy (string_desc_t s) +{ + string_desc_t result; + if (string_desc_copy (&result, s) < 0) + xalloc_die (); + return result; +} + +GL_XSTRING_DESC_INLINE +_GL_ATTRIBUTE_DEALLOC_FREE +char * +xstring_desc_c (string_desc_t s) +{ + char *result = string_desc_c (s); + if (result == NULL) + xalloc_die (); + return result; +} + + +#ifdef __cplusplus +} +#endif + +_GL_INLINE_HEADER_END + + +#endif /* _XSTRING_DESC_H */ diff --git a/modules/xstring-desc b/modules/xstring-desc new file mode 100644 index 0000000000..d0cda2d23e --- /dev/null +++ b/modules/xstring-desc @@ -0,0 +1,24 @@ +Description: +String descriptors, with out-of-memory checking. + +Files: +lib/xstring-desc.h +lib/xstring-desc.c + +Depends-on: +string-desc +xalloc-die + +configure.ac: + +Makefile.am: +lib_SOURCES += xstring-desc.c + +Include: +"xstring-desc.h" + +License: +GPL + +Maintainer: +all -- 2.34.1
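A note on xstring_desc_concat above: it walks the variadic arguments twice, once to sum the lengths and once to copy the bytes. The following is only a minimal standalone sketch of that two-pass technique. The names demo_desc and demo_concat are hypothetical stand-ins, not part of Gnulib, and the sketch returns a NULL data pointer on allocation failure instead of calling xalloc_die.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for string_desc_t; not the Gnulib type.  */
struct demo_desc
{
  size_t nbytes;
  const char *data;
};

/* Concatenate N descriptors (N > 0) into freshly malloc'ed memory.
   On allocation failure, return a descriptor whose data is NULL.  */
static struct demo_desc
demo_concat (int n, struct demo_desc first, ...)
{
  assert (n > 0);

  va_list args;
  size_t total = first.nbytes;

  /* Pass 1: sum up the lengths of the remaining N-1 arguments.  */
  va_start (args, first);
  for (int i = 1; i < n; i++)
    total += va_arg (args, struct demo_desc).nbytes;
  va_end (args);

  struct demo_desc result = { 0, NULL };
  /* Ask for at least one byte, so that a NULL return means failure.  */
  char *combined = malloc (total > 0 ? total : 1);
  if (combined == NULL)
    return result;

  /* Pass 2: copy the pieces one after the other.  */
  if (first.nbytes > 0)
    memcpy (combined, first.data, first.nbytes);
  size_t pos = first.nbytes;
  va_start (args, first);
  for (int i = 1; i < n; i++)
    {
      struct demo_desc arg = va_arg (args, struct demo_desc);
      if (arg.nbytes > 0)
        memcpy (combined + pos, arg.data, arg.nbytes);
      pos += arg.nbytes;
    }
  va_end (args);

  result.nbytes = total;
  result.data = combined;
  return result;
}
```

The explicit count N is what makes this work: a variadic function has no other way to know where its argument list ends, so extra arguments beyond N are simply never read.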
From b1ddf91bbbf58d1fc35dd2922353a3f8a7a098df Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:25:50 +0200 Subject: [PATCH 4/7] xstring-desc: Add tests. * tests/test-xstring-desc.c: New file. * modules/xstring-desc-tests: New file. --- ChangeLog | 4 ++ modules/xstring-desc-tests | 12 ++++++ tests/test-xstring-desc.c | 84 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 100 insertions(+) create mode 100644 modules/xstring-desc-tests create mode 100644 tests/test-xstring-desc.c diff --git a/ChangeLog b/ChangeLog index d0e75ebec7..31e5d5d7d9 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,9 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + xstring-desc: Add tests. + * tests/test-xstring-desc.c: New file. + * modules/xstring-desc-tests: New file. + xstring-desc: New module. * lib/xstring-desc.h: New file. * lib/xstring-desc.c: New file. diff --git a/modules/xstring-desc-tests b/modules/xstring-desc-tests new file mode 100644 index 0000000000..7113d54d96 --- /dev/null +++ b/modules/xstring-desc-tests @@ -0,0 +1,12 @@ +Files: +tests/test-xstring-desc.c +tests/macros.h + +Depends-on: + +configure.ac: + +Makefile.am: +TESTS += test-xstring-desc +check_PROGRAMS += test-xstring-desc +test_xstring_desc_LDADD = $(LDADD) @LIBINTL@ diff --git a/tests/test-xstring-desc.c b/tests/test-xstring-desc.c new file mode 100644 index 0000000000..127bb93440 --- /dev/null +++ b/tests/test-xstring-desc.c @@ -0,0 +1,84 @@ +/* Test of string descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#include <config.h> + +#include "xstring-desc.h" + +#include <stdlib.h> +#include <string.h> + +#include "macros.h" + +int +main (void) +{ + string_desc_t s0 = string_desc_new_empty (); + string_desc_t s1 = string_desc_from_c ("Hello world!"); + string_desc_t s2 = string_desc_new_addr (21, "The\0quick\0brown\0\0fox"); + + /* Test xstring_desc_new. */ + string_desc_t s4 = xstring_desc_new (5); + string_desc_set_char_at (s4, 0, 'H'); + string_desc_set_char_at (s4, 4, 'o'); + string_desc_set_char_at (s4, 1, 'e'); + string_desc_fill (s4, 2, 4, 'l'); + ASSERT (string_desc_length (s4) == 5); + ASSERT (string_desc_startswith (s1, s4)); + + /* Test xstring_desc_new_filled. */ + string_desc_t s5 = xstring_desc_new_filled (5, 'l'); + string_desc_set_char_at (s5, 0, 'H'); + string_desc_set_char_at (s5, 4, 'o'); + string_desc_set_char_at (s5, 1, 'e'); + ASSERT (string_desc_length (s5) == 5); + ASSERT (string_desc_startswith (s1, s5)); + + /* Test xstring_desc_copy. */ + { + string_desc_t s6 = xstring_desc_copy (s0); + ASSERT (string_desc_is_empty (s6)); + string_desc_free (s6); + } + { + string_desc_t s6 = xstring_desc_copy (s2); + ASSERT (string_desc_equals (s6, s2)); + string_desc_free (s6); + } + + /* Test xstring_desc_concat. */ + { + string_desc_t s8 = + xstring_desc_concat (3, string_desc_new_addr (10, "The\0quick"), + string_desc_new_addr (7, "brown\0"), + string_desc_new_addr (4, "fox"), + string_desc_new_addr (7, "unused")); + ASSERT (string_desc_equals (s8, s2)); + string_desc_free (s8); + } + + /* Test xstring_desc_c. */ + { + char *ptr = xstring_desc_c (s2); + ASSERT (ptr != NULL); + ASSERT (memcmp (ptr, "The\0quick\0brown\0\0fox\0", 22) == 0); + free (ptr); + } + + return 0; +} -- 2.34.1
From 7d2467a15fde1bd0c0a70756800ed5b266e6d031 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:26:51 +0200 Subject: [PATCH 5/7] string-desc-quotearg: New module. * lib/string-desc-quotearg.h: New file. * lib/string-desc-quotearg.c: New file. * modules/string-desc-quotearg: New file. --- ChangeLog | 5 + lib/string-desc-quotearg.c | 20 ++++ lib/string-desc-quotearg.h | 220 +++++++++++++++++++++++++++++++++++ modules/string-desc-quotearg | 24 ++++ 4 files changed, 269 insertions(+) create mode 100644 lib/string-desc-quotearg.c create mode 100644 lib/string-desc-quotearg.h create mode 100644 modules/string-desc-quotearg diff --git a/ChangeLog b/ChangeLog index 31e5d5d7d9..81788e76d4 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,10 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + string-desc-quotearg: New module. + * lib/string-desc-quotearg.h: New file. + * lib/string-desc-quotearg.c: New file. + * modules/string-desc-quotearg: New file. + xstring-desc: Add tests. * tests/test-xstring-desc.c: New file. * modules/xstring-desc-tests: New file. diff --git a/lib/string-desc-quotearg.c b/lib/string-desc-quotearg.c new file mode 100644 index 0000000000..11554e3369 --- /dev/null +++ b/lib/string-desc-quotearg.c @@ -0,0 +1,20 @@ +/* Quote string descriptors for output. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation, either version 3 of the License, + or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program.
If not, see <https://www.gnu.org/licenses/>. */ + +#include <config.h> + +#define GL_STRING_DESC_QUOTEARG_INLINE _GL_EXTERN_INLINE +#include "string-desc-quotearg.h" diff --git a/lib/string-desc-quotearg.h b/lib/string-desc-quotearg.h new file mode 100644 index 0000000000..b7cb306b89 --- /dev/null +++ b/lib/string-desc-quotearg.h @@ -0,0 +1,220 @@ +/* Quote string descriptors for output. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation, either version 3 of the License, + or (at your option) any later version. + + This file is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#ifndef _STRING_DESC_QUOTEARG_H +#define _STRING_DESC_QUOTEARG_H 1 + +#include "string-desc.h" +#include "quotearg.h" + + +#ifndef _GL_INLINE_HEADER_BEGIN + #error "Please include config.h first." +#endif +_GL_INLINE_HEADER_BEGIN +#ifndef GL_STRING_DESC_QUOTEARG_INLINE +# define GL_STRING_DESC_QUOTEARG_INLINE _GL_INLINE +#endif + +#ifdef __cplusplus +extern "C" { +#endif + + +/* Place into buffer BUFFER (of size BUFFERSIZE) a quoted version of + argument ARG, using O to control quoting. + If O is null, use the default. + Terminate the output with a null character, and return the written + size of the output, not counting the terminating null. + If BUFFERSIZE is too small to store the output string, return the + value that would have been returned had BUFFERSIZE been large enough. 
+ On output, BUFFER might contain embedded null bytes if the style of O + does not use backslash escapes and the flags of O do not request + elision of null bytes. */ +#if 0 +extern size_t string_desc_quotearg_buffer (char *restrict buffer, + size_t buffersize, + string_desc_t arg, + struct quoting_options const *o); +#endif + +/* Like string_desc_quotearg_buffer, except return the result in a newly + allocated buffer and store its length, excluding the terminating null + byte, in *SIZE. It is the caller's responsibility to free the result. + The result might contain embedded null bytes if the style of O does + not use backslash escapes and the flags of O do not request elision + of null bytes. */ +#if 0 +extern char *string_desc_quotearg_alloc (string_desc_t arg, + size_t *size, + struct quoting_options const *o) + _GL_ATTRIBUTE_NONNULL ((2)) + _GL_ATTRIBUTE_MALLOC _GL_ATTRIBUTE_DEALLOC_FREE + _GL_ATTRIBUTE_RETURNS_NONNULL; +#endif + +/* Use storage slot N to return a quoted version of the string ARG. + Use the default quoting options. + The returned value points to static storage that can be + reused by the next call to this function with the same value of N. + N must be nonnegative. */ +#if 0 +extern char *string_desc_quotearg_n (int n, string_desc_t arg); +#endif + +/* Equivalent to string_desc_quotearg_n (0, ARG). */ +#if 0 +extern char *string_desc_quotearg (string_desc_t arg); +#endif + +/* Use style S and storage slot N to return a quoted version of the string ARG. + This is like string_desc_quotearg_n (N, ARG), except that it uses S + with no other options to specify the quoting method. */ +#if 0 +extern char *string_desc_quotearg_n_style (int n, enum quoting_style s, + string_desc_t arg); +#endif + +/* Equivalent to string_desc_quotearg_n_style (0, S, ARG). */ +#if 0 +extern char *string_desc_quotearg_style (enum quoting_style s, + string_desc_t arg); +#endif + +/* Like string_desc_quotearg (ARG), except also quote any instances of CH. 
+ See set_char_quoting for a description of acceptable CH values. */ +#if 0 +extern char *string_desc_quotearg_char (string_desc_t arg, char ch); +#endif + +/* Equivalent to string_desc_quotearg_char (ARG, ':'). */ +#if 0 +extern char *string_desc_quotearg_colon (string_desc_t arg); +#endif + +/* Like string_desc_quotearg_n_style (N, S, ARG) but with S as + custom_quoting_style with left quote as LEFT_QUOTE and right quote + as RIGHT_QUOTE. See set_custom_quoting for a description of acceptable + LEFT_QUOTE and RIGHT_QUOTE values. */ +#if 0 +extern char *string_desc_quotearg_n_custom (int n, + char const *left_quote, + char const *right_quote, + string_desc_t arg); +#endif + +/* Equivalent to + string_desc_quotearg_n_custom (0, LEFT_QUOTE, RIGHT_QUOTE, ARG). */ +#if 0 +extern char *string_desc_quotearg_custom (char const *left_quote, + char const *right_quote, + string_desc_t arg); +#endif + + +/* ==== Inline function definitions ==== */ + +GL_STRING_DESC_QUOTEARG_INLINE size_t +string_desc_quotearg_buffer (char *restrict buffer, size_t buffersize, + string_desc_t arg, + struct quoting_options const *o) +{ + return quotearg_buffer (buffer, buffersize, + string_desc_data (arg), string_desc_length (arg), + o); +} + +GL_STRING_DESC_QUOTEARG_INLINE +_GL_ATTRIBUTE_NONNULL ((2)) +_GL_ATTRIBUTE_MALLOC _GL_ATTRIBUTE_DEALLOC_FREE +_GL_ATTRIBUTE_RETURNS_NONNULL +char * +string_desc_quotearg_alloc (string_desc_t arg, + size_t *size, + struct quoting_options const *o) +{ + return quotearg_alloc_mem (string_desc_data (arg), string_desc_length (arg), + size, + o); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_n (int n, string_desc_t arg) +{ + return quotearg_n_mem (n, string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg (string_desc_t arg) +{ + return quotearg_mem (string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_n_style (int n, 
enum quoting_style s, string_desc_t arg) +{ + return quotearg_n_style_mem (n, s, + string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_style (enum quoting_style s, string_desc_t arg) +{ + return quotearg_style_mem (s, + string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_char (string_desc_t arg, char ch) +{ + return quotearg_char_mem (string_desc_data (arg), string_desc_length (arg), + ch); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_colon (string_desc_t arg) +{ + return quotearg_colon_mem (string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_n_custom (int n, + char const *left_quote, char const *right_quote, + string_desc_t arg) +{ + return quotearg_n_custom_mem (n, left_quote, right_quote, + string_desc_data (arg), string_desc_length (arg)); +} + +GL_STRING_DESC_QUOTEARG_INLINE char * +string_desc_quotearg_custom (char const *left_quote, char const *right_quote, + string_desc_t arg) +{ + return quotearg_custom_mem (left_quote, right_quote, + string_desc_data (arg), string_desc_length (arg)); +} + + +#ifdef __cplusplus +} +#endif + +_GL_INLINE_HEADER_END + + +#endif /* _STRING_DESC_QUOTEARG_H */ diff --git a/modules/string-desc-quotearg b/modules/string-desc-quotearg new file mode 100644 index 0000000000..42472b0774 --- /dev/null +++ b/modules/string-desc-quotearg @@ -0,0 +1,24 @@ +Description: +Quote string descriptors for output. + +Files: +lib/string-desc-quotearg.h +lib/string-desc-quotearg.c + +Depends-on: +string-desc +quotearg + +configure.ac: + +Makefile.am: +lib_SOURCES += string-desc-quotearg.c + +Include: +"string-desc-quotearg.h" + +License: +GPL + +Maintainer: +all -- 2.34.1
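A note on the buffer convention documented for string_desc_quotearg_buffer above: it writes at most BUFFERSIZE bytes and returns the length the full output needs, so a caller can detect truncation and retry with a larger buffer. Here is a minimal standalone sketch of that convention, applied to a deliberately simplistic escaping rule. The function demo_escape_nul is hypothetical and does not reproduce the real quotearg rules, which depend on the quoting style and locale.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Gnulib: copy NBYTES bytes from DATA
   into BUFFER (of size BUFFERSIZE), writing each NUL byte as the two
   characters '\' '0' and each backslash as two backslashes.
   NUL-terminate the output when BUFFERSIZE > 0.  Return the length the
   complete output needs, not counting the terminating NUL, even when
   BUFFERSIZE is too small to hold it.  */
static size_t
demo_escape_nul (char *buffer, size_t buffersize,
                 const char *data, size_t nbytes)
{
  size_t len = 0;

  for (size_t i = 0; i < nbytes; i++)
    {
      char c = data[i];
      if (c == '\0' || c == '\\')
        {
          /* Emit the escape prefix, then the escaped character.  */
          if (len < buffersize)
            buffer[len] = '\\';
          len++;
          c = (c == '\0' ? '0' : '\\');
        }
      if (len < buffersize)
        buffer[len] = c;
      len++;
    }

  /* Terminate, truncating the output if it did not fit.  */
  if (len < buffersize)
    buffer[len] = '\0';
  else if (buffersize > 0)
    buffer[buffersize - 1] = '\0';

  return len;
}
```

A caller compares the return value against BUFFERSIZE: a result greater than or equal to BUFFERSIZE means the output was truncated.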
From 698599bd146f8f6a70efd3b1fe54c3aec441ce31 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:27:37 +0200 Subject: [PATCH 6/7] string-desc-quotearg: Add tests. * tests/test-string-desc-quotearg.c: New file. * modules/string-desc-quotearg-tests: New file. --- ChangeLog | 4 ++ modules/string-desc-quotearg-tests | 12 ++++ tests/test-string-desc-quotearg.c | 100 +++++++++++++++++++++++++++++ 3 files changed, 116 insertions(+) create mode 100644 modules/string-desc-quotearg-tests create mode 100644 tests/test-string-desc-quotearg.c diff --git a/ChangeLog b/ChangeLog index 81788e76d4..7b4ac80f12 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,9 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + string-desc-quotearg: Add tests. + * tests/test-string-desc-quotearg.c: New file. + * modules/string-desc-quotearg-tests: New file. + string-desc-quotearg: New module. * lib/string-desc-quotearg.h: New file. * lib/string-desc-quotearg.c: New file. diff --git a/modules/string-desc-quotearg-tests b/modules/string-desc-quotearg-tests new file mode 100644 index 0000000000..2fa97ee4cb --- /dev/null +++ b/modules/string-desc-quotearg-tests @@ -0,0 +1,12 @@ +Files: +tests/test-string-desc-quotearg.c +tests/macros.h + +Depends-on: + +configure.ac: + +Makefile.am: +TESTS += test-string-desc-quotearg +check_PROGRAMS += test-string-desc-quotearg +test_string_desc_quotearg_LDADD = $(LDADD) @LIBINTL@ $(MBRTOWC_LIB) diff --git a/tests/test-string-desc-quotearg.c b/tests/test-string-desc-quotearg.c new file mode 100644 index 0000000000..0a3c42d35c --- /dev/null +++ b/tests/test-string-desc-quotearg.c @@ -0,0 +1,100 @@ +/* Test of string descriptors. + Copyright (C) 2023 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version.
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +/* Written by Bruno Haible <br...@clisp.org>, 2023. */ + +#include <config.h> + +#include "string-desc-quotearg.h" + +#include <stdlib.h> +#include <string.h> + +#include "macros.h" + +int +main (void) +{ + string_desc_t s1 = string_desc_from_c ("Hello world!"); + string_desc_t s2 = string_desc_new_addr (21, "The\0quick\0brown\0\0fox"); + + /* Test string_desc_quotearg_buffer. */ + { + char buf[80]; + size_t n = string_desc_quotearg_buffer (buf, sizeof (buf), s2, NULL); + ASSERT (n == 21); + ASSERT (memcmp (buf, "The\0quick\0brown\0\0fox", n) == 0); + } + + /* Test string_desc_quotearg_alloc. */ + { + size_t n; + char *ret = string_desc_quotearg_alloc (s2, &n, NULL); + ASSERT (n == 21); + ASSERT (memcmp (ret, "The\0quick\0brown\0\0fox", n) == 0); + free (ret); + } + + /* Test string_desc_quotearg_n. */ + { + char *ret = string_desc_quotearg_n (1, s2); + ASSERT (memcmp (ret, "Thequickbrownfox", 16 + 1) == 0); + } + + /* Test string_desc_quotearg. */ + { + char *ret = string_desc_quotearg (s2); + ASSERT (memcmp (ret, "Thequickbrownfox", 16 + 1) == 0); + } + + /* Test string_desc_quotearg_n_style. */ + { + char *ret = string_desc_quotearg_n_style (1, clocale_quoting_style, s2); + ASSERT (memcmp (ret, "\"The\\0quick\\0brown\\0\\0fox\\0\"", 28 + 1) == 0); + } + + /* Test string_desc_quotearg_style. */ + { + char *ret = string_desc_quotearg_style (clocale_quoting_style, s2); + ASSERT (memcmp (ret, "\"The\\0quick\\0brown\\0\\0fox\\0\"", 28 + 1) == 0); + } + + /* Test string_desc_quotearg_char. 
*/ + { + char *ret = string_desc_quotearg_char (s1, ' '); + ASSERT (memcmp (ret, "Hello world!", 12 + 1) == 0); /* ' ' not quoted?! */ + } + + /* Test string_desc_quotearg_colon. */ + { + char *ret = string_desc_quotearg_colon (string_desc_from_c ("a:b")); + ASSERT (memcmp (ret, "a:b", 3 + 1) == 0); /* ':' not quoted?! */ + } + + /* Test string_desc_quotearg_n_custom. */ + { + char *ret = string_desc_quotearg_n_custom (2, "<", ">", s1); + ASSERT (memcmp (ret, "<Hello world!>", 14 + 1) == 0); + } + + /* Test string_desc_quotearg_custom. */ + { + char *ret = string_desc_quotearg_custom ("[[", "]]", s1); + ASSERT (memcmp (ret, "[[Hello world!]]", 16 + 1) == 0); + } + + return 0; +} -- 2.34.1
From e7be61b04cff735e0e3974103a0c2798b2b123d2 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Wed, 29 Mar 2023 00:31:47 +0200 Subject: [PATCH 7/7] doc: Document string-desc and related modules. * doc/string-desc.texi: New file. * doc/gnulib.texi (Particular Modules): Include it. --- ChangeLog | 4 ++ doc/gnulib.texi | 3 ++ doc/string-desc.texi | 103 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 110 insertions(+) create mode 100644 doc/string-desc.texi diff --git a/ChangeLog b/ChangeLog index 7b4ac80f12..c0cda0b3ba 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,9 @@ 2023-03-28 Bruno Haible <br...@clisp.org> + doc: Document string-desc and related modules. + * doc/string-desc.texi: New file. + * doc/gnulib.texi (Particular Modules): Include it. + string-desc-quotearg: Add tests. * tests/test-string-desc-quotearg.c: New file. * modules/string-desc-quotearg-tests: New file. diff --git a/doc/gnulib.texi b/doc/gnulib.texi index 12766b2d96..3af5cb21b2 100644 --- a/doc/gnulib.texi +++ b/doc/gnulib.texi @@ -6910,6 +6910,7 @@ * static inline:: * extern inline:: * Closed standard fds:: +* Handling strings with NUL characters:: * Container data types:: * String Functions in C Locale:: * Recognizing Option Arguments:: @@ -6949,6 +6950,8 @@ @include xstdopen.texi +@include string-desc.texi + @include containers.texi @include c-locale.texi diff --git a/doc/string-desc.texi b/doc/string-desc.texi new file mode 100644 index 0000000000..5813baf47a --- /dev/null +++ b/doc/string-desc.texi @@ -0,0 +1,103 @@ +@node Handling strings with NUL characters +@section Handling strings with NUL characters + +@c Copyright (C) 2023 Free Software Foundation, Inc. + +@c Permission is granted to copy, distribute and/or modify this document +@c under the terms of the GNU Free Documentation License, Version 1.3 or +@c any later version published by the Free Software Foundation; with no +@c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A +@c copy of the license is at <https://www.gnu.org/licenses/fdl-1.3.en.html>. + +@c Written by Bruno Haible. + +Strings in C are usually represented by a character sequence with a +terminating NUL character. A @samp{char *}, a pointer to the first byte +of this character sequence, is what gets passed around as a function +argument or return value. + +The major restriction of this string representation is that it cannot +handle strings that contain NUL characters: such strings will appear +shorter than they were meant to be. In most application areas, this is +not a problem, and the @code{char *} type is perfectly usable. + +In areas where strings with embedded NUL characters need to be handled, +the common approach is to use a @code{char *ptr} pointer variable +together with a @code{size_t nbytes} variable (or an @code{idx_t nbytes} +variable, if you want to avoid problems due to integer overflow). This +works fine in code that constructs or manipulates strings with embedded +NUL characters. But when it comes to @emph{storing} them, for example +in an array or as a key or value of a hash table, one needs a type that +combines these two fields. + +The Gnulib modules @code{string-desc}, @code{xstring-desc}, and +@code{string-desc-quotearg} provide such a type. We call it a +``string descriptor'' and name it @code{string_desc_t}. + +The type @code{string_desc_t} is a struct that contains a pointer to the +first byte and the number of bytes of the memory region that makes up the +string. An additional terminating NUL byte that may be present in +memory is not included in this byte count. This type implements the +same concept as @code{std::string_view} in C++, or the @code{String} +type in Java. + +A @code{string_desc_t} can be passed to a function as an argument, or +can be the return value of a function.
This is type-safe: if, by +mistake, a programmer passes a @code{string_desc_t} to a function that +expects a @code{char *} argument, or vice versa, or assigns a +@code{string_desc_t} value to a variable of type @code{char *}, or +vice versa, the compiler will report an error. + +The following kinds of functions related to string descriptors are provided: +@itemize +@item +Side-effect-free operations in @code{"string-desc.h"}, +@item +Memory-allocating operations in @code{"string-desc.h"}, +@item +Memory-allocating operations with out-of-memory checking in +@code{"xstring-desc.h"}, +@item +Operations with side effects in @code{"string-desc.h"}. +@end itemize + +For outputting a string descriptor, the @code{*printf} family of +functions cannot be used directly. A format string directive such as +@code{"%.*s"} would not work: +@itemize +@item +it would stop the output at the first encountered NUL character, +@item +it would require casting the number of bytes to @code{int}, and thus +would not work for strings longer than @code{INT_MAX} bytes. +@end itemize +@c @noindent Other format string directives don't work either, because +@c the only way to produce a NUL character in @code{*printf}'s output +@c is through a dedicated @code{%c} or @code{%lc} directive. + +Therefore Gnulib offers +@itemize +@item +a function @code{string_desc_fwrite} that outputs a string descriptor to +a @code{FILE} stream, +@item +a function @code{string_desc_write} that outputs a string descriptor to +a file descriptor, +@item +and for those applications where the NUL characters should become +visible as @samp{\0}, a family of @code{quotearg} based functions that +allow specifying the escaping rules in detail. +@end itemize + +The functionality is thus split across three modules as follows: +@itemize +@item +The module @code{string-desc}, under LGPL, defines the type and +elementary functions. +@item +The module @code{xstring-desc}, under GPL, defines the memory-allocating +functions with out-of-memory checking.
+@item +The module @code{string-desc-quotearg}, under GPL, defines the +@code{quotearg} based functions. +@end itemize -- 2.34.1
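To illustrate the point made in the documentation about "%.*s": the directive stops at the first NUL byte, whereas fwrite outputs the full byte count. The following is a minimal standalone sketch in plain ISO C. The names demo_desc, demo_printf_length and demo_fwrite are hypothetical stand-ins for illustration, not the Gnulib type or functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for string_desc_t; not the Gnulib type.  */
struct demo_desc
{
  size_t nbytes;
  const char *data;
};

/* Return the number of bytes that a "%.*s" directive would output for S.
   The precision only caps the length; output still ends at the first NUL.
   The cast to int is the second limitation the documentation mentions.  */
static int
demo_printf_length (struct demo_desc s)
{
  return snprintf (NULL, 0, "%.*s", (int) s.nbytes, s.data);
}

/* Write all bytes of S to FP, embedded NUL bytes included.
   Return 0 if successful, -1 upon a write error.  */
static int
demo_fwrite (struct demo_desc s, FILE *fp)
{
  if (s.nbytes > 0 && fwrite (s.data, 1, s.nbytes, fp) != s.nbytes)
    return -1;
  return 0;
}
```

For a 9-byte descriptor of "The\0quick", the printf-based length comes out as 3, while the fwrite-based function emits all 9 bytes.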