On 10/04/25 11:24 +0200, Tomasz Kamiński wrote:
This patch implements the part of P2286R8 that specifies the debug (escaped)
format for strings and character sequences. This includes handling of both
the '?' format specifier and the set_debug_format member.

To indicate partial support we define the __glibcxx_format_ranges macro
with value 1, without defining __cpp_lib_format_ranges.

We provide two separate escaping routines, selected by the literal
encoding of the corresponding character type. If the literal
encoding is Unicode, we follow the specification in the standard
(__format::__write_escaped_unicode).
For other encodings, we escape only characters in range [0x00, 0x80),
interpreting them as ACII values: [0x00, 0x20), 0x7f and  '\t', '\r',

"ASCII"

'\n', '\\', '"', '\'' are escaped. We assume every character outside
this range is printable (__format::__write_escaped_ascii).
In particular we do not yet implement special handling of shift
sequences.
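
As a rough illustration of the non-Unicode rule described above, here is a
minimal standalone sketch for narrow strings. The name escape_ascii_sketch
is hypothetical and this is not the code from the patch; it only mirrors
the rule that values in [0x00, 0x20), 0x7f and the listed special
characters are escaped while everything from 0x80 up is passed through.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not the libstdc++ implementation: produce the
// double-quoted debug form of a narrow string in a non-Unicode encoding.
std::string escape_ascii_sketch(std::string_view in)
{
  std::string out = "\"";
  for (unsigned char c : in)
    switch (c)
      {
      case '\t': out += "\\t"; break;
      case '\n': out += "\\n"; break;
      case '\r': out += "\\r"; break;
      case '\\': out += "\\\\"; break;
      case '"':  out += "\\\""; break; // the terminator of a string is '"'
      default:
        if (c < 0x20 || c == 0x7f)
          {
            // Remaining ASCII control characters become \u{hex}.
            char buf[16];
            int n = std::snprintf(buf, sizeof buf, "\\u{%x}",
                                  static_cast<unsigned>(c));
            assert(n > 0 && n < static_cast<int>(sizeof buf));
            (void)n;
            out += buf;
          }
        else
          out += static_cast<char>(c); // includes every value >= 0x80
      }
  out += '"';
  return out;
}
```

The apostrophe is left alone here because the sketch only handles the
string case; for a single character the roles of '"' and '\'' swap.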

For Unicode escaping a new __escape_edges table is introduced, which
encodes whether a character belongs to a General_Category that the
standard escapes (Control or Other). This table is generated from
DerivedGeneralCategory.txt provided by Unicode.
Only boolean flag is preserved to reduce the number of entires.

"entries"

The additional rules for escaping are handled by __should_escape_unicode.
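
The edge-table lookup can be sketched as follows. The table contents are
toy values chosen for illustration (they are not the generated
__escape_edges data), and toy_should_escape is a hypothetical name. Each
entry packs the first code point of a run in the upper bits and the
boolean flag in the lowest bit, so a binary search for the first edge
past the code point gives the run that contains it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Toy table, hypothetical values: (first code point of a run << 1) | flag.
// As a simplification, U+0020 is marked printable directly in the table.
constexpr std::uint32_t toy_edges[] = {
  (0x00u << 1) | 1, // U+0000..U+001F: controls, escaped
  (0x20u << 1) | 0, // U+0020..U+007E: printable
  (0x7fu << 1) | 1, // U+007F..U+00A0: controls and no-break space, escaped
  (0xa1u << 1) | 0, // U+00A1..U+00AC: printable
  (0xadu << 1) | 1, // U+00AD: soft hyphen, escaped
  (0xaeu << 1) | 0, // U+00AE and up: printable in this toy table
};

// @pre c <= 0x10FFFF
inline bool toy_should_escape(char32_t c)
{
  assert(c <= 0x10FFFF);
  // Every edge starting at or before c is <= (c << 1) + 1, so searching
  // for (c << 1) + 2 finds the first edge that starts after c.
  const std::uint32_t key = (static_cast<std::uint32_t>(c) << 1) + 2;
  const auto* p = std::lower_bound(std::begin(toy_edges),
                                   std::end(toy_edges), key);
  // The entry before it is the run containing c; its low bit is the flag.
  return (p[-1] & 1u) != 0;
}
```

Because the first edge starts at code point 0, the search never returns
the beginning of the table, so reading p[-1] is always valid.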

When a width or precision is specified, we write the escaped string to a
temporary buffer and then format the result according to the format spec.
For characters we use a fixed-size stack buffer, for which a new
_Fixedbuf_sink is introduced. For strings we use _Str_sink; to avoid
allocations, we first compute the estimated width of the (possibly
truncated) input, and if it is not smaller than the width field (and no
precision is given) we print directly.
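
A simplified sketch of that fast path, under the assumptions that every
char occupies one column, no precision is given, and the field is
left-aligned with space fill. Both function names are hypothetical, and
the escaping routine is a reduced stand-in that handles only tab,
backslash and double quote.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the real escaping routine.
void write_escaped_sketch(std::string& out, std::string_view in)
{
  out += '"';
  for (char c : in)
    if (c == '\t') out += "\\t";
    else if (c == '\\') out += "\\\\";
    else if (c == '"') out += "\\\"";
    else out += c;
  out += '"';
}

// Append the escaped form of `in`, padded on the right to `width` columns.
void format_escaped_sketch(std::string& out, std::string_view in,
                           std::size_t width)
{
  // Escaping only increases width, so if the raw input already fills the
  // field no padding can be needed: write straight to the output.
  if (width <= in.size())
    {
      write_escaped_sketch(out, in);
      return;
    }

  std::string tmp; // temporary buffer, playing the role of _Str_sink
  write_escaped_sketch(tmp, in);
  assert(tmp.size() >= in.size());
  out += tmp;
  if (tmp.size() < width)
    out.append(width - tmp.size(), ' ');
}
```

Only the second path allocates, which is why the cheap width estimate is
checked first.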

        PR libstdc++/109162

contrib/ChangeLog:

        * unicode/README: Mention DerivedGeneralCategory.txt.
        * unicode/gen_libstdcxx_unicode_data.py: Generate __escape_edges
        table from DerivedGeneralCategory.txt. Update file name in comments.
        * unicode/DerivedGeneralCategory.txt: Copy of file distrubuted by

"distributed"

        Unicode Consortium.
        
ftp://ftp.unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt.


I still don't think we want the URL in the ChangeLog.

libstdc++-v3/ChangeLog:

        * include/bits/chrono_io.h (__detail::_Widen): Moved to std/format file.
        * include/bits/unicode-data.h: Regenerate.
        * include/bits/unicode.h (__unicode::_Utf_iterator::_M_units)
        (__unicode::__should_escape_category): Define.

What happened to the changes to bits/version.def and bits/version.h ?

I thought you were going to change version.def to use no_stdname but
now it's not in the patch at all.

        * include/std/format (_GLIBCXX_WIDEN_, _GLIBCXX_WIDEN): Copied from
        include/bits/chrono_io.h.
        (__format::_Widen): Moved from include/bits/chrono_io.h.
        (__format::_Term_char, __format::_Escapes, __format::_Separators)
        (__format::__should_escape_ascii, __format::__should_escape_unicode)
        (__format::__write_escape_seq, __format::__write_escaped_char)
        (__format::__write_escaped_ascii, __format::__write_escaped_unicode)
        (__format::__write_escaped): Define.
        (__formatter_str::_S_format): Extracted truncation of character
        sequences.
        (__formatter_str::format): Handle _Pres_esc.
        (__formatter_int::_M_do_parse) [__glibcxx_format_ranges]: Parse '?'.
        (__formatter_int::_M_format_character_escaped): Define.
        (formatter<_CharT, _CharT>::format, formatter<char, wchar_t>::format):
        Handle _Pres_esc.
        (__formatter_str::set_debug_format, formatter<...>::set_debug_format):
        Guard with __glibcxx_format_ranges.
        (__format::_Fixedbuf_sink): Define.
        * testsuite/std/format/debug.cc: New test.
        * testsuite/std/format/debug_nounicode.cc: New test.
        * testsuite/std/format/parse_ctx.cc (escaped_strings_supported): Define
        to true if __glibcxx_format_ranges is defined.
        * testsuite/std/format/string.cc (escaped_strings_supported): Define to
         true if __glibcxx_format_ranges is defined.
---
I believe this addresses all review suggestions. I have also followed
Patrick's suggestion and added the debug_nounicode.cc test file, which
helped to surface a problem where _GLIBCXX_WIDEN("\uFFFD") was
ill-formed for non-Unicode encodings.

contrib/unicode/DerivedGeneralCategory.txt    | 4323 +++++++++++++++++
contrib/unicode/README                        |    3 +-
contrib/unicode/gen_libstdcxx_unicode_data.py |   47 +-
libstdc++-v3/include/bits/chrono_io.h         |   16 +-
libstdc++-v3/include/bits/unicode-data.h      |  260 +-
libstdc++-v3/include/bits/unicode.h           |   17 +
libstdc++-v3/include/std/format               |  492 +-
.../23_containers/vector/bool/format.cc       |    3 +-
libstdc++-v3/testsuite/std/format/debug.cc    |  454 ++
.../testsuite/std/format/debug_nounicode.cc   |    5 +
.../testsuite/std/format/parse_ctx.cc         |    2 +-
libstdc++-v3/testsuite/std/format/string.cc   |    2 +-
12 files changed, 5538 insertions(+), 86 deletions(-)
create mode 100644 contrib/unicode/DerivedGeneralCategory.txt
create mode 100644 libstdc++-v3/testsuite/std/format/debug.cc
create mode 100644 libstdc++-v3/testsuite/std/format/debug_nounicode.cc

[snip]

diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h
index 99d972eccff..f1b6bf49c54 100644
--- a/libstdc++-v3/include/bits/unicode.h
+++ b/libstdc++-v3/include/bits/unicode.h
@@ -150,6 +150,11 @@ namespace __unicode
      base() const requires forward_iterator<_Iter>
      { return _M_curr(); }

+      [[nodiscard]]
+      constexpr iter_difference_t<_Iter>
+      _M_units() const requires forward_iterator<_Iter>
+      { return _M_to_increment; }
+
      [[nodiscard]]
      constexpr value_type
      operator*() const { return _M_buf[_M_buf_index]; }
@@ -609,6 +614,18 @@ inline namespace __v16_0_0
    return (__p - __width_edges) % 2 + 1;
  }

+  // @pre c <= 0x10FFFF
+  constexpr bool
+  __should_escape_category(char32_t __c) noexcept
+  {
+    constexpr uint32_t __mask = 0x01;
+    auto* __end = std::end(__escape_edges);
+    auto* __p = std::lower_bound(__escape_edges, __end,
+                                (__c << 1u) + 2);
+    return __p[-1] & __mask;
+  }
+
+
  // @pre c <= 0x10FFFF
  constexpr _Gcb_property
  __grapheme_cluster_break_property(char32_t __c) noexcept
diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 2e9319cdda6..b905d8c012d 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -80,8 +80,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
/// @cond undocumented
namespace __format
{
-  // Type-erased character sink.
+  // STATICALLY-WIDEN, see C++20 [time.general]
+  // It doesn't matter for format strings (which can only be char or wchar_t)
+  // but this returns the narrow string for anything that isn't wchar_t. This
+  // is done because const char* can be inserted into any ostream type, and
+  // will be widened at runtime if necessary.
+  template<typename _CharT>
+    consteval auto
+    _Widen(const char* __narrow, const wchar_t* __wide)
+    {
+      if constexpr (is_same_v<_CharT, wchar_t>)
+       return __wide;
+      else
+       return __narrow;
+    }
+#define _GLIBCXX_WIDEN_(C, S) ::std::__format::_Widen<C>(S, L##S)
+#define _GLIBCXX_WIDEN(S) _GLIBCXX_WIDEN_(_CharT, S)
+
+  // Type-erased character sinks.
  template<typename _CharT> class _Sink;
+  template<typename _CharT> class _Fixedbuf_sink;
+  template<typename _Seq> class _Seq_sink;
+
+  template<typename _CharT, typename _Alloc = allocator<_CharT>>
+    using _Str_sink
+      = _Seq_sink<basic_string<_CharT, char_traits<_CharT>, _Alloc>>;
+
+  // template<typename _CharT, typename _Alloc = allocator<_CharT>>
+  // using _Vec_sink = _Seq_sink<vector<_CharT, _Alloc>>;
+
  // Output iterator that writes to a type-erase character sink.
  template<typename _CharT>
    class _Sink_iter;
@@ -848,6 +875,286 @@ namespace __format
                                      __spec._M_fill);
    }

+  // Valus are indicies into _Escapes::all.

"Values" and "indices"

+  enum class _Term_char : unsigned char {
+    _Tc_quote = 12,
+    _Tc_apos = 15
+  };
+
+  template<typename _CharT>
+    struct _Escapes
+    {
+      using _Str_view = basic_string_view<_CharT>;
+
+      static consteval
+      _Str_view _S_all()
+      { return _GLIBCXX_WIDEN("\t\\t\n\\n\r\\r\\\\\\\"\\\"'\\'\\u\\x"); }
+
+      static constexpr
+      _CharT _S_term(_Term_char __term)
+      { return _S_all()[static_cast<unsigned char>(__term)]; }
+
+      static consteval
+      _Str_view _S_tab()
+      { return _S_all().substr(0, 3); }
+
+      static consteval
+      _Str_view _S_newline()
+      { return _S_all().substr(3, 3); }
+
+      static consteval
+      _Str_view _S_return()
+      { return _S_all().substr(6, 3); }
+
+      static consteval
+      _Str_view _S_bslash()
+      { return _S_all().substr(9, 3); }
+
+      static consteval
+      _Str_view _S_quote()
+      { return _S_all().substr(12, 3); }
+
+      static consteval
+      _Str_view _S_apos()
+      { return _S_all().substr(15, 3); }
+
+      static consteval
+      _Str_view _S_u()
+      { return _S_all().substr(18, 2); }
+
+      static consteval
+      _Str_view _S_x()
+      { return _S_all().substr(20, 2); }
+    };
+
+  template<typename _CharT>
+    struct _Separators
+    {
+      using _Str_view = basic_string_view<_CharT>;
+
+      static consteval
+      _Str_view _S_all()
+      { return _GLIBCXX_WIDEN("{}"); }
+
+      static consteval
+      _Str_view _S_braces()
+      { return _S_all().substr(0, 2); }
+    };
+
+  template<typename _CharT>
+    constexpr bool __should_escape_ascii(_CharT __c, _Term_char __term)
+    {
+      using _Esc = _Escapes<_CharT>;
+      switch (__c)
+       {
+         case _Esc::_S_tab()[0]:
+         case _Esc::_S_newline()[0]:
+         case _Esc::_S_return()[0]:
+         case _Esc::_S_bslash()[0]:
+           return true;
+         case _Esc::_S_quote()[0]:
+           return __term == _Term_char::_Tc_quote;
+         case _Esc::_S_apos()[0]:
+           return __term == _Term_char::_Tc_apos;
+         default:
+           return (__c >= 0 && __c < 0x20) || __c == 0x7f;
+       };
+  }
+
+  // @pre __c <= 0x10FFFF
+  constexpr bool __should_escape_unicode(char32_t __c, bool __prev_esc)
+  {
+    if (__unicode::__should_escape_category(__c))
+      return __c != U' ';
+    if (!__prev_esc)
+      return false;
+    return __unicode::__grapheme_cluster_break_property(__c)
+            == __unicode::_Gcb_property::_Gcb_Extend;
+  }
+
+  using uint_least32_t = __UINT_LEAST32_TYPE__;
+  template<typename _Out, typename _CharT>
+    _Out
+    __write_escape_seq(_Out __out, uint_least32_t __val,
+                      basic_string_view<_CharT> __prefix)
+    {
+      using _Str_view = basic_string_view<_CharT>;
+      constexpr size_t __max = 8;
+      char __buf[__max];
+      const string_view __narrow(
+       __buf,
+       std::__to_chars_i<uint_least32_t>(__buf, __buf + __max, __val, 16).ptr);
+
+      __out = __format::__write(__out, __prefix);
+      *__out = _Separators<_CharT>::_S_braces()[0];
+      ++__out;
+      if constexpr (is_same_v<char, _CharT>)
+       __out = __format::__write(__out, __narrow);
+#ifdef _GLIBCXX_USE_WCHAR_T
+      else
+       {
+         _CharT __wbuf[__max];
+         const size_t __n = __narrow.size();
+         std::__to_wstring_numeric(__narrow.data(), __n, __wbuf);
+         __out = __format::__write(__out, _Str_view(__wbuf, __n));
+       }
+#endif
+      *__out = _Separators<_CharT>::_S_braces()[1];
+      return ++__out;
+    }
+
+  template<typename _Out, typename _CharT>
+    _Out
+    __write_escaped_char(_Out __out, _CharT __c)
+    {
+      using _UChar = make_unsigned_t<_CharT>;
+      using _Esc = _Escapes<_CharT>;
+      switch (__c)
+       {
+         case _Esc::_S_tab()[0]:
+           return __format::__write(__out, _Esc::_S_tab().substr(1, 2));
+         case _Esc::_S_newline()[0]:
+           return __format::__write(__out, _Esc::_S_newline().substr(1, 2));
+         case _Esc::_S_return()[0]:
+           return __format::__write(__out, _Esc::_S_return().substr(1, 2));
+         case _Esc::_S_bslash()[0]:
+           return __format::__write(__out, _Esc::_S_bslash().substr(1, 2));
+         case _Esc::_S_quote()[0]:
+           return __format::__write(__out, _Esc::_S_quote().substr(1, 2));
+         case _Esc::_S_apos()[0]:
+           return __format::__write(__out, _Esc::_S_apos().substr(1, 2));
+         default:
+           return __format::__write_escape_seq(__out,
+                               static_cast<_UChar>(__c),
+                                               _Esc::_S_u());
+       }
+    }
+
+  template<typename _CharT, typename _Out>
+    _Out
+    __write_escaped_ascii(_Out __out,
+                         basic_string_view<_CharT> __str,
+                         _Term_char __term)
+    {
+      using _Str_view = basic_string_view<_CharT>;
+      auto __first = __str.begin();
+      auto const __last = __str.end();
+      while (__first != __last)
+      {
+       auto __print = __first;
+       // assume anything outside ASCII is printable
+       while (__print != __last
+                && !__format::__should_escape_ascii(*__print, __term))
+         ++__print;
+
+       if (__print != __first)
+         __out = __format::__write(__out, _Str_view(__first, __print));
+
+       if (__print == __last)
+         return __out;
+
+       __first = __print;
+       __out = __format::__write_escaped_char(__out, *__first);
+       ++__first;
+      }
+      return __out;
+    }
+
+  template<typename _CharT, typename _Out>
+    _Out
+    __write_escaped_unicode(_Out __out,
+                           basic_string_view<_CharT> __str,
+                           _Term_char __term)
+    {
+      using _Str_view = basic_string_view<_CharT>;
+      using _UChar = make_unsigned_t<_CharT>;
+      using _Esc = _Escapes<_CharT>;
+
+      static constexpr char32_t __replace = U'\uFFFD';
+      static constexpr _Str_view __replace_rep = []
+        {
+         // N.B. "\uFFFD" is ill-formed if encoding is not unicode.
+          if constexpr (is_same_v<char, _CharT>)
+           return "\xEF\xBF\xBD";
+         else
+           return L"\xFFFD";
+        }();
+
+      __unicode::_Utf_view<char32_t, _Str_view> __v(std::move(__str));
+      auto __first = __v.begin();
+      auto const __last = __v.end();
+
+      bool __prev_esc = true;
+      while (__first != __last)
+       {
+         bool __esc_ascii = false;
+         bool __esc_unicode = false;
+         bool __esc_replace = false;
+         auto __should_escape = [&](auto const& __it)
+           {
+             if (*__it <= 0x7f)
+               return __esc_ascii
+                        = __format::__should_escape_ascii(*__it.base(), __term);
+             if (__format::__should_escape_unicode(*__it, __prev_esc))
+               return __esc_unicode = true;
+             if (*__it == __replace)
+               {
+                 _Str_view __units(__it.base(), __it._M_units());
+                 return __esc_replace = (__units != __replace_rep);
+               }
+             return false;
+           };
+
+         auto __print = __first;
+         while (__print != __last && !__should_escape(__print))
+           {
+             __prev_esc = false;
+             ++__print;
+           }
+
+         if (__print != __first)
+           __out = __format::__write(__out, _Str_view(__first.base(), __print.base()));
+
+         if (__print == __last)
+           return __out;
+
+         __first = __print;
+         if (__esc_ascii)
+           __out = __format::__write_escaped_char(__out, *__first.base());
+         else if (__esc_unicode)
+           __out = __format::__write_escape_seq(__out, *__first, _Esc::_S_u());
+         else // __esc_replace
+           for (_CharT __c : _Str_view(__first.base(), __first._M_units()))
+             __out = __format::__write_escape_seq(__out,
+                                                  static_cast<_UChar>(__c),
+                                                  _Esc::_S_x());
+         __prev_esc = true;
+         ++__first;
+
+       }
+      return __out;
+    }
+
+  template<typename _CharT, typename _Out>
+    _Out
+    __write_escaped(_Out __out,  basic_string_view<_CharT> __str, _Term_char __term)
+    {
+      *__out = _Escapes<_CharT>::_S_term(__term);
+      ++__out;
+
+      if constexpr (__unicode::__literal_encoding_is_unicode<_CharT>())
+       __out = __format::__write_escaped_unicode(__out, __str, __term);
+      else if constexpr (is_same_v<char, _CharT>
+                         && __unicode::__literal_encoding_is_extended_ascii())
+       __out = __format::__write_escaped_ascii(__out, __str, __term);
+      else
+       // TODO Handle non-ascii extended encoding
+       __out = __format::__write_escaped_ascii(__out, __str, __term);
+
+      *__out = _Escapes<_CharT>::_S_term(__term);
+      return ++__out;
+    }
+
  // A lightweight optional<locale>.
  struct _Optional_locale
  {
@@ -961,7 +1268,7 @@ namespace __format

        if (*__first == 's')
          ++__first;
-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
        else if (*__first == '?')
          {
            __spec._M_type = _Pres_esc;
@@ -980,43 +1287,71 @@ namespace __format
        format(basic_string_view<_CharT> __s,
               basic_format_context<_Out, _CharT>& __fc) const
        {
-         if (_M_spec._M_type == _Pres_esc)
+         constexpr auto __term = __format::_Term_char::_Tc_quote;
+         const auto __write_direct = [&]
            {
-             // TODO: C++23 escaped string presentation
-           }
+             if (_M_spec._M_type == _Pres_esc)
+               return __format::__write_escaped(__fc.out(), __s, __term);
+             else
+               return __format::__write(__fc.out(), __s);
+           };

          if (_M_spec._M_width_kind == _WP_none
                && _M_spec._M_prec_kind == _WP_none)
-           return __format::__write(__fc.out(), __s);
+           return __write_direct();

-         size_t __estimated_width;
-         if constexpr (__unicode::__literal_encoding_is_unicode<_CharT>())
-           {
-             if (_M_spec._M_prec_kind != _WP_none)
-               {
-                 size_t __prec = _M_spec._M_get_precision(__fc);
-                 __estimated_width = __unicode::__truncate(__s, __prec);
-               }
-             else
-               __estimated_width = __unicode::__field_width(__s);
-           }
-         else
-           {
-             __s = __s.substr(0, _M_spec._M_get_precision(__fc));
-             __estimated_width = __s.size();
-           }
+         const size_t __prec =
+           _M_spec._M_prec_kind != _WP_none
+             ? _M_spec._M_get_precision(__fc)
+             : basic_string_view<_CharT>::npos;

-         return __format::__write_padded_as_spec(__s, __estimated_width,
+         const size_t __estimated_width = _S_trunc(__s, __prec);
+         // N.B. Escaping only increases width
+         if (_M_spec._M_get_width(__fc) <= __estimated_width
+               && _M_spec._M_prec_kind == _WP_none)
+            return __write_direct();
+
+         if (_M_spec._M_type != _Pres_esc)
+           return __format::__write_padded_as_spec(__s, __estimated_width,
+                                                   __fc, _M_spec);
+
+         __format::_Str_sink<_CharT> __sink;
+         __format::_Sink_iter<_CharT> __out(__sink);
+         __format::__write_escaped(__out, __s, __term);
+         basic_string_view<_CharT> __escaped(__sink.view().data(),
+                                             __sink.view().size());
+         const size_t __escaped_width = _S_trunc(__escaped, __prec);
+         // N.B. [tab:format.type.string] defines '?' as
+         // Copies the escaped string ([format.string.escaped]) to the output,
+         // so precision seems to apply to the escaped string.
+         return __format::__write_padded_as_spec(__escaped, __escaped_width,
                                                  __fc, _M_spec);
        }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void
      set_debug_format() noexcept
      { _M_spec._M_type = _Pres_esc; }
#endif

    private:
+      static size_t
+      _S_trunc(basic_string_view<_CharT>& __s, size_t __prec)
+      {
+       if constexpr (__unicode::__literal_encoding_is_unicode<_CharT>())
+        {
+          if (__prec != basic_string_view<_CharT>::npos)
+            return __unicode::__truncate(__s, __prec);
+          else
+            return __unicode::__field_width(__s);
+        }
+       else
+       {
+         __s = __s.substr(0, __prec);
+         return __s.size();
+       }
+      }
+
      _Spec<_CharT> _M_spec{};
    };

@@ -1120,7 +1455,7 @@ namespace __format
                ++__first;
              }
            break;
-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
          case '?':
            if (__type == _AsChar)
              {
@@ -1272,7 +1607,7 @@ namespace __format
      _S_character_width(_CharT __c)
      {
        // N.B. single byte cannot encode charcter of width greater than 1
-       if constexpr (sizeof(_CharT) > 1u &&
+       if constexpr (sizeof(_CharT) > 1u &&
                        __unicode::__literal_encoding_is_unicode<_CharT>())
          return __unicode::__field_width(__c);
        else
@@ -1286,7 +1621,34 @@ namespace __format
        {
          return __format::__write_padded_as_spec({&__c, 1u},
                                                  _S_character_width(__c),
-                                                 __fc, _M_spec);
+                                                 __fc, _M_spec);
+       }
+
+      template<typename _Out>
+       typename basic_format_context<_Out, _CharT>::iterator
+       _M_format_character_escaped(_CharT __c,
+                                  basic_format_context<_Out, _CharT>& __fc) const
+       {
+         using _Esc = _Escapes<_CharT>;
+         constexpr auto __term = __format::_Term_char::_Tc_apos;
+         const basic_string_view<_CharT> __in(&__c, 1u);
+         if (_M_spec._M_get_width(__fc) <= 3u)
+           return __format::__write_escaped(__fc.out(), __in, __term);
+
+         _CharT __buf[12];
+         __format::_Fixedbuf_sink<_CharT> __sink(__buf);
+         __format::_Sink_iter<_CharT> __out(__sink);
+         __format::__write_escaped(__out, __in, __term);
+
+         const basic_string_view<_CharT> __escaped = __sink.view();
+         size_t __estimated_width;
+         if (__escaped[1] == _Esc::_S_bslash()[0]) // escape sequence
+           __estimated_width = __escaped.size();
+         else
+           __estimated_width = 2 + _S_character_width(__c);
+         return __format::__write_padded_as_spec(__escaped,
+                                                 __estimated_width,
+                                                 __fc, _M_spec);
        }

      template<typename _Int>
@@ -1973,15 +2335,12 @@ namespace __format
              || _M_f._M_spec._M_type == __format::_Pres_c)
            return _M_f._M_format_character(__u, __fc);
          else if (_M_f._M_spec._M_type == __format::_Pres_esc)
-           {
-             // TODO
-             return __fc.out();
-           }
+           return _M_f._M_format_character_escaped(__u, __fc);
          else
            return _M_f.format(static_cast<make_unsigned_t<_CharT>>(__u), __fc);
        }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void
      set_debug_format() noexcept
      { _M_f._M_spec._M_type = __format::_Pres_esc; }
@@ -2012,15 +2371,12 @@ namespace __format
              || _M_f._M_spec._M_type == __format::_Pres_c)
            return _M_f._M_format_character(__u, __fc);
          else if (_M_f._M_spec._M_type == __format::_Pres_esc)
-           {
-             // TODO
-             return __fc.out();
-           }
+           return _M_f._M_format_character_escaped(__u, __fc);
          else
            return _M_f.format(static_cast<unsigned char>(__u), __fc);
        }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void
      set_debug_format() noexcept
      { _M_f._M_spec._M_type = __format::_Pres_esc; }
@@ -2050,7 +2406,7 @@ namespace __format
        format(_CharT* __u, basic_format_context<_Out, _CharT>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2075,7 +2431,7 @@ namespace __format
               basic_format_context<_Out, _CharT>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2099,7 +2455,7 @@ namespace __format
               basic_format_context<_Out, _CharT>& __fc) const
        { return _M_f.format({__u, _Nm}, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2123,7 +2479,7 @@ namespace __format
               basic_format_context<_Out, char>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2148,7 +2504,7 @@ namespace __format
               basic_format_context<_Out, wchar_t>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2173,7 +2529,7 @@ namespace __format
               basic_format_context<_Out, char>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2198,7 +2554,7 @@ namespace __format
               basic_format_context<_Out, wchar_t>& __fc) const
        { return _M_f.format(__u, __fc); }

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges // C++ >= 23 && HOSTED
      constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
#endif

@@ -2859,6 +3215,32 @@ namespace __format
      { return _Sink_iter<_CharT>(*this); }
    };

+
+  template<typename _CharT>
+    class _Fixedbuf_sink final : public _Sink<_CharT>
+    {
+      void
+      _M_overflow() override
+      {
+       __glibcxx_assert(false);
+       this->_M_rewind();
+      }
+
+    public:
+      [[__gnu__::__always_inline__]]
+      constexpr explicit
+      _Fixedbuf_sink(span<_CharT> __buf)
+       : _Sink<_CharT>(__buf)
+      { }
+
+      constexpr basic_string_view<_CharT>
+      view() const
+      {
+       auto __s = this->_M_used();
+       return basic_string_view<_CharT>(__s.data(), __s.size());
+      }
+    };
+
  // A sink with an internal buffer. This is used to implement concrete sinks.
  template<typename _CharT>
    class _Buf_sink : public _Sink<_CharT>
@@ -2993,13 +3375,6 @@ namespace __format
      }
    };

-  template<typename _CharT, typename _Alloc = allocator<_CharT>>
-    using _Str_sink
-      = _Seq_sink<basic_string<_CharT, char_traits<_CharT>, _Alloc>>;
-
-  // template<typename _CharT, typename _Alloc = allocator<_CharT>>
-    // using _Vec_sink = _Seq_sink<vector<_CharT, _Alloc>>;
-
  // A sink that writes to an output iterator.
  // Writes to a fixed-size buffer and then flushes to the output iterator
  // when the buffer fills up.
@@ -3675,17 +4050,17 @@ namespace __format
          return _M_visit([&__vis]<typename _Tp>(_Tp& __val) -> decltype(auto)
            {
              constexpr bool __user_facing = __is_one_of<_Tp,
-               monostate, bool, _CharT,
-               int, unsigned int, long long int, unsigned long long int,
-               float, double, long double,
-               const _CharT*, basic_string_view<_CharT>,
-               const void*, handle>::value;
+               monostate, bool, _CharT,
+               int, unsigned int, long long int, unsigned long long int,
+               float, double, long double,
+               const _CharT*, basic_string_view<_CharT>,
+               const void*, handle>::value;
             if constexpr (__user_facing)
               return std::forward<_Visitor>(__vis)(__val);
             else
               {
-                handle __h(__val);
-                return std::forward<_Visitor>(__vis)(__h);
+                handle __h(__val);
+                return std::forward<_Visitor>(__vis)(__h);
               }
           }, __type);
        }
@@ -4781,6 +5156,7 @@ namespace __format
    : __format::__range_default_formatter<format_kind<_Rg>, _Rg, _CharT>
    { };
#endif // C++23 formatting ranges
+#undef _GLIBCXX_WIDEN

_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std
diff --git a/libstdc++-v3/testsuite/23_containers/vector/bool/format.cc b/libstdc++-v3/testsuite/23_containers/vector/bool/format.cc
index 16f6e86dcee..2586225dd05 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/bool/format.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/bool/format.cc
@@ -3,7 +3,6 @@

#include <format>
#include <vector>
-#include <chrono> // For _Widen
#include <testsuite_hooks.h>

static_assert(!std::formattable<std::vector<bool>::reference, int>);
@@ -21,7 +20,7 @@ is_format_string_for(const char* str, Args&&... args)
  }
}

-#define WIDEN_(C, S) ::std::chrono::__detail::_Widen<C>(S, L##S)
+#define WIDEN_(C, S) ::std::__format::_Widen<C>(S, L##S)
#define WIDEN(S) WIDEN_(_CharT, S)

void
diff --git a/libstdc++-v3/testsuite/std/format/debug.cc b/libstdc++-v3/testsuite/std/format/debug.cc
new file mode 100644
index 00000000000..07cd1e0e349
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/format/debug.cc
@@ -0,0 +1,454 @@
+// { dg-options "-fexec-charset=UTF-8 -fwide-exec-charset=UTF-32LE -DUNICODE_ENC" }
+// { dg-do run { target c++23 } }
+// { dg-add-options no_pch }
+
+#include <format>
+#include <testsuite_hooks.h>
+
+std::string
+fdebug(char t)
+{ return std::format("{:?}", t); }
+
+std::wstring
+fdebug(wchar_t t)
+{ return std::format(L"{:?}", t); }
+
+std::string
+fdebug(std::string_view t)
+{ return std::format("{:?}", t); }
+
+std::wstring
+fdebug(std::wstring_view t)
+{ return std::format(L"{:?}", t); }
+
+
+#define WIDEN_(C, S) ::std::__format::_Widen<C>(S, L##S)
+#define WIDEN(S) WIDEN_(_CharT, S)
+
+template<typename _CharT>
+void
+test_basic_escapes()
+{
+  std::basic_string<_CharT> res;
+
+  const auto tab = WIDEN("\t");
+  res = fdebug(tab);
+  VERIFY( res == WIDEN(R"("\t")") );
+  res = fdebug(tab[0]);
+  VERIFY( res == WIDEN(R"('\t')") );
+
+  const auto nline = WIDEN("\n");
+  res = fdebug(nline);
+  VERIFY( res == WIDEN(R"("\n")") );
+  res = fdebug(nline[0]);
+  VERIFY( res == WIDEN(R"('\n')") );
+
+  const auto carret = WIDEN("\r");
+  res = fdebug(carret);
+  VERIFY( res == WIDEN(R"("\r")") );
+  res = fdebug(carret[0]);
+  VERIFY( res == WIDEN(R"('\r')") );
+
+  const auto bslash = WIDEN("\\");
+  res = fdebug(bslash);
+  VERIFY( res == WIDEN(R"("\\")") );
+  res = fdebug(bslash[0]);
+  VERIFY( res == WIDEN(R"('\\')") );
+
+  const auto quote = WIDEN("\"");
+  res = fdebug(quote);
+  VERIFY( res == WIDEN(R"("\"")") );
+  res = fdebug(quote[0]);
+  VERIFY( res == WIDEN(R"('"')") );
+
+  const auto apos = WIDEN("\'");
+  res = fdebug(apos);
+  VERIFY( res == WIDEN(R"("'")") );
+  res = fdebug(apos[0]);
+  VERIFY( res == WIDEN(R"('\'')") );
+}
+
+template<typename _CharT>
+void
+test_ascii_escapes()
+{
+  std::basic_string<_CharT> res;
+
+  const auto in = WIDEN("\x10 abcde\x7f\t0123");
+  res = fdebug(in);
+  VERIFY( res == WIDEN(R"("\u{10} abcde\u{7f}\t0123")") );
+  res = fdebug(in[0]);
+  VERIFY( res == WIDEN(R"('\u{10}')") );
+  res = fdebug(in[1]);
+  VERIFY( res == WIDEN(R"(' ')") );
+  res = fdebug(in[2]);
+  VERIFY( res == WIDEN(R"('a')") );
+}
+
+template<typename _CharT>
+void
+test_extended_ascii()
+{
+  std::basic_string<_CharT> res;
+
+  const auto in = WIDEN("Åëÿ");
+  res = fdebug(in);
+  VERIFY( res == WIDEN(R"("Åëÿ")") );
+
+  static constexpr bool __test_characters
+#if UNICODE_ENC
+    = sizeof(_CharT) >= 2;
+#else // ISO8859-1
+    = true;
+#endif // UNICODE_ENC
+
+  if constexpr (__test_characters)
+  {
+    res = fdebug(in[0]);
+    VERIFY( res == WIDEN(R"('Å')") );
+    res = fdebug(in[1]);
+    VERIFY( res == WIDEN(R"('ë')") );
+    res = fdebug(in[2]);
+    VERIFY( res == WIDEN(R"('ÿ')") );
+  }
+}
+
+#if UNICODE_ENC
+template<typename _CharT>
+void
+test_unicode_escapes()
+{
+  std::basic_string<_CharT> res;
+
+  const auto in = WIDEN(
+    "\u008a"     // Cc, Control,             Line Tabulation Set
+    "\u00ad"     // Cf, Format,              Soft Hyphen
+    "\u1d3d"     // Lm, Modifier Letter,     Modifier Letter Capital Ou
+    "\u00a0"     // Zs, Space Separator,     No-Break Space (NBSP)
+    "\u2029"     // Zp, Paragraph Separator, Paragraph Separator
+    "\U0001f984" // So, Other Symbol,        Unicorn Face
+  );
+  const auto out = WIDEN("\""
+   R"(\u{8a})"
+   R"(\u{ad})"
+   "\u1d3d"
+   R"(\u{a0})"
+   R"(\u{2029})"
+   "\U0001f984"
+  "\"");
+
+  res = fdebug(in);
+  VERIFY( res == out );
+
+  if constexpr (sizeof(_CharT) >= 2)
+  {
+    res = fdebug(in[0]);
+    VERIFY( res == WIDEN(R"('\u{8a}')") );
+    res = fdebug(in[1]);
+    VERIFY( res == WIDEN(R"('\u{ad}')") );
+    res = fdebug(in[2]);
+    VERIFY( res == WIDEN("'\u1d3d'") );
+    res = fdebug(in[3]);
+    VERIFY( res == WIDEN(R"('\u{a0}')") );
+    res = fdebug(in[4]);
+    VERIFY( res == WIDEN(R"('\u{2029}')") );
+  }
+
+  if constexpr (sizeof(_CharT) >= 4)
+  {
+    res = fdebug(in[5]);
+    VERIFY( res == WIDEN("'\U0001f984'") );
+  }
+}
+
+template<typename _CharT>
+void
+test_grapheme_extend()
+{
+  std::basic_string<_CharT> res;
+
+  const auto vin = WIDEN("o\u0302\u0323");
+  res = fdebug(vin);
+  VERIFY( res == WIDEN("\"o\u0302\u0323\"") );
+
+  std::basic_string_view<_CharT> in = WIDEN("\t\u0302\u0323");
+  res = fdebug(in);
+  VERIFY( res == WIDEN(R"("\t\u{302}\u{323}")") );
+
+  res = fdebug(in.substr(1));
+  VERIFY( res == WIDEN(R"("\u{302}\u{323}")") );
+
+  if constexpr (sizeof(_CharT) >= 2)
+  {
+    res = fdebug(in[1]);
+    VERIFY( res == WIDEN(R"('\u{302}')") );
+  }
+}
+
+template<typename _CharT>
+void
+test_replacement_char()
+{
+  std::basic_string<_CharT> repl = WIDEN("\uFFFD");
+  std::basic_string<_CharT> res = fdebug(repl);
+  VERIFY( res == WIDEN("\"\uFFFD\"") );
+
+  repl = WIDEN("\uFFFD\uFFFD");
+  res = fdebug(repl);
+  VERIFY( res == WIDEN("\"\uFFFD\uFFFD\"") );
+}
+
+void
+test_ill_formed_utf8_seq()
+{
+  std::string_view seq = "\xf0\x9f\xa6\x84"; //  \U0001F984
+  std::string res;
+
+  res = fdebug(seq);
+  VERIFY( res == "\"\U0001F984\"" );
+
+  res = fdebug(seq.substr(1));
+  VERIFY( res == R"("\x{9f}\x{a6}\x{84}")" );
+
+  res = fdebug(seq.substr(2));
+  VERIFY( res == R"("\x{a6}\x{84}")" );
+
+  res = fdebug(seq[0]);
+  VERIFY( res == R"('\x{f0}')" );
+  res = fdebug(seq.substr(0, 1));
+  VERIFY( res == R"("\x{f0}")" );
+
+  res = fdebug(seq[1]);
+  VERIFY( res == R"('\x{9f}')" );
+  res = fdebug(seq.substr(1, 1));
+  VERIFY( res == R"("\x{9f}")" );
+
+  res = fdebug(seq[2]);
+  VERIFY( res == R"('\x{a6}')" );
+  res = fdebug(seq.substr(2, 1));
+  VERIFY( res == R"("\x{a6}")" );
+
+  res = fdebug(seq[3]);
+  VERIFY( res == R"('\x{84}')" );
+  res = fdebug(seq.substr(3, 1));
+  VERIFY( res == R"("\x{84}")" );
+}
+
+void
+test_ill_formed_utf32()
+{
+  std::wstring res;
+
+  wchar_t ic1 = static_cast<wchar_t>(0xff'ffff);
+  res = fdebug(ic1);
+  VERIFY( res == LR"('\x{ffffff}')" );
+
+  std::wstring is1(1, ic1);
+  res = fdebug(is1);
+  VERIFY( res == LR"("\x{ffffff}")" );
+
+  wchar_t ic2 = static_cast<wchar_t>(0xffff'ffff);
+  res = fdebug(ic2);
+  VERIFY( res == LR"('\x{ffffffff}')" );
+
+  std::wstring is2(1, ic2);
+  res = fdebug(is2);
+  VERIFY( res == LR"("\x{ffffffff}")" );
+}
+#endif // UNICODE_ENC
+
+template<typename _CharT>
+void
+test_fill()
+{
+  std::basic_string<_CharT> res;
+
+  std::basic_string_view<_CharT> in = WIDEN("a\t\x10\u00ad");
+  res = std::format(WIDEN("{:10?}"), in.substr(0, 1));
+  VERIFY( res == WIDEN(R"("a"       )") );
+
+  res = std::format(WIDEN("{:->10?}"), in.substr(1, 1));
+  VERIFY( res == WIDEN(R"(------"\t")") );
+
+  res = std::format(WIDEN("{:+<10?}"), in.substr(2, 1));
+  VERIFY( res == WIDEN(R"("\u{10}"++)") );
+
+
+  res = std::format(WIDEN("{:10?}"), in[0]);
+  VERIFY( res == WIDEN(R"('a'       )") );
+
+  res = std::format(WIDEN("{:->10?}"), in[1]);
+  VERIFY( res == WIDEN(R"(------'\t')") );
+
+  res = std::format(WIDEN("{:+<10?}"), in[2]);
+  VERIFY( res == WIDEN(R"('\u{10}'++)") );
+
+#if UNICODE_ENC
+  res = std::format(WIDEN("{:=^10?}"), in.substr(3));
+  VERIFY( res == WIDEN(R"(="\u{ad}"=)") );
+
+  // width is 2
+  std::basic_string_view<_CharT> in2 = WIDEN("\u1100");
+  res = std::format(WIDEN("{:*^10?}"), in2);
+  VERIFY( res == WIDEN("***\"\u1100\"***") );
+
+  if constexpr (sizeof(_CharT) >= 2)
+  {
+    res = std::format(WIDEN("{:=^10?}"), in[3]);
+    VERIFY( res == WIDEN(R"(='\u{ad}'=)") );
+
+    res = std::format(WIDEN("{:*^10?}"), in2[0]);
+    VERIFY( res == WIDEN("***'\u1100'***") );
+  }
+#endif // UNICODE_ENC
+}
+
+template<typename _CharT>
+void
+test_prec()
+{
+  std::basic_string<_CharT> res;
+  // With '?', the escaped presentation is copied to the output, same as the source
+
+  std::basic_string_view<_CharT> in = WIDEN("a\t\x10\u00ad");
+  res = std::format(WIDEN("{:.2?}"), in.substr(0, 1));
+  VERIFY( res == WIDEN(R"("a)") );
+
+  res = std::format(WIDEN("{:.4?}"), in.substr(1, 1));
+  VERIFY( res == WIDEN(R"("\t")") );
+
+  res = std::format(WIDEN("{:.5?}"), in.substr(2, 1));
+  VERIFY( res == WIDEN(R"("\u{1)") );
+
+#if UNICODE_ENC
+  res = std::format(WIDEN("{:.10?}"), in.substr(3));
+  VERIFY( res == WIDEN(R"("\u{ad}")") );
+
+  std::basic_string_view<_CharT> in2 = WIDEN("\u1100");
+  res = std::format(WIDEN("{:.3?}"), in2);
+  VERIFY( res == WIDEN("\"\u1100") );
+#endif // UNICODE_ENC
+}
+
+void test_char_as_wchar()
+{
+  std::wstring res;
+
+  res = std::format(L"{:?}", 'a');
+  VERIFY( res == LR"('a')" );
+
+  res = std::format(L"{:?}", '\t');
+  VERIFY( res == LR"('\t')" );
+
+  res = std::format(L"{:+<10?}", '\x10');
+  VERIFY( res == LR"('\u{10}'++)" );
+}
+
+template<typename T>
+struct DebugWrapper
+{
+  T val;
+};
+
+template<typename T, typename CharT>
+struct std::formatter<DebugWrapper<T>, CharT>
+{
+  constexpr std::basic_format_parse_context<CharT>::iterator
+  parse(std::basic_format_parse_context<CharT>& pc)
+  {
+    auto out = under.parse(pc);
+    under.set_debug_format();
+    return out;
+  }
+
+  template<typename Out>
+  Out format(DebugWrapper<T> const& t,
+            std::basic_format_context<Out, CharT>& fc) const
+  { return under.format(t.val, fc); }
+
+private:
+  std::formatter<T, CharT> under;
+};
+
+template<typename _CharT, typename StrT>
+void
+test_formatter_str()
+{
+  _CharT buf[]{ 'a', 'b', 'c', 0 };
+  DebugWrapper<StrT> in{ buf };
+  std::basic_string<_CharT> res = std::format(WIDEN("{:?}"), in );
+  VERIFY( res == WIDEN(R"("abc")") );
+}
+
+template<typename _CharT>
+void
+test_formatter_arr()
+{
+  std::basic_string<_CharT> res;
+
+  DebugWrapper<_CharT[3]> in3{ 'a', 'b', 'c' };
+  res = std::format(WIDEN("{:?}"), in3 );
+  VERIFY( res == WIDEN(R"("abc")") );
+
+  // We print all characters, including the null terminator
+  DebugWrapper<_CharT[4]> in4{ 'a', 'b', 'c', 0 };
+  res = std::format(WIDEN("{:?}"), in4 );
+  VERIFY( res == WIDEN(R"("abc\u{0}")") );
+}
+
+template<typename _CharT, typename SrcT>
+void
+test_formatter_char()
+{
+  DebugWrapper<SrcT> in{ 'a' };
+  std::basic_string<_CharT> res = std::format(WIDEN("{:?}"), in);
+  VERIFY( res == WIDEN(R"('a')") );
+}
+
+template<typename CharT>
+void
+test_formatters()
+{
+  test_formatter_char<CharT, CharT>();
+  test_formatter_str<CharT, CharT*>();
+  test_formatter_str<CharT, const CharT*>();
+  test_formatter_str<CharT, std::basic_string<CharT>>();
+  test_formatter_str<CharT, std::basic_string_view<CharT>>();
+  test_formatter_arr<CharT>();
+}
+
+void
+test_formatters_c()
+{
+  test_formatters<char>();
+  test_formatters<wchar_t>();
+  test_formatter_char<wchar_t, char>();
+}
+
+int main()
+{
+  test_basic_escapes<char>();
+  test_basic_escapes<wchar_t>();
+  test_ascii_escapes<char>();
+  test_ascii_escapes<wchar_t>();
+  test_extended_ascii<char>();
+  test_extended_ascii<wchar_t>();
+
+#if UNICODE_ENC
+  test_unicode_escapes<char>();
+  test_unicode_escapes<wchar_t>();
+  test_grapheme_extend<char>();
+  test_grapheme_extend<wchar_t>();
+  test_replacement_char<char>();
+  test_replacement_char<wchar_t>();
+  test_ill_formed_utf8_seq();
+  test_ill_formed_utf32();
+#endif // UNICODE_ENC
+
+  test_fill<char>();
+  test_fill<wchar_t>();
+  test_prec<char>();
+  test_prec<wchar_t>();
+
+  test_formatters_c();
+}
diff --git a/libstdc++-v3/testsuite/std/format/debug_nounicode.cc b/libstdc++-v3/testsuite/std/format/debug_nounicode.cc
new file mode 100644
index 00000000000..5c03171d71a
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/format/debug_nounicode.cc
@@ -0,0 +1,5 @@
+// { dg-options "-fexec-charset=ISO8859-1 -fwide-exec-charset=UTF-32LE" }
+// { dg-do run { target c++23 } }
+// { dg-add-options no_pch }
+
+#include "debug.cc"
diff --git a/libstdc++-v3/testsuite/std/format/parse_ctx.cc b/libstdc++-v3/testsuite/std/format/parse_ctx.cc
index b5dd7cdba78..b338ac7b762 100644
--- a/libstdc++-v3/testsuite/std/format/parse_ctx.cc
+++ b/libstdc++-v3/testsuite/std/format/parse_ctx.cc
@@ -108,7 +108,7 @@ is_std_format_spec_for(std::string_view spec)
  }
}

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges
constexpr bool escaped_strings_supported = true;
#else
constexpr bool escaped_strings_supported = false;
diff --git a/libstdc++-v3/testsuite/std/format/string.cc b/libstdc++-v3/testsuite/std/format/string.cc
index ee987a15ec3..76614d4bc3e 100644
--- a/libstdc++-v3/testsuite/std/format/string.cc
+++ b/libstdc++-v3/testsuite/std/format/string.cc
@@ -62,7 +62,7 @@ test_indexing()
  VERIFY( ! is_format_string_for("{} {0}", 1) );
}

-#if __cpp_lib_format_ranges
+#if __glibcxx_format_ranges
constexpr bool escaped_strings_supported = true;
#else
constexpr bool escaped_strings_supported = false;
--
2.49.0


