On Tue, 16 Jul 2024 at 13:05, Jonathan Wakely <jwak...@redhat.com> wrote:
>
> On Fri, 12 Jul 2024 at 00:23, I wrote:
> >
> > I sent v1 of this patch in February, and it added the new symbols to
> > libstdc++exp.a which meant users needed to use -lstdc++exp to format
> > chrono types in C++23 mode. That was less than ideal.
> >
> > This v2 patch adds the new symbols to the main library, which means no
> > extra step to get the new features, and we can enable them as a DR for
> > C++20 mode. But that means we need new exports in the shared library,
> > and so need to be more confident that the feature is stable and ready to
> > go into the lib.
> >
> > I'm not 100% confident that we want to add a new, private facet to the
> > std::locale, but it seems reasonable. And that's not exposed to users at
> > all, as the two new symbols added to the library hide the creation and
> > use of that facet.
>
> Here's v3, which fixes a missing export of the __sso_string constructors
> and destructors, needed so that the old ABI can use the new function to
> transcode a locale-specific string to UTF-8, with a std::string buffer.
>
> I haven't done so here, but we could keep a least recently used cache of
> __encoding facets, so that repeatedly calling std::format with the same
> locale doesn't need to keep re-checking the locale's encoding and then
> re-opening the same iconv_t descriptor.
>
> This v3 patch also tweaks the commented out parts of
> include/bits/version.def in preparation for enabling the C++26 <format>
> features in the following patches in this series.
>
> Tested x86_64-linux. I think this is ready to push now, but I'll wait a
> bit for any comments on it.
>
> -- >8 --
>
> This implements the C++23 paper P2419R2 (Clarify handling of encodings
> in localized formatting of chrono types). The requirement is that when
> the literal encoding is "a Unicode encoding form" and the formatting
> locale uses a different encoding, any locale-specific strings such as
> "août" for std::chrono::August should be converted to the literal
> encoding.
>
> Using the recently-added std::locale::encoding() function we can check
> the locale's encoding and then use iconv if a conversion is needed.
> Because nl_langinfo_l and iconv_open both allocate memory, a naive
> implementation would perform multiple allocations and deallocations for
> every snippet of locale-specific text that needs to be converted to
> UTF-8. To avoid that, a new internal locale::facet is defined to store
> the text_encoding and an iconv_t descriptor, which are then cached in
> the formatting locale. This requires access to the internals of a
> std::locale object in src/c++20/format.cc, so that new file needs to be
> compiled with -fno-access-control, as well as -std=gnu++26 in order to
> use std::text_encoding.
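
To illustrate the cost being avoided, here is a rough sketch of the naive approach, where every call opens and closes its own iconv descriptor. The function name to_utf8 is made up for this example and is not in the patch. It assumes a POSIX iconv whose input parameter is char** (as in glibc), and it skips the final flush call that stateful encodings would need.

```cpp
#include <cerrno>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <iconv.h>

// Convert `in` from `from_charset` to UTF-8. Returns std::nullopt if the
// charset is not supported or the input cannot be converted.
// Opens and closes a descriptor on every call, which is the overhead
// that caching the descriptor in the locale avoids.
std::optional<std::string>
to_utf8(const char* from_charset, std::string_view in)
{
  iconv_t cd = iconv_open("UTF-8", from_charset);
  if (cd == (iconv_t)-1)
    return std::nullopt;

  std::string out;
  char* inp = const_cast<char*>(in.data()); // iconv does not modify input
  std::size_t inleft = in.size();
  std::size_t written = 0;
  bool ok = true;

  while (inleft > 0)
    {
      // Grow the buffer on each pass, then retry from where we stopped.
      out.resize(out.size() + inleft * 2 + 4);
      char* outp = out.data() + written;
      std::size_t outleft = out.size() - written;
      std::size_t res = iconv(cd, &inp, &inleft, &outp, &outleft);
      const int err = errno;
      written = outp - out.data();
      if (res == (std::size_t)-1 && err != E2BIG)
        {
          ok = false; // invalid or incomplete input sequence
          break;
        }
    }

  iconv_close(cd);
  if (!ok)
    return std::nullopt;
  out.resize(written);
  return out;
}
```

With an ISO-8859-1 converter available, to_utf8("ISO-8859-1", "ao\xFBt") would be expected to yield the UTF-8 bytes for "août". The E2BIG retry has the same shape as the resize_and_overwrite loop in the patch, which additionally reuses the cached descriptor.
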
>
> Because the new std::text_encoding and std::locale::encoding() symbols
> are only in the libstdc++exp.a archive, we need to include
> src/c++26/text_encoding.cc in the main library, but not export its
> symbols yet. This means they can be used by the two new functions which
> are exported from the main library.
>
> The encoding conversions are done for C++20, treating it as a DR that
> resolves LWG 3565.
>
> With this change we can increase the value of the __cpp_lib_format macro
> for C++23. The value should be 202207 for P2419R2, but we already
> implement P2510R3 (Formatting pointers) so can use the value 202304.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/109162
>         * acinclude.m4 (libtool_VERSION): Update to 6:34:0.
>         * config/abi/pre/gnu.ver: Disambiguate old patterns. Add new
>         GLIBCXX_3.4.34 symbol version and new exports.
>         * configure: Regenerate.
>         * include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific):
>         Add new accessor functions to use a reserved bit in _Spec.
>         (__formatter_chrono::_M_parse): Use _M_locale_specific(true)
>         when chrono-specs contains locale-dependent conversion
>         specifiers.
>         (__formatter_chrono::_M_format): Open iconv descriptor if
>         conversion to UTF-8 will be needed.
>         (__formatter_chrono::_M_write): New function to write a
>         localized string with possible character conversion.
>         (__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B)
>         (__formatter_chrono::_M_p, __formatter_chrono::_M_r)
>         (__formatter_chrono::_M_x, __formatter_chrono::_M_X)
>         (__formatter_chrono::_M_locale_fmt): Use _M_write.
>         * include/bits/version.def (format): Update value.
>         * include/bits/version.h: Regenerate.
>         * include/std/format (_GLIBCXX_P2518R3): Check feature test
>         macro instead of __cplusplus.
>         (basic_format_context): Declare __formatter_chrono as friend.
>         * src/c++20/Makefile.am: Add new file.
>         * src/c++20/Makefile.in: Regenerate.
>         * src/c++20/format.cc: New file.
>         * testsuite/std/time/format_localized.cc: New test.
>         * testsuite/util/testsuite_abi.cc: Add new symbol version.
> ---
>  libstdc++-v3/acinclude.m4                     |   2 +-
>  libstdc++-v3/config/abi/pre/gnu.ver           |  18 +-
>  libstdc++-v3/configure                        |   2 +-
>  libstdc++-v3/include/bits/chrono_io.h         |  96 ++++++++--
>  libstdc++-v3/include/bits/version.def         |  29 ++-
>  libstdc++-v3/include/bits/version.h           |   4 +-
>  libstdc++-v3/include/std/format               |  16 +-
>  libstdc++-v3/src/c++20/Makefile.am            |   8 +-
>  libstdc++-v3/src/c++20/Makefile.in            |  10 +-
>  libstdc++-v3/src/c++20/format.cc              | 174 ++++++++++++++++++
>  .../testsuite/std/time/format_localized.cc    |  47 +++++
>  libstdc++-v3/testsuite/util/testsuite_abi.cc  |   1 +
>  12 files changed, 378 insertions(+), 29 deletions(-)
>  create mode 100644 libstdc++-v3/src/c++20/format.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/format_localized.cc
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index e04aae25360..e4ed583b3ae 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -4230,7 +4230,7 @@ changequote([,])dnl
>  fi
>
>  # For libtool versioning info, format is CURRENT:REVISION:AGE
> -libtool_VERSION=6:33:0
> +libtool_VERSION=6:34:0
>
>  # Everything parsed; figure out what files and settings to use.
>  case $enable_symvers in
> diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
> index 31449b5b87b..ae79b371d80 100644
> --- a/libstdc++-v3/config/abi/pre/gnu.ver
> +++ b/libstdc++-v3/config/abi/pre/gnu.ver
> @@ -109,7 +109,11 @@ GLIBCXX_3.4 {
>        std::[j-k]*;
>  #     std::length_error::l*;
>  #     std::length_error::~l*;
> -      std::locale::[A-Za-e]*;
> +      # std::locale::[A-Za-d]*;
> +      std::locale::all;
> +      std::locale::classic*;
> +      std::locale::collate;
> +      std::locale::ctype;
>        std::locale::facet::[A-Za-z]*;
>        std::locale::facet::_S_get_c_locale*;
>        std::locale::facet::_S_clone_c_locale*;
> @@ -168,7 +172,7 @@ GLIBCXX_3.4 {
>        std::strstream*;
>        std::strstreambuf*;
>  #     std::t[a-q]*;
> -      std::t[a-g]*;
> +      std::terminate*;
>        std::th[a-h]*;
>        std::th[j-q]*;
>        std::th[s-z]*;
> @@ -2528,6 +2532,16 @@ GLIBCXX_3.4.33 {
>      _ZNKSt12__basic_fileIcE13native_handleEv;
>  } GLIBCXX_3.4.32;
>
> +# GCC 15.1.0
> +GLIBCXX_3.4.34 {
> +    # std::__format::__with_encoding_conversion
> +    _ZNSt8__format26__with_encoding_conversionERKSt6locale;
> +    # std::__format::__locale_encoding_to_utf8
> +    _ZNSt8__format25__locale_encoding_to_utf8ERKSt6localeSt17basic_string_viewIcSt11char_traitsIcEEPv;
> +    # __sso_string constructor and destructor
> +    _ZNSt12__sso_string[CD][12]Ev;
> +} GLIBCXX_3.4.33;
> +
>  # Symbols in the support library (libsupc++) have their own tag.
>  CXXABI_1.3 {
>
> diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> index 5645e991af7..fe525308ae2 100755
> --- a/libstdc++-v3/configure
> +++ b/libstdc++-v3/configure
> @@ -51040,7 +51040,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will be disabled." >&2;}
>  fi
>
>  # For libtool versioning info, format is CURRENT:REVISION:AGE
> -libtool_VERSION=6:33:0
> +libtool_VERSION=6:34:0
>
>  # Everything parsed; figure out what files and settings to use.
>  case $enable_symvers in
> diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h
> index 72c66a0fef0..2f3ba89de61 100644
> --- a/libstdc++-v3/include/bits/chrono_io.h
> +++ b/libstdc++-v3/include/bits/chrono_io.h
> @@ -38,8 +38,10 @@
>  #include <iomanip> // setw, setfill
>  #include <format>
>  #include <charconv> // from_chars
> +#include <stdexcept> // __sso_string
>
>  #include <bits/streambuf_iterator.h>
> +#include <bits/unique_ptr.h>
>
>  namespace std _GLIBCXX_VISIBILITY(default)
>  {
> @@ -211,6 +213,20 @@ namespace __format
>      struct _ChronoSpec : _Spec<_CharT>
>      {
>        basic_string_view<_CharT> _M_chrono_specs;
> +
> +      // Use one of the reserved bits in __format::_Spec<C>.
> +      // This indicates that a locale-dependent conversion specifier such as
> +      // %a is used in the chrono-specs. This is not the same as the
> +      // _Spec<C>::_M_localized member which indicates that "L" was present
> +      // in the format-spec, e.g. "{:L%a}" is localized and locale-specific,
> +      // but "{:L}" is only localized and "{:%a}" is only locale-specific.
> +      constexpr bool
> +      _M_locale_specific() const noexcept
> +      { return this->_M_reserved; }
> +
> +      constexpr void
> +      _M_locale_specific(bool __b) noexcept
> +      { this->_M_reserved = __b; }
>      };
>
>    // Represents the information provided by a chrono type.
> @@ -305,11 +321,12 @@ namespace __format
>           const auto __chrono_specs = __first++; // Skip leading '%'
>           if (*__chrono_specs != '%')
>             __throw_format_error("chrono format error: no '%' at start of "
> -                                    "chrono-specs");
> +                                "chrono-specs");
>
>           _CharT __mod{};
>           bool __conv = true;
>           int __needed = 0;
> +         bool __locale_specific = false;
>
>           while (__first != __last)
>             {
> @@ -322,15 +339,18 @@ namespace __format
>                 case 'a':
>                 case 'A':
>                   __needed = _Weekday;
> +                 __locale_specific = true;
>                   break;
>                 case 'b':
>                 case 'h':
>                 case 'B':
>                   __needed = _Month;
> +                 __locale_specific = true;
>                   break;
>                 case 'c':
>                   __needed = _DateTime;
>                   __allowed_mods = _Mod_E;
> +                 __locale_specific = true;
>                   break;
>                 case 'C':
>                   __needed = _Year;
> @@ -368,6 +388,8 @@ namespace __format
>                   break;
>                 case 'p':
>                 case 'r':
> +                 __locale_specific = true;
> +                 [[fallthrough]];
>                 case 'R':
>                 case 'T':
>                   __needed = _TimeOfDay;
> @@ -393,10 +415,12 @@ namespace __format
>                   break;
>                 case 'x':
>                   __needed = _Date;
> +                 __locale_specific = true;
>                   __allowed_mods = _Mod_E;
>                   break;
>                 case 'X':
>                   __needed = _TimeOfDay;
> +                 __locale_specific = true;
>                   __allowed_mods = _Mod_E;
>                   break;
>                 case 'y':
> @@ -436,6 +460,8 @@ namespace __format
>                     || (__mod == 'O' && !(__allowed_mods & _Mod_O)))
>                 __throw_format_error("chrono format error: invalid "
>                                      " modifier in chrono-specs");
> +             if (__mod && __c != 'z')
> +               __locale_specific = true;
>               __mod = _CharT();
>
>               if ((__parts & __needed) != __needed)
> @@ -467,6 +493,7 @@ namespace __format
>           _M_spec = __spec;
>           _M_spec._M_chrono_specs
>                  = __string_view(__chrono_specs, __first - __chrono_specs);
> +         _M_spec._M_locale_specific(__locale_specific);
>
>           return __first;
>         }
> @@ -486,6 +513,24 @@ namespace __format
>           if (__first == __last)
>             return _M_format_to_ostream(__t, __fc, __is_neg);
>
> +#if __glibcxx_format >= 202207L // C++ >= 20
> +         // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +         // 3565. Handling of encodings in localized formatting
> +         //       of chrono types is underspecified
> +         if constexpr (is_same_v<_CharT, char>)
> +           if constexpr (__unicode::__literal_encoding_is_utf8())
> +             if (_M_spec._M_localized && _M_spec._M_locale_specific())
> +               {
> +                 extern locale __with_encoding_conversion(const locale&);
> +
> +                 // Allocate and cache the necessary state to convert strings
> +                 // in the locale's encoding to UTF-8.
> +                 locale __loc = __fc.locale();
> +                 if (__loc != locale::classic())
> +                   __fc._M_loc =  __with_encoding_conversion(__loc);
> +               }
> +#endif
> +
>           _Sink_iter<_CharT> __out;
>           __format::_Str_sink<_CharT> __sink;
>           bool __write_direct = false;
> @@ -742,6 +787,30 @@ namespace __format
>        static constexpr _CharT _S_space = _S_chars[14];
>        static constexpr const _CharT* _S_empty_spec = _S_chars + 15;
>
> +      template<typename _OutIter>
> +       _OutIter
> +       _M_write(_OutIter __out, const locale& __loc, __string_view __s) const
> +       {
> +#if __glibcxx_format >= 202207L // C++ >= 20
> +         __sso_string __buf;
> +         // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +         // 3565. Handling of encodings in localized formatting
> +         //       of chrono types is underspecified
> +         if constexpr (is_same_v<_CharT, char>)
> +           if constexpr (__unicode::__literal_encoding_is_utf8())
> +             if (_M_spec._M_localized && _M_spec._M_locale_specific()
> +                   && __loc != locale::classic())
> +               {
> +                 extern string_view
> +                 __locale_encoding_to_utf8(const std::locale&, string_view,
> +                                           void*);
> +
> +                 __s = __locale_encoding_to_utf8(__loc, __s, &__buf);
> +               }
> +#endif
> +         return __format::__write(std::move(__out), __s);
> +       }
> +
>        template<typename _Tp, typename _FormatContext>
>         typename _FormatContext::iterator
>         _M_a_A(const _Tp& __t, typename _FormatContext::iterator __out,
> @@ -761,7 +830,7 @@ namespace __format
>           else
>             __tp._M_days_abbreviated(__days);
>           __string_view __str(__days[__wd.c_encoding()]);
> -         return __format::__write(std::move(__out), __str);
> +         return _M_write(std::move(__out), __loc, __str);
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -782,7 +851,7 @@ namespace __format
>           else
>             __tp._M_months_abbreviated(__months);
>           __string_view __str(__months[(unsigned)__m - 1]);
> -         return __format::__write(std::move(__out), __str);
> +         return _M_write(std::move(__out), __loc, __str);
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -1059,8 +1128,8 @@ namespace __format
>           const auto& __tp = use_facet<__timepunct<_CharT>>(__loc);
>           const _CharT* __ampm[2];
>           __tp._M_am_pm(__ampm);
> -         return std::format_to(std::move(__out), _S_empty_spec,
> -                               __ampm[__hms.hours().count() >= 12]);
> +         return _M_write(std::move(__out), __loc,
> +                         __ampm[__hms.hours().count() >= 12]);
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -1095,8 +1164,9 @@ namespace __format
>           basic_string<_CharT> __fmt(_S_empty_spec);
>           __fmt.insert(1u, 1u, _S_colon);
>           __fmt.insert(2u, __ampm_fmt);
> -         return std::vformat_to(std::move(__out), __fmt,
> -                                std::make_format_args<_FormatContext>(__t));
> +         using _FmtStr = _Runtime_format_string<_CharT>;
> +         return _M_write(std::move(__out), __loc,
> +                         std::format(__loc, _FmtStr(__fmt), __t));
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -1279,8 +1349,9 @@ namespace __format
>           basic_string<_CharT> __fmt(_S_empty_spec);
>           __fmt.insert(1u, 1u, _S_colon);
>           __fmt.insert(2u, __rep);
> -         return std::vformat_to(std::move(__out), __fmt,
> -                                std::make_format_args<_FormatContext>(__t));
> +         using _FmtStr = _Runtime_format_string<_CharT>;
> +         return _M_write(std::move(__out), __loc,
> +                         std::format(__loc, _FmtStr(__fmt), __t));
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -1302,8 +1373,9 @@ namespace __format
>           basic_string<_CharT> __fmt(_S_empty_spec);
>           __fmt.insert(1u, 1u, _S_colon);
>           __fmt.insert(2u, __rep);
> -         return std::vformat_to(std::move(__out), __fmt,
> -                                std::make_format_args<_FormatContext>(__t));
> +         using _FmtStr = _Runtime_format_string<_CharT>;
> +         return _M_write(std::move(__out), __loc,
> +                         std::format(__loc, _FmtStr(__fmt), __t));
>         }
>
>        template<typename _Tp, typename _FormatContext>
> @@ -1580,7 +1652,7 @@ namespace __format
>           const auto& __tp = use_facet<time_put<_CharT>>(__loc);
>           __tp.put(__os, __os, _S_space, &__tm, __fmt, __mod);
>           if (__os)
> -           __out = __format::__write(std::move(__out), __os.view());
> +           __out = _M_write(std::move(__out), __loc, __os.view());
>           return __out;
>         }
>      };
> diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
> index 42cdef2f526..74947301760 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -1161,16 +1161,22 @@ ftms = {
>  };
>
>  ftms = {
> +  name = format;
> +  // 202304 P2510R3 Formatting pointers
> +  // 202305 P2757R3 Type checking format args
> +  // 202306 P2637R3 Member visit
> +  // 202311 P2918R2 Runtime format strings II
> +  // values = {
> +    // v = 202304;
> +    // cxxmin = 26;
> +    // hosted = yes;
> +  // };
>    // 201907 Text Formatting, Integration of chrono, printf corner cases.
>    // 202106 std::format improvements.
>    // 202110 Fixing locale handling in chrono formatters, generator-like types.
>    // 202207 Encodings in localized formatting of chrono, basic-format-string.
> -  // 202207 P2286R8 Formatting Ranges
> -  // 202207 P2585R1 Improving default container formatting
> -  // TODO: #define __cpp_lib_format_ranges 202207L
> -  name = format;
>    values = {
> -    v = 202110;
> +    v = 202207;
>      cxxmin = 20;
>      hosted = yes;
>    };
> @@ -1374,6 +1380,19 @@ ftms = {
>    };
>  };
>
> +// ftms = {
> +  // name = format_ranges;
> +  // 202207 P2286R8 Formatting Ranges
> +  // 202207 P2585R1 Improving default container formatting
> +  // LWG3750 Too many papers bump __cpp_lib_format
> +  // TODO: #define __cpp_lib_format_ranges 202207L
> +  // values = {
> +    // v = 202207;
> +    // cxxmin = 23;
> +    // hosted = yes;
> +  // };
> +// };
> +
>  ftms = {
>    name = freestanding_algorithm;
>    values = {
> diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
> index 1eaf3733bc2..9f8673395da 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -1305,9 +1305,9 @@
>
>  #if !defined(__cpp_lib_format)
>  # if (__cplusplus >= 202002L) && _GLIBCXX_HOSTED
> -#  define __glibcxx_format 202110L
> +#  define __glibcxx_format 202207L
>  #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_format)
> -#   define __cpp_lib_format 202110L
> +#   define __cpp_lib_format 202207L
>  #  endif
>  # endif
>  #endif /* !defined(__cpp_lib_format) && defined(__glibcxx_want_format) */
> diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
> index 16cee0d3c74..a4921ce391b 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -2342,10 +2342,10 @@ namespace __format
>
>  // _GLIBCXX_RESOLVE_LIB_DEFECTS
>  // P2510R3 Formatting pointers
> -#if __cplusplus > 202302L || ! defined __STRICT_ANSI__
> -#define _GLIBCXX_P2518R3 1
> +#if __glibcxx_format >= 202304L || ! defined __STRICT_ANSI__
> +# define _GLIBCXX_P2518R3 1
>  #else
> -#define _GLIBCXX_P2518R3 0
> +# define _GLIBCXX_P2518R3 0
>  #endif
>
>  #if _GLIBCXX_P2518R3
> @@ -3821,6 +3821,9 @@ namespace __format
>      __do_vformat_to(_Out, basic_string_view<_CharT>,
>                     const basic_format_args<_Context>&,
>                     const locale* = nullptr);
> +
> +  template<typename _CharT> struct __formatter_chrono;
> +
>  } // namespace __format
>  /// @endcond
>
> @@ -3831,6 +3834,11 @@ namespace __format
>     * this class template explicitly. For typical uses of `std::format` the
>     * library will use the specializations `std::format_context` (for `char`)
>     * and `std::wformat_context` (for `wchar_t`).
> +   *
> +   * You are not allowed to define partial or explicit specializations of
> +   * this class template.
> +   *
> +   * @since C++20
>     */
>    template<typename _Out, typename _CharT>
>      class basic_format_context
> @@ -3863,6 +3871,8 @@ namespace __format
>                                   const basic_format_args<_Context2>&,
>                                   const locale*);
>
> +      friend __format::__formatter_chrono<_CharT>;
> +
>      public:
>        ~basic_format_context() = default;
>
> diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am
> index a24505e5141..d0f7859290c 100644
> --- a/libstdc++-v3/src/c++20/Makefile.am
> +++ b/libstdc++-v3/src/c++20/Makefile.am
> @@ -36,7 +36,7 @@ else
>  inst_sources =
>  endif
>
> -sources = tzdb.cc
> +sources = tzdb.cc format.cc
>
>  vpath % $(top_srcdir)/src/c++20
>
> @@ -53,6 +53,12 @@ tzdb.o: tzdb.cc tzdata.zi.h
>         $(CXXCOMPILE) -I. -c $<
>  endif
>
> +# This needs access to std::text_encoding and to the internals of std::locale.
> +format.lo: format.cc
> +       $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
> +format.o: format.cc
> +       $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
> +
>  if GLIBCXX_HOSTED
>  libc__20convenience_la_SOURCES = $(sources)  $(inst_sources)
>  else
> diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in
> index 3ec8c5ce804..d759b8dcc7c 100644
> --- a/libstdc++-v3/src/c++20/Makefile.in
> +++ b/libstdc++-v3/src/c++20/Makefile.in
> @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES =
>  CONFIG_CLEAN_VPATH_FILES =
>  LTLIBRARIES = $(noinst_LTLIBRARIES)
>  libc__20convenience_la_LIBADD =
> -am__objects_1 = tzdb.lo
> +am__objects_1 = tzdb.lo format.lo
>  @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo
>  @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS =  \
>  @GLIBCXX_HOSTED_TRUE@  $(am__objects_1) $(am__objects_2)
> @@ -432,7 +432,7 @@ headers =
>  @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \
>  @ENABLE_EXTERN_TEMPLATE_TRUE@  sstream-inst.cc
>
> -sources = tzdb.cc
> +sources = tzdb.cc format.cc
>  @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES =
>  @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources)  $(inst_sources)
>
> @@ -755,6 +755,12 @@ vpath % $(top_srcdir)/src/c++20
>  @USE_STATIC_TZDATA_TRUE@tzdb.o: tzdb.cc tzdata.zi.h
>  @USE_STATIC_TZDATA_TRUE@       $(CXXCOMPILE) -I. -c $<
>
> +# This needs access to std::text_encoding and to the internals of std::locale.
> +format.lo: format.cc
> +       $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
> +format.o: format.cc
> +       $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $<
> +
>  # Tell versions [3.59,3.63) of GNU make to not export all variables.
>  # Otherwise a system limit (for SysV at least) may be exceeded.
>  .NOEXPORT:
> diff --git a/libstdc++-v3/src/c++20/format.cc b/libstdc++-v3/src/c++20/format.cc
> new file mode 100644
> index 00000000000..507bac79e95
> --- /dev/null
> +++ b/libstdc++-v3/src/c++20/format.cc
> @@ -0,0 +1,174 @@
> +// Definitions for <chrono> formatting -*- C++ -*-
> +
> +// Copyright The GNU Toolchain Authors.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// Under Section 7 of GPL version 3, you are granted additional
> +// permissions described in the GCC Runtime Library Exception, version
> +// 3.1, as published by the Free Software Foundation.
> +
> +// You should have received a copy of the GNU General Public License and
> +// a copy of the GCC Runtime Library Exception along with this program;
> +// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +// <http://www.gnu.org/licenses/>.
> +
> +#define _GLIBCXX_USE_CXX11_ABI 1
> +#include "../c++26/text_encoding.cc"
> +
> +#if defined _GLIBCXX_USE_NL_LANGINFO_L && defined _GLIBCXX_HAVE_ICONV
> +# include <format>
> +# include <chrono>
> +# include <memory>   // make_unique
> +# include <string.h> // strlen, strcpy
> +# include <iconv.h>
> +# include <errno.h>
> +#endif
> +
> +namespace std
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +namespace __format
> +{
> +// Helpers for P2419R2
> +// (Clarify handling of encodings in localized formatting of chrono types)
> +// Convert a string from the locale's charset to UTF-8.
> +
> +namespace
> +{
> +// A non-standard locale::facet that caches the locale's std::text_encoding
> +// and an iconv descriptor for converting from that encoding to UTF-8.
> +struct __encoding : locale::facet
> +{
> +  static locale::id id;
> +
> +  explicit
> +  __encoding(const text_encoding& enc, size_t refs = 0)
> +  : facet(refs), _M_enc(enc)
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    if (enc != text_encoding::UTF8 && enc != text_encoding::ASCII)
> +      _M_cd = ::iconv_open("UTF-8", enc.name());
> +#endif
> +  }
> +
> +  ~__encoding()
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    if (_M_has_desc())
> +      ::iconv_close(_M_cd);
> +#endif
> +  }
> +
> +  bool _M_has_desc() const
> +  {
> +#if defined _GLIBCXX_HAVE_ICONV
> +    return _M_cd != (::iconv_t)-1;
> +#else
> +    return false;
> +#endif
> +  }
> +
> +  text_encoding _M_enc;
> +#if defined _GLIBCXX_HAVE_ICONV
> +  ::iconv_t _M_cd = (::iconv_t)-1;
> +#endif
> +};
> +
> +locale::id __encoding::id;
> +} // namespace
> +
> +std::locale
> +__with_encoding_conversion(const std::locale& loc)
> +{
> +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8
> +  if (std::__try_use_facet<__encoding>(loc))
> +    return loc;
> +
> +  string name = loc.name();
> +  if (name == "C" || name == "*")
> +    return loc;
> +
> +  text_encoding locenc = __locale_encoding(name.c_str());
> +
> +  if (locenc == text_encoding::UTF8 || locenc == text_encoding::ASCII
> +     || locenc == text_encoding::unknown)
> +    return loc;
> +
> +  auto impl = std::make_unique<locale::_Impl>(*loc._M_impl, 1);

While looking into implementing the LRU cache mentioned above, I
realised that this impl variable is unused. That's a leftover from an
earlier attempt to solve this. I'll remove it.


> +  auto facetp = std::make_unique<__encoding>(locenc);
> +  locale loc2(loc, facetp.get()); // FIXME: PR libstdc++/113704
> +  facetp.release();
> +  // FIXME: Ideally we wouldn't need to reallocate this string again,
> +  // just don't delete[] it in the locale(locale, Facet*) constructor.
> +  if (const char* name = loc._M_impl->_M_names[0])
> +    {
> +      loc2._M_impl->_M_names[0] = new char[strlen(name) + 1];
> +      strcpy(loc2._M_impl->_M_names[0], name);
> +    }
> +  return loc2;
> +#else
> +  return loc;
> +#endif
> +}
> +
> +string_view
> +__locale_encoding_to_utf8(const std::locale& loc, string_view str,
> +                         void* poutbuf)
> +{
> +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 \
> +  && _GLIBCXX_HAVE_ICONV
> +  string& outbuf = *static_cast<string*>(poutbuf);
> +  // Don't need to use __try_use_facet with its dynamic_cast<__encoding*>,
> +  // since we know there are no types derived from __encoding. If the array
> +  // element is non-null, we have the facet.
> +  auto id = __encoding::id._M_id();
> +  auto enc_facet = static_cast<const __encoding*>(loc._M_impl->_M_facets[id]);
> +  if (!enc_facet || !enc_facet->_M_has_desc())
> +    return str;
> +
> +  size_t inbytesleft = str.size();
> +  size_t written = 0;
> +  bool done = false;
> +
> +  auto overwrite = [&](char* p, size_t n) {
> +    auto inbytes = const_cast<char*>(str.data()) + str.size() - inbytesleft;
> +    char* outbytes = p + written;
> +    size_t outbytesleft = n - written;
> +    size_t res = ::iconv(enc_facet->_M_cd, &inbytes, &inbytesleft,
> +                        &outbytes, &outbytesleft);
> +    if (res == (size_t)-1)
> +      {
> +       if (errno != E2BIG)
> +         {
> +           done = true;
> +           return 0zu;
> +         }
> +      }
> +    else
> +      done = true;
> +    written = outbytes - p;
> +    return written;
> +  };
> +  do
> +    outbuf.resize_and_overwrite(outbuf.capacity() + (inbytesleft * 3 / 2),
> +                               overwrite);
> +  while (!done);
> +  if (outbuf.size())
> +    str = outbuf;
> +#endif // USE_NL_LANGINFO_L && CHAR_BIT == 8 && HAVE_ICONV
> +
> +  return str;
> +}
> +} // namespace __format
> +_GLIBCXX_END_NAMESPACE_VERSION
> +} // namespace std
> diff --git a/libstdc++-v3/testsuite/std/time/format_localized.cc b/libstdc++-v3/testsuite/std/time/format_localized.cc
> new file mode 100644
> index 00000000000..2e553110f03
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/time/format_localized.cc
> @@ -0,0 +1,47 @@
> +// { dg-do run { target c++20 } }
> +// { dg-require-namedlocale "ru_UA.koi8u" }
> +// { dg-require-namedlocale "es_ES.ISO8859-1" }
> +// { dg-require-namedlocale "fr_FR.ISO8859-1" }
> +// { dg-require-effective-target cxx11_abi }
> +
> +// P2419R2
> +// Clarify handling of encodings in localized formatting of chrono types
> +
> +// Localized date-time strings such as "février" should be converted to UTF-8
> +// if the locale uses a different encoding.
> +
> +#include <chrono>
> +#include <format>
> +#include <testsuite_hooks.h>
> +
> +void
> +test_ru()
> +{
> +  std::locale loc("ru_UA.koi8u");
> +  auto s = std::format(loc, "День недели: {:L}", std::chrono::Monday);
> +  VERIFY( s == "День недели: Пн" );
> +}
> +
> +void
> +test_es()
> +{
> +  std::locale loc(ISO_8859(1,es_ES));
> +  auto s = std::format(loc, "Día de la semana: {:L%A %a}", std::chrono::Wednesday);
> +  VERIFY( s == "Día de la semana: miércoles mié" );
> +}
> +
> +void
> +test_fr()
> +{
> +  std::locale loc(ISO_8859(1,fr_FR));
> +  auto s = std::format(loc, "Six mois après {0:L%b}, c'est {1:L%B}.",
> +                      std::chrono::February, std::chrono::August);
> +  VERIFY( s == "Six mois après févr., c'est août." );
> +}
> +
> +int main()
> +{
> +  test_ru();
> +  test_es();
> +  test_fr();
> +}
> diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc
> index ec7c3df9ecc..ce9cda660fa 100644
> --- a/libstdc++-v3/testsuite/util/testsuite_abi.cc
> +++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc
> @@ -215,6 +215,7 @@ check_version(symbol& test, bool added)
>        known_versions.push_back("GLIBCXX_3.4.31");
>        known_versions.push_back("GLIBCXX_3.4.32");
>        known_versions.push_back("GLIBCXX_3.4.33");
> +      known_versions.push_back("GLIBCXX_3.4.34");
>        known_versions.push_back("GLIBCXX_LDBL_3.4.31");
>        known_versions.push_back("GLIBCXX_IEEE128_3.4.29");
>        known_versions.push_back("GLIBCXX_IEEE128_3.4.30");
> --
> 2.45.2
>
