On Tue, 16 Jul 2024 at 13:05, Jonathan Wakely <jwak...@redhat.com> wrote: > > On Fri, 12 Jul 2024 at 00:23, I wrote: > > > > I sent v1 of this patch in February, and it added the new symbols to > > libstdc++exp.a which meant users needed to use -lstdc++exp to format > > chrono types in C++23 mode. That was less than ideal. > > > > This v2 patch adds the new symbols to the main library, which means no > > extra step to get the new features, and we can enable them as a DR for > > C++20 mode. But that means we need new exports in the shared library, > > and so need to be more confident that the feature is stable and ready to > > go into the lib. > > > > I'm not 100% confident that we want to add a new, private facet to the > > std::locale, but it seems reasonable. And that's not exposed to users at > > all, as the two new symbols added to the library hide the creation and > > use of that facet. > > Here's v3, which fixes a missing export of the __sso_string constructors > and destructors, needed so that the old ABI can use the new function to > transcode a locale-specific string to UTF-8, with a std::string buffer. > > I haven't done so here, but we could keep a least recently used cache of > __encoding facets, so that repeatedly calling std::format with the same > locale doesn't need to keep re-checking the locale's encoding and then > re-opening the same iconv_t descriptor. > > This v3 patch also tweaks the commented out parts of > include/bits/version.def in preparation for enabling the C++26 <format> > features in the following patches in this series. > > Tested x86_64-linux. I think this is ready to push now, but I'll wait a > bit for any comments on it. > > -- >8 -- > > This implements the C++23 paper P2419R2 (Clarify handling of encodings > in localized formatting of chrono types). 
The requirement is that when > the literal encoding is "a Unicode encoding form" and the formatting > locale uses a different encoding, any locale-specific strings such as > "août" for std::chrono::August should be converted to the literal > encoding. > > Using the recently-added std::locale::encoding() function we can check > the locale's encoding and then use iconv if a conversion is needed. > Because nl_langinfo_l and iconv_open both allocate memory, a naive > implementation would perform multiple allocations and deallocations for > every snippet of locale-specific text that needs to be converted to > UTF-8. To avoid that, a new internal locale::facet is defined to store > the text_encoding and an iconv_t descriptor, which are then cached in > the formatting locale. This requires access to the internals of a > std::locale object in src/c++20/format.cc, so that new file needs to be > compiled with -fno-access-control, as well as -std=gnu++26 in order to > use std::text_encoding. > > Because the new std::text_encoding and std::locale::encoding() symbols > are only in the libstdc++exp.a archive, we need to include > src/c++26/text_encoding.cc in the main library, but not export its > symbols yet. This means they can be used by the two new functions which > are exported from the main library. > > The encoding conversions are done for C++20, treating it as a DR that > resolves LWG 3565. > > With this change we can increase the value of the __cpp_lib_format macro > for C++23. The value should be 202207 for P2419R2, but we already > implement P2510R3 (Formatting pointers) so can use the value 202304. > > libstdc++-v3/ChangeLog: > > PR libstdc++/109162 > * acinclude.m4 (libtool_VERSION): Update to 6:34:0. > * config/abi/pre/gnu.ver: Disambiguate old patterns. Add new > GLIBCXX_3.4.34 symbol version and new exports. > * configure: Regenerate. > * include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific): > Add new accessor functions to use a reserved bit in _Spec. 
> (__formatter_chrono::_M_parse): Use _M_locale_specific(true) > when chrono-specs contains locale-dependent conversion > specifiers. > (__formatter_chrono::_M_format): Open iconv descriptor if > conversion to UTF-8 will be needed. > (__formatter_chrono::_M_write): New function to write a > localized string with possible character conversion. > (__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B) > (__formatter_chrono::_M_p, __formatter_chrono::_M_r) > (__formatter_chrono::_M_x, __formatter_chrono::_M_X) > (__formatter_chrono::_M_locale_fmt): Use _M_write. > * include/bits/version.def (format): Update value. > * include/bits/version.h: Regenerate. > * include/std/format (_GLIBCXX_P2518R3): Check feature test > macro instead of __cplusplus. > (basic_format_context): Declare __formatter_chrono as friend. > * src/c++20/Makefile.am: Add new file. > * src/c++20/Makefile.in: Regenerate. > * src/c++20/format.cc: New file. > * testsuite/std/time/format_localized.cc: New test. > * testsuite/util/testsuite_abi.cc: Add new symbol version. 
> --- > libstdc++-v3/acinclude.m4 | 2 +- > libstdc++-v3/config/abi/pre/gnu.ver | 18 +- > libstdc++-v3/configure | 2 +- > libstdc++-v3/include/bits/chrono_io.h | 96 ++++++++-- > libstdc++-v3/include/bits/version.def | 29 ++- > libstdc++-v3/include/bits/version.h | 4 +- > libstdc++-v3/include/std/format | 16 +- > libstdc++-v3/src/c++20/Makefile.am | 8 +- > libstdc++-v3/src/c++20/Makefile.in | 10 +- > libstdc++-v3/src/c++20/format.cc | 174 ++++++++++++++++++ > .../testsuite/std/time/format_localized.cc | 47 +++++ > libstdc++-v3/testsuite/util/testsuite_abi.cc | 1 + > 12 files changed, 378 insertions(+), 29 deletions(-) > create mode 100644 libstdc++-v3/src/c++20/format.cc > create mode 100644 libstdc++-v3/testsuite/std/time/format_localized.cc > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 > index e04aae25360..e4ed583b3ae 100644 > --- a/libstdc++-v3/acinclude.m4 > +++ b/libstdc++-v3/acinclude.m4 > @@ -4230,7 +4230,7 @@ changequote([,])dnl > fi > > # For libtool versioning info, format is CURRENT:REVISION:AGE > -libtool_VERSION=6:33:0 > +libtool_VERSION=6:34:0 > > # Everything parsed; figure out what files and settings to use. 
> case $enable_symvers in > diff --git a/libstdc++-v3/config/abi/pre/gnu.ver > b/libstdc++-v3/config/abi/pre/gnu.ver > index 31449b5b87b..ae79b371d80 100644 > --- a/libstdc++-v3/config/abi/pre/gnu.ver > +++ b/libstdc++-v3/config/abi/pre/gnu.ver > @@ -109,7 +109,11 @@ GLIBCXX_3.4 { > std::[j-k]*; > # std::length_error::l*; > # std::length_error::~l*; > - std::locale::[A-Za-e]*; > + # std::locale::[A-Za-d]*; > + std::locale::all; > + std::locale::classic*; > + std::locale::collate; > + std::locale::ctype; > std::locale::facet::[A-Za-z]*; > std::locale::facet::_S_get_c_locale*; > std::locale::facet::_S_clone_c_locale*; > @@ -168,7 +172,7 @@ GLIBCXX_3.4 { > std::strstream*; > std::strstreambuf*; > # std::t[a-q]*; > - std::t[a-g]*; > + std::terminate*; > std::th[a-h]*; > std::th[j-q]*; > std::th[s-z]*; > @@ -2528,6 +2532,16 @@ GLIBCXX_3.4.33 { > _ZNKSt12__basic_fileIcE13native_handleEv; > } GLIBCXX_3.4.32; > > +# GCC 15.1.0 > +GLIBCXX_3.4.34 { > + # std::__format::__with_encoding_conversion > + _ZNSt8__format26__with_encoding_conversionERKSt6locale; > + # std::__format::__locale_encoding_to_utf8 > + > _ZNSt8__format25__locale_encoding_to_utf8ERKSt6localeSt17basic_string_viewIcSt11char_traitsIcEEPv; > + # __sso_string constructor and destructor > + _ZNSt12__sso_string[CD][12]Ev; > +} GLIBCXX_3.4.33; > + > # Symbols in the support library (libsupc++) have their own tag. > CXXABI_1.3 { > > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure > index 5645e991af7..fe525308ae2 100755 > --- a/libstdc++-v3/configure > +++ b/libstdc++-v3/configure > @@ -51040,7 +51040,7 @@ $as_echo "$as_me: WARNING: === Symbol versioning will > be disabled." >&2;} > fi > > # For libtool versioning info, format is CURRENT:REVISION:AGE > -libtool_VERSION=6:33:0 > +libtool_VERSION=6:34:0 > > # Everything parsed; figure out what files and settings to use. 
> case $enable_symvers in > diff --git a/libstdc++-v3/include/bits/chrono_io.h > b/libstdc++-v3/include/bits/chrono_io.h > index 72c66a0fef0..2f3ba89de61 100644 > --- a/libstdc++-v3/include/bits/chrono_io.h > +++ b/libstdc++-v3/include/bits/chrono_io.h > @@ -38,8 +38,10 @@ > #include <iomanip> // setw, setfill > #include <format> > #include <charconv> // from_chars > +#include <stdexcept> // __sso_string > > #include <bits/streambuf_iterator.h> > +#include <bits/unique_ptr.h> > > namespace std _GLIBCXX_VISIBILITY(default) > { > @@ -211,6 +213,20 @@ namespace __format > struct _ChronoSpec : _Spec<_CharT> > { > basic_string_view<_CharT> _M_chrono_specs; > + > + // Use one of the reserved bits in __format::_Spec<C>. > + // This indicates that a locale-dependent conversion specifier such as > + // %a is used in the chrono-specs. This is not the same as the > + // _Spec<C>::_M_localized member which indicates that "L" was present > + // in the format-spec, e.g. "{:L%a}" is localized and locale-specific, > + // but "{:L}" is only localized and "{:%a}" is only locale-specific. > + constexpr bool > + _M_locale_specific() const noexcept > + { return this->_M_reserved; } > + > + constexpr void > + _M_locale_specific(bool __b) noexcept > + { this->_M_reserved = __b; } > }; > > // Represents the information provided by a chrono type. 
> @@ -305,11 +321,12 @@ namespace __format > const auto __chrono_specs = __first++; // Skip leading '%' > if (*__chrono_specs != '%') > __throw_format_error("chrono format error: no '%' at start of " > - "chrono-specs"); > + "chrono-specs"); > > _CharT __mod{}; > bool __conv = true; > int __needed = 0; > + bool __locale_specific = false; > > while (__first != __last) > { > @@ -322,15 +339,18 @@ namespace __format > case 'a': > case 'A': > __needed = _Weekday; > + __locale_specific = true; > break; > case 'b': > case 'h': > case 'B': > __needed = _Month; > + __locale_specific = true; > break; > case 'c': > __needed = _DateTime; > __allowed_mods = _Mod_E; > + __locale_specific = true; > break; > case 'C': > __needed = _Year; > @@ -368,6 +388,8 @@ namespace __format > break; > case 'p': > case 'r': > + __locale_specific = true; > + [[fallthrough]]; > case 'R': > case 'T': > __needed = _TimeOfDay; > @@ -393,10 +415,12 @@ namespace __format > break; > case 'x': > __needed = _Date; > + __locale_specific = true; > __allowed_mods = _Mod_E; > break; > case 'X': > __needed = _TimeOfDay; > + __locale_specific = true; > __allowed_mods = _Mod_E; > break; > case 'y': > @@ -436,6 +460,8 @@ namespace __format > || (__mod == 'O' && !(__allowed_mods & _Mod_O))) > __throw_format_error("chrono format error: invalid " > " modifier in chrono-specs"); > + if (__mod && __c != 'z') > + __locale_specific = true; > __mod = _CharT(); > > if ((__parts & __needed) != __needed) > @@ -467,6 +493,7 @@ namespace __format > _M_spec = __spec; > _M_spec._M_chrono_specs > = __string_view(__chrono_specs, __first - __chrono_specs); > + _M_spec._M_locale_specific(__locale_specific); > > return __first; > } > @@ -486,6 +513,24 @@ namespace __format > if (__first == __last) > return _M_format_to_ostream(__t, __fc, __is_neg); > > +#if __glibcxx_format >= 202207L // C++ >= 23 > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 3565. 
Handling of encodings in localized formatting > + // of chrono types is underspecified > + if constexpr (is_same_v<_CharT, char>) > + if constexpr (__unicode::__literal_encoding_is_utf8()) > + if (_M_spec._M_localized && _M_spec._M_locale_specific()) > + { > + extern locale __with_encoding_conversion(const locale&); > + > + // Allocate and cache the necessary state to convert strings > + // in the locale's encoding to UTF-8. > + locale __loc = __fc.locale(); > + if (__loc != locale::classic()) > + __fc._M_loc = __with_encoding_conversion(__loc); > + } > +#endif > + > _Sink_iter<_CharT> __out; > __format::_Str_sink<_CharT> __sink; > bool __write_direct = false; > @@ -742,6 +787,30 @@ namespace __format > static constexpr _CharT _S_space = _S_chars[14]; > static constexpr const _CharT* _S_empty_spec = _S_chars + 15; > > + template<typename _OutIter> > + _OutIter > + _M_write(_OutIter __out, const locale& __loc, __string_view __s) const > + { > +#if __glibcxx_format >= 202207L // C++ >= 20 > + __sso_string __buf; > + // _GLIBCXX_RESOLVE_LIB_DEFECTS > + // 3565. 
Handling of encodings in localized formatting > + // of chrono types is underspecified > + if constexpr (is_same_v<_CharT, char>) > + if constexpr (__unicode::__literal_encoding_is_utf8()) > + if (_M_spec._M_localized && _M_spec._M_locale_specific() > + && __loc != locale::classic()) > + { > + extern string_view > + __locale_encoding_to_utf8(const std::locale&, string_view, > + void*); > + > + __s = __locale_encoding_to_utf8(__loc, __s, &__buf); > + } > +#endif > + return __format::__write(std::move(__out), __s); > + } > + > template<typename _Tp, typename _FormatContext> > typename _FormatContext::iterator > _M_a_A(const _Tp& __t, typename _FormatContext::iterator __out, > @@ -761,7 +830,7 @@ namespace __format > else > __tp._M_days_abbreviated(__days); > __string_view __str(__days[__wd.c_encoding()]); > - return __format::__write(std::move(__out), __str); > + return _M_write(std::move(__out), __loc, __str); > } > > template<typename _Tp, typename _FormatContext> > @@ -782,7 +851,7 @@ namespace __format > else > __tp._M_months_abbreviated(__months); > __string_view __str(__months[(unsigned)__m - 1]); > - return __format::__write(std::move(__out), __str); > + return _M_write(std::move(__out), __loc, __str); > } > > template<typename _Tp, typename _FormatContext> > @@ -1059,8 +1128,8 @@ namespace __format > const auto& __tp = use_facet<__timepunct<_CharT>>(__loc); > const _CharT* __ampm[2]; > __tp._M_am_pm(__ampm); > - return std::format_to(std::move(__out), _S_empty_spec, > - __ampm[__hms.hours().count() >= 12]); > + return _M_write(std::move(__out), __loc, > + __ampm[__hms.hours().count() >= 12]); > } > > template<typename _Tp, typename _FormatContext> > @@ -1095,8 +1164,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __ampm_fmt); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return 
_M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1279,8 +1349,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __rep); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return _M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1302,8 +1373,9 @@ namespace __format > basic_string<_CharT> __fmt(_S_empty_spec); > __fmt.insert(1u, 1u, _S_colon); > __fmt.insert(2u, __rep); > - return std::vformat_to(std::move(__out), __fmt, > - std::make_format_args<_FormatContext>(__t)); > + using _FmtStr = _Runtime_format_string<_CharT>; > + return _M_write(std::move(__out), __loc, > + std::format(__loc, _FmtStr(__fmt), __t)); > } > > template<typename _Tp, typename _FormatContext> > @@ -1580,7 +1652,7 @@ namespace __format > const auto& __tp = use_facet<time_put<_CharT>>(__loc); > __tp.put(__os, __os, _S_space, &__tm, __fmt, __mod); > if (__os) > - __out = __format::__write(std::move(__out), __os.view()); > + __out = _M_write(std::move(__out), __loc, __os.view()); > return __out; > } > }; > diff --git a/libstdc++-v3/include/bits/version.def > b/libstdc++-v3/include/bits/version.def > index 42cdef2f526..74947301760 100644 > --- a/libstdc++-v3/include/bits/version.def > +++ b/libstdc++-v3/include/bits/version.def > @@ -1161,16 +1161,22 @@ ftms = { > }; > > ftms = { > + name = format; > + // 202304 P2510R3 Formatting pointers > + // 202305 P2757R3 Type checking format args > + // 202306 P2637R3 Member visit > + // 202311 P2918R2 Runtime format strings II > + // values = { > + // v = 202304; > + // cxxmin = 26; > + // hosted = yes; > + // }; > // 201907 Text Formatting, Integration of chrono, printf corner cases. 
> // 202106 std::format improvements. > // 202110 Fixing locale handling in chrono formatters, generator-like > types. > // 202207 Encodings in localized formatting of chrono, basic-format-string. > - // 202207 P2286R8 Formatting Ranges > - // 202207 P2585R1 Improving default container formatting > - // TODO: #define __cpp_lib_format_ranges 202207L > - name = format; > values = { > - v = 202110; > + v = 202207; > cxxmin = 20; > hosted = yes; > }; > @@ -1374,6 +1380,19 @@ ftms = { > }; > }; > > +// ftms = { > + // name = format_ranges; > + // 202207 P2286R8 Formatting Ranges > + // 202207 P2585R1 Improving default container formatting > + // LWG3750 Too many papers bump __cpp_lib_format > + // TODO: #define __cpp_lib_format_ranges 202207L > + // values = { > + // v = 202207; > + // cxxmin = 23; > + // hosted = yes; > + // }; > +// }; > + > ftms = { > name = freestanding_algorithm; > values = { > diff --git a/libstdc++-v3/include/bits/version.h > b/libstdc++-v3/include/bits/version.h > index 1eaf3733bc2..9f8673395da 100644 > --- a/libstdc++-v3/include/bits/version.h > +++ b/libstdc++-v3/include/bits/version.h > @@ -1305,9 +1305,9 @@ > > #if !defined(__cpp_lib_format) > # if (__cplusplus >= 202002L) && _GLIBCXX_HOSTED > -# define __glibcxx_format 202110L > +# define __glibcxx_format 202207L > # if defined(__glibcxx_want_all) || defined(__glibcxx_want_format) > -# define __cpp_lib_format 202110L > +# define __cpp_lib_format 202207L > # endif > # endif > #endif /* !defined(__cpp_lib_format) && defined(__glibcxx_want_format) */ > diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format > index 16cee0d3c74..a4921ce391b 100644 > --- a/libstdc++-v3/include/std/format > +++ b/libstdc++-v3/include/std/format > @@ -2342,10 +2342,10 @@ namespace __format > > // _GLIBCXX_RESOLVE_LIB_DEFECTS > // P2510R3 Formatting pointers > -#if __cplusplus > 202302L || ! defined __STRICT_ANSI__ > -#define _GLIBCXX_P2518R3 1 > +#if __glibcxx_format >= 202304L || ! 
defined __STRICT_ANSI__ > +# define _GLIBCXX_P2518R3 1 > #else > -#define _GLIBCXX_P2518R3 0 > +# define _GLIBCXX_P2518R3 0 > #endif > > #if _GLIBCXX_P2518R3 > @@ -3821,6 +3821,9 @@ namespace __format > __do_vformat_to(_Out, basic_string_view<_CharT>, > const basic_format_args<_Context>&, > const locale* = nullptr); > + > + template<typename _CharT> struct __formatter_chrono; > + > } // namespace __format > /// @endcond > > @@ -3831,6 +3834,11 @@ namespace __format > * this class template explicitly. For typical uses of `std::format` the > * library will use the specializations `std::format_context` (for `char`) > * and `std::wformat_context` (for `wchar_t`). > + * > + * You are not allowed to define partial or explicit specializations of > + * this class template. > + * > + * @since C++20 > */ > template<typename _Out, typename _CharT> > class basic_format_context > @@ -3863,6 +3871,8 @@ namespace __format > const basic_format_args<_Context2>&, > const locale*); > > + friend __format::__formatter_chrono<_CharT>; > + > public: > ~basic_format_context() = default; > > diff --git a/libstdc++-v3/src/c++20/Makefile.am > b/libstdc++-v3/src/c++20/Makefile.am > index a24505e5141..d0f7859290c 100644 > --- a/libstdc++-v3/src/c++20/Makefile.am > +++ b/libstdc++-v3/src/c++20/Makefile.am > @@ -36,7 +36,7 @@ else > inst_sources = > endif > > -sources = tzdb.cc > +sources = tzdb.cc format.cc > > vpath % $(top_srcdir)/src/c++20 > > @@ -53,6 +53,12 @@ tzdb.o: tzdb.cc tzdata.zi.h > $(CXXCOMPILE) -I. -c $< > endif > > +# This needs access to std::text_encoding and to the internals of > std::locale. 
> +format.lo: format.cc > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > +format.o: format.cc > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > + > if GLIBCXX_HOSTED > libc__20convenience_la_SOURCES = $(sources) $(inst_sources) > else > diff --git a/libstdc++-v3/src/c++20/Makefile.in > b/libstdc++-v3/src/c++20/Makefile.in > index 3ec8c5ce804..d759b8dcc7c 100644 > --- a/libstdc++-v3/src/c++20/Makefile.in > +++ b/libstdc++-v3/src/c++20/Makefile.in > @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES = > CONFIG_CLEAN_VPATH_FILES = > LTLIBRARIES = $(noinst_LTLIBRARIES) > libc__20convenience_la_LIBADD = > -am__objects_1 = tzdb.lo > +am__objects_1 = tzdb.lo format.lo > @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo > @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS = \ > @GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2) > @@ -432,7 +432,7 @@ headers = > @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \ > @ENABLE_EXTERN_TEMPLATE_TRUE@ sstream-inst.cc > > -sources = tzdb.cc > +sources = tzdb.cc format.cc > @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES = > @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) > $(inst_sources) > > @@ -755,6 +755,12 @@ vpath % $(top_srcdir)/src/c++20 > @USE_STATIC_TZDATA_TRUE@tzdb.o: tzdb.cc tzdata.zi.h > @USE_STATIC_TZDATA_TRUE@ $(CXXCOMPILE) -I. -c $< > > +# This needs access to std::text_encoding and to the internals of > std::locale. > +format.lo: format.cc > + $(LTCXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > +format.o: format.cc > + $(CXXCOMPILE) -std=gnu++26 -fno-access-control -c $< > + > # Tell versions [3.59,3.63) of GNU make to not export all variables. > # Otherwise a system limit (for SysV at least) may be exceeded. 
> .NOEXPORT: > diff --git a/libstdc++-v3/src/c++20/format.cc > b/libstdc++-v3/src/c++20/format.cc > new file mode 100644 > index 00000000000..507bac79e95 > --- /dev/null > +++ b/libstdc++-v3/src/c++20/format.cc > @@ -0,0 +1,174 @@ > +// Definitions for <chrono> formatting -*- C++ -*- > + > +// Copyright The GNU Toolchain Authors. > +// > +// This file is part of the GNU ISO C++ Library. This library is free > +// software; you can redistribute it and/or modify it under the > +// terms of the GNU General Public License as published by the > +// Free Software Foundation; either version 3, or (at your option) > +// any later version. > + > +// This library is distributed in the hope that it will be useful, > +// but WITHOUT ANY WARRANTY; without even the implied warranty of > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +// GNU General Public License for more details. > + > +// Under Section 7 of GPL version 3, you are granted additional > +// permissions described in the GCC Runtime Library Exception, version > +// 3.1, as published by the Free Software Foundation. > + > +// You should have received a copy of the GNU General Public License and > +// a copy of the GCC Runtime Library Exception along with this program; > +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > +// <http://www.gnu.org/licenses/>. > + > +#define _GLIBCXX_USE_CXX11_ABI 1 > +#include "../c++26/text_encoding.cc" > + > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && defined _GLIBCXX_HAVE_ICONV > +# include <format> > +# include <chrono> > +# include <memory> // make_unique > +# include <string.h> // strlen, strcpy > +# include <iconv.h> > +# include <errno.h> > +#endif > + > +namespace std > +{ > +_GLIBCXX_BEGIN_NAMESPACE_VERSION > +namespace __format > +{ > +// Helpers for P2419R2 > +// (Clarify handling of encodings in localized formatting of chrono types) > +// Convert a string from the locale's charset to UTF-8. 
> + > +namespace > +{ > +// A non-standard locale::facet that caches the locale's std::text_encoding > +// and an iconv descriptor for converting from that encoding to UTF-8. > +struct __encoding : locale::facet > +{ > + static locale::id id; > + > + explicit > + __encoding(const text_encoding& enc, size_t refs = 0) > + : facet(refs), _M_enc(enc) > + { > +#if defined _GLIBCXX_HAVE_ICONV > + if (enc != text_encoding::UTF8 && enc != text_encoding::ASCII) > + _M_cd = ::iconv_open("UTF-8", enc.name()); > +#endif > + } > + > + ~__encoding() > + { > +#if defined _GLIBCXX_HAVE_ICONV > + if (_M_has_desc()) > + ::iconv_close(_M_cd); > +#endif > + } > + > + bool _M_has_desc() const > + { > +#if defined _GLIBCXX_HAVE_ICONV > + return _M_cd != (::iconv_t)-1; > +#else > + return false; > +#endif > + } > + > + text_encoding _M_enc; > +#if defined _GLIBCXX_HAVE_ICONV > + ::iconv_t _M_cd = (::iconv_t)-1; > +#endif > +}; > + > +locale::id __encoding::id; > +} // namespace > + > +std::locale > +__with_encoding_conversion(const std::locale& loc) > +{ > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 > + if (std::__try_use_facet<__encoding>(loc)) > + return loc; > + > + string name = loc.name(); > + if (name == "C" || name == "*") > + return loc; > + > + text_encoding locenc = __locale_encoding(name.c_str()); > + > + if (locenc == text_encoding::UTF8 || locenc == text_encoding::ASCII > + || locenc == text_encoding::unknown) > + return loc; > + > + auto impl = std::make_unique<locale::_Impl>(*loc._M_impl, 1);

While looking into implementing the LRU cache mentioned above, I realised that this impl variable is unused. That's a leftover from an earlier attempt to solve this. I'll remove it.

> + auto facetp = std::make_unique<__encoding>(locenc); > + locale loc2(loc, facetp.get()); // FIXME: PR libstdc++/113704 > + facetp.release(); > + // FIXME: Ideally we wouldn't need to reallocate this string again, > + // just don't delete[] it in the locale(locale, Facet*) constructor. > + if (const char* name = loc._M_impl->_M_names[0]) > + { > + loc2._M_impl->_M_names[0] = new char[strlen(name) + 1]; > + strcpy(loc2._M_impl->_M_names[0], name); > + } > + return loc2; > +#else > + return loc; > +#endif > +} > + > +string_view > +__locale_encoding_to_utf8(const std::locale& loc, string_view str, > + void* poutbuf) > +{ > +#if defined _GLIBCXX_USE_NL_LANGINFO_L && __CHAR_BIT__ == 8 \ > + && _GLIBCXX_HAVE_ICONV > + string& outbuf = *static_cast<string*>(poutbuf); > + // Don't need to use __try_use_facet with its dynamic_cast<__encoding*>, > + // since we know there are no types derived from __encoding. If the array > + // element is non-null, we have the facet. 
> + auto id = __encoding::id._M_id(); > + auto enc_facet = static_cast<const > __encoding*>(loc._M_impl->_M_facets[id]); > + if (!enc_facet || !enc_facet->_M_has_desc()) > + return str; > + > + size_t inbytesleft = str.size(); > + size_t written = 0; > + bool done = false; > + > + auto overwrite = [&](char* p, size_t n) { > + auto inbytes = const_cast<char*>(str.data()) + str.size() - inbytesleft; > + char* outbytes = p + written; > + size_t outbytesleft = n - written; > + size_t res = ::iconv(enc_facet->_M_cd, &inbytes, &inbytesleft, > + &outbytes, &outbytesleft); > + if (res == (size_t)-1) > + { > + if (errno != E2BIG) > + { > + done = true; > + return 0zu; > + } > + } > + else > + done = true; > + written = outbytes - p; > + return written; > + }; > + do > + outbuf.resize_and_overwrite(outbuf.capacity() + (inbytesleft * 3 / 2), > + overwrite); > + while (!done); > + if (outbuf.size()) > + str = outbuf; > +#endif // USE_NL_LANGINFO_L && CHAR_BIT == 8 && HAVE_ICONV > + > + return str; > +} > +} // namespace __format > +_GLIBCXX_END_NAMESPACE_VERSION > +} // namespace std > diff --git a/libstdc++-v3/testsuite/std/time/format_localized.cc > b/libstdc++-v3/testsuite/std/time/format_localized.cc > new file mode 100644 > index 00000000000..2e553110f03 > --- /dev/null > +++ b/libstdc++-v3/testsuite/std/time/format_localized.cc > @@ -0,0 +1,47 @@ > +// { dg-do run { target c++20 } } > +// { dg-require-namedlocale "ru_UA.koi8u" } > +// { dg-require-namedlocale "es_ES.ISO8859-1" } > +// { dg-require-namedlocale "fr_FR.ISO8859-1" } > +// { dg-require-effective-target cxx11_abi } > + > +// P2419R2 > +// Clarify handling of encodings in localized formatting of chrono types > + > +// Localized date-time strings such as "février" should be converted to UTF-8 > +// if the locale uses a different encoding. 
> + > +#include <chrono> > +#include <format> > +#include <testsuite_hooks.h> > + > +void > +test_ru() > +{ > + std::locale loc("ru_UA.koi8u"); > + auto s = std::format(loc, "День недели: {:L}", std::chrono::Monday); > + VERIFY( s == "День недели: Пн" ); > +} > + > +void > +test_es() > +{ > + std::locale loc(ISO_8859(1,es_ES)); > + auto s = std::format(loc, "Día de la semana: {:L%A %a}", > std::chrono::Wednesday); > + VERIFY( s == "Día de la semana: miércoles mié" ); > +} > + > +void > +test_fr() > +{ > + std::locale loc(ISO_8859(1,fr_FR)); > + auto s = std::format(loc, "Six mois après {0:L%b}, c'est {1:L%B}.", > + std::chrono::February, std::chrono::August); > + VERIFY( s == "Six mois après févr., c'est août." ); > +} > + > +int main() > +{ > + test_ru(); > + test_es(); > + test_fr(); > +} > diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc > b/libstdc++-v3/testsuite/util/testsuite_abi.cc > index ec7c3df9ecc..ce9cda660fa 100644 > --- a/libstdc++-v3/testsuite/util/testsuite_abi.cc > +++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc > @@ -215,6 +215,7 @@ check_version(symbol& test, bool added) > known_versions.push_back("GLIBCXX_3.4.31"); > known_versions.push_back("GLIBCXX_3.4.32"); > known_versions.push_back("GLIBCXX_3.4.33"); > + known_versions.push_back("GLIBCXX_3.4.34"); > known_versions.push_back("GLIBCXX_LDBL_3.4.31"); > known_versions.push_back("GLIBCXX_IEEE128_3.4.29"); > known_versions.push_back("GLIBCXX_IEEE128_3.4.30"); > -- > 2.45.2 >
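
For reference, here is a rough sketch of the kind of least recently used cache mentioned above, keyed by locale name. It is not part of the patch and only illustrates the idea. The names conv_state and encoding_cache are hypothetical; conv_state stands in for the per-locale conversion state (the __encoding facet with its text_encoding and iconv_t in the patch). Thread safety is deliberately ignored here, so a real version in the library would presumably need a mutex or similar.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the cached per-locale conversion state.
struct conv_state
{
  std::string encoding_name;
};

// Hypothetical fixed-capacity LRU cache keyed by locale name.
// Not thread-safe; callers would need external synchronization.
class encoding_cache
{
public:
  explicit encoding_cache(std::size_t capacity) : cap_(capacity)
  { assert(capacity > 0); }

  // Return the cached state for `name`, calling `make(name)` on a miss.
  // `make` must return std::shared_ptr<conv_state>.
  template<typename Factory>
    std::shared_ptr<conv_state>
    get(const std::string& name, Factory make)
    {
      if (auto it = index_.find(name); it != index_.end())
        {
          // Hit: move the entry to the front (most recently used).
          order_.splice(order_.begin(), order_, it->second);
          return it->second->second;
        }
      // Miss: evict the least recently used entry if the cache is full.
      if (order_.size() == cap_)
        {
          index_.erase(order_.back().first);
          order_.pop_back();
        }
      order_.emplace_front(name, make(name));
      index_[name] = order_.begin();
      return order_.front().second;
    }

  bool contains(const std::string& name) const
  { return index_.find(name) != index_.end(); }

  std::size_t size() const { return order_.size(); }

private:
  using entry = std::pair<std::string, std::shared_ptr<conv_state>>;
  std::size_t cap_;
  std::list<entry> order_; // front = most recently used
  std::unordered_map<std::string, std::list<entry>::iterator> index_;
};
```

With a cache like this, repeated formatting with the same locale name would hit the cache and reuse the existing state, and only a miss would pay for checking the encoding and opening a new descriptor. Keying by name is an assumption of the sketch; unnamed locales ("*") would have to bypass the cache.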