https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98677
Bug ID: 98677
Summary: std::regex constructor triggers valgrind under clang++ with undefined sanitizer; possible use-after-move
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: egor_suvorov at mail dot ru
Target Milestone: ---

Consider the following code:

    #include <regex>
    int main() {
        std::regex regex("x{2,}");
    }

If I compile and run it on Ubuntu 20.04 with

    clang++-10 -fsanitize=undefined -O2 -g a.cpp && valgrind ./a.out

I get the following error:

==2367== Memcheck, a memory error detector
==2367== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2367== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==2367== Command: ./a.out
==2367==
==2367== Conditional jump or move depends on uninitialised value(s)
==2367==    at 0x45AC3C: std::__detail::_StateSeq<std::__cxx11::regex_traits<char> >::_M_clone() (regex_automaton.tcc:208)
==2367==    by 0x4341EA: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_quantifier() (regex_compiler.tcc:253)
==2367==    by 0x432F67: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_term() (regex_compiler.tcc:143)
==2367==    by 0x432B9A: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() (regex_compiler.tcc:123)
==2367==    by 0x427E00: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() (regex_compiler.tcc:99)
==2367==    by 0x42747E: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) (regex_compiler.tcc:84)
==2367==    by 0x427149: __compile_nfa<std::__cxx11::regex_traits<char>, const char *> (regex_compiler.h:183)
==2367==    by 0x427149: std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex<char const*>(char const*, char const*, std::locale, std::regex_constants::syntax_option_type) (regex.h:763)
==2367==    by 0x427025: basic_regex<const char *> (regex.h:507)
==2367==    by 0x427025: basic_regex (regex.h:440)
==2367==    by 0x427025: main (a.cpp:3)
==2367==
==2367== Conditional jump or move depends on uninitialised value(s)
==2367==    at 0x45AC3C: std::__detail::_StateSeq<std::__cxx11::regex_traits<char> >::_M_clone() (regex_automaton.tcc:208)
==2367==    by 0x434218: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_quantifier() (regex_compiler.tcc:257)
==2367==    by 0x432F67: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_term() (regex_compiler.tcc:143)
==2367==    by 0x432B9A: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative() (regex_compiler.tcc:123)
==2367==    by 0x427E00: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction() (regex_compiler.tcc:99)
==2367==    by 0x42747E: std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) (regex_compiler.tcc:84)
==2367==    by 0x427149: __compile_nfa<std::__cxx11::regex_traits<char>, const char *> (regex_compiler.h:183)
==2367==    by 0x427149: std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex<char const*>(char const*, char const*, std::locale, std::regex_constants::syntax_option_type) (regex.h:763)
==2367==    by 0x427025: basic_regex<const char *> (regex.h:507)
==2367==    by 0x427025: basic_regex (regex.h:440)
==2367==    by 0x427025: main (a.cpp:3)
==2367==
==2367==
==2367== HEAP SUMMARY:
==2367==     in use at exit: 0 bytes in 0 blocks
==2367==   total heap usage: 20 allocs, 20 frees, 76,776 bytes allocated
==2367==
==2367== All heap blocks were freed -- no leaks are possible
==2367==
==2367== Use --track-origins=yes to see where uninitialised values come from
==2367== For lists of detected and suppressed errors, rerun with: -s
==2367== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)

Any one of the following actions removes the error: replacing clang++ with g++, disabling -fsanitize=undefined, disabling -O2, switching to -stdlib=libc++.

Versions are:

    clang version 10.0.0-4ubuntu1
    Target: x86_64-pc-linux-gnu
    Thread model: posix
    InstalledDir: /usr/bin

    valgrind-3.15.0

    libstdc++-10-dev/focal-updates,focal-security,now 10.2.0-5ubuntu1~20.04 amd64 [installed,automatic]

A friend of mine suggested that it's probably caused by a use-after-move of `__dup` in regex_automaton.tcc:206 (commit e45c41988bfd655b1df7cff8fcf111dc6fb732e3 in the GitHub mirror) and vaguely suggested that maybe clang++ has started to implement some kind of destructive moves:

    auto __id = _M_nfa._M_insert_state(std::move(__dup));
    __m[__u] = __id;
    if (__dup._M_has_alt())