https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110
Bug ID: 84110
Summary: Null character in regex
Product: gcc
Version: 8.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: abigail.buccaneer at gmail dot com
Target Milestone: ---

The following code, when compiled with libstdc++:

#include <regex>

int main() {
    auto r = std::regex{"\0", std::size_t{1}};
}

...results in std::regex_error being thrown.

My reading of the ECMAScript regex spec says that this should be allowed, and that a null byte should match a literal null byte:

PatternCharacter ::
    SourceCharacter but not one of ^ $ \ . * + ? ( ) [ ] { } |

SourceCharacter ::
    any Unicode code unit

(Elsewhere in the ECMAScript spec, it explicitly specifies that an unrelated grammar production is 'SourceCharacter but not one of " or \ or U+0000 through U+001F', so it makes sense to assume that SourceCharacter here very intentionally includes null.)

Clang/libc++ seems to agree with this reading, and successfully compiles and runs the following:

#include <cassert>
#include <regex>

int main() {
    auto null = std::string{"\0", std::size_t{1}};
    std::smatch match_results;
    assert(std::regex_match(null, match_results, std::regex{null}));
    assert(match_results.position() == 0
           && match_results.length() == 1
           && match_results[0] == null);
}
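
To compare standard libraries without the program aborting on an uncaught exception, the two test cases in this report could be folded into one probe along these lines. This is only a sketch: make_null_string, NullRegexResult and probe_null_regex are hypothetical names introduced here, not part of the report or of any library, and which result a given toolchain produces is deliberately left open.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not from the report): a one-character string holding
// a literal null byte. The explicit length is essential; the plain
// const char* constructor stops at the first '\0' and yields an empty string.
inline std::string make_null_string() {
    return std::string{"\0", std::size_t{1}};
}

// Hypothetical result type describing what the library under test did.
enum class NullRegexResult { Matched, NotMatched, RegexError };

// Builds a regex from the single null byte and tries to match that byte.
// Construction is inside the try block because the report says libstdc++
// throws std::regex_error from the std::regex constructor.
inline NullRegexResult probe_null_regex() {
    const std::string null = make_null_string();
    try {
        const std::regex re{null};
        return std::regex_match(null, re) ? NullRegexResult::Matched
                                          : NullRegexResult::NotMatched;
    } catch (const std::regex_error&) {
        return NullRegexResult::RegexError;
    }
}
```

Calling probe_null_regex() from main and printing the result gives a one-line answer per toolchain. Under the reading of the ECMAScript grammar argued in this report, Matched would be the conforming outcome, and RegexError is the behavior reported here for libstdc++.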