https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66456

            Bug ID: 66456
           Summary: regex memory corruption on large input strings
           Product: gcc
           Version: 5.1.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: morandidodo at gmail dot com
  Target Milestone: ---

I noticed memory corruption in std::regex when a long input string contains
many occurrences of a pattern. The following simple test case reproduces it.

regex_fault.cpp:

#include <regex>
#include <string>
#include <iostream>

int main()
{
    static const std::regex reFloats(R"(^(\s*-?\d+\.\d+)+\s*$)");

    std::string input;
    std::getline(std::cin, input);
    if(std::regex_match(input, reFloats))
        std::cout << "List of floats matched." << std::endl;

    return 0;
}

$ g++ -Wall -Wextra -std=c++11 regex_fault.cpp -o regex_fault
$ echo "0.000 1.000 2.000" | ./regex_fault 
List of floats matched.

$ OUT=""; for i in `seq 10000`; do OUT="$OUT 0.0000"; done; echo $OUT | ./regex_fault
Segmentation fault

The problem also occurs with the non-capturing version of the regex, and
appending the "-fno-strict-aliasing -fwrapv" flags to the compiler options
makes no difference.
I do not know whether this is a duplicate of bug 61582.
I am using the latest version of GCC supplied by Arch Linux on an x86_64
machine.
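
A possible workaround, sketched under the assumption (not confirmed in this
report) that the crash is caused by the matcher's recursion depth growing with
the length of the match: drop the outer "(...)+" repetition and match one float
at a time, so each regex call only ever consumes a short token. The helper name
is_float_list is hypothetical. Note that this greedy, token-by-token check does
not backtrack across tokens, so it is stricter than the original pattern on odd
inputs such as "1.23.4", which the original regex can split as "1.2" and "3.4".

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper, not part of the original test case: returns true if
// `input` consists of one or more floats, each optionally preceded by
// whitespace, followed only by optional trailing whitespace.
bool is_float_list(const std::string& input)
{
    // A single token only; the outer "(...)+" repetition is replaced by a loop.
    static const std::regex reFloat(R"(\s*-?\d+\.\d+)");

    auto it = input.cbegin();
    const auto end = input.cend();
    std::smatch m;
    std::size_t count = 0;

    // match_continuous forces each match to start exactly at `it`,
    // so no characters can be skipped between tokens.
    while (std::regex_search(it, end, m, reFloat,
                             std::regex_constants::match_continuous)) {
        it = m[0].second;  // continue right after the matched token
        ++count;
    }

    // Whatever is left must be whitespace only; checked without regex.
    const bool restIsSpace = std::all_of(it, end, [](unsigned char c) {
        return std::isspace(c) != 0;
    });
    return count > 0 && restIsSpace;
}
```

In the test case above, the call std::regex_match(input, reFloats) would be
replaced by is_float_list(input). Whether this avoids the crash on an affected
libstdc++ version has not been verified here.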
