On Mon, Apr 7, 2025 at 10:34 AM Bruno Haible via Bug reports for GNU
grep <bug-grep@gnu.org> wrote:
> On
>   - GNU/Hurd x86_64 from 2024,
>   - GNU/Hurd i386 from 2023,
> I see a test hang: hash-collision-perf.
>
> On GNU/Hurd x86_64:
>
> When I interrupted the build, the file 'in' has 5120000 lines, and
> find attached the log file of this test. As you can see, the value of
> small_ms stays 0 even for larger files.
>
> By running
>   $ date; LC_ALL=C ../../src/grep --file=in empty; date
> I can see that the execution times grow like this:
>    640000   0.3 sec
>   1280000   0.9 sec
>   2560000   1.5 sec
>   5120000   > 60 sec
>
> On GNU/Hurd i386, it's similar. Here it's when the file 'in' has
> 40960000 lines, that the grep execution hangs. Find attached the
> last stack trace I was able to obtain before it hung.
>
> Regardless how much RAM I give to the machine, there will always
> be a point where "grep --file=in empty" will take more RAM than
> available, and (since Hurd does not have an OOM killer) the machine
> then hangs.
>
> IMO, the correct behaviour would be that 'grep' exits via xalloc_die(),
> not that it hangs.
>
> Whereas on GNU/Linux (in a machine that has the same amount of RAM as
> the GNU/Hurd machine):
>
> $ : > empty
> $ seq 640000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 0.44s
> $ seq 1280000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 0.99s
> $ seq 2560000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 2.22s
> $ seq 5120000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 4.84s
> $ seq 10240000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 24.19s
> $ seq 20480000 > in; LC_ALL=C time ./src/grep --file=in empty
> Killed
> real 24.40s
>
> Here it was the OOM killer that saved the machine from hanging.
>
> So, IMO, there are two bugs:
>
> 1) When the allocation of the kwset takes more memory than available,
>    'grep' should exit via xalloc_die(), instead of waiting to be killed
>    by the OOM killer.
>
> 2) In the 'hash-collision-perf' unit test: The use of a perl primitive
>    for measuring the execution time of a child process, that is not
>    properly ported to GNU/Hurd.

Thanks for reporting that! Adding a timeout should resolve this. Expect to push tomorrow:
gr-Hurd-hang.diff
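
For illustration only, here is a minimal sketch of how a timeout can bound a command so that a runaway run is cut off instead of hanging the machine. It is not the contents of the attached diff. It assumes coreutils 'timeout' is available; the helper name run_bounded and the limits shown are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the grep test suite): run a command
# under a time limit using coreutils 'timeout', and report the outcome.
run_bounded () {
  limit=$1; shift
  timeout "$limit" "$@" >/dev/null 2>&1
  status=$?
  case $status in
    0)   echo ok ;;
    124) echo timed-out ;;      # 'timeout' exits with 124 when the limit expires
    *)   echo "exit $status" ;; # the command's own non-zero exit status
  esac
}
```

In a test like the one discussed above, this could wrap a call such as "run_bounded 10 env LC_ALL=C grep --file=in empty". Since grep exits with status 1 when nothing matches, "exit 1" would be the expected outcome there, and only "timed-out" would indicate the hang.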