On Mon, Apr 7, 2025 at 10:34 AM Bruno Haible via Bug reports for GNU
grep <bug-grep@gnu.org> wrote:
> On
>   - GNU/Hurd x86_64 from 2024,
>   - GNU/Hurd i386 from 2023,
> I see a test hang: hash-collision-perf.
>
> On GNU/Hurd x86_64:
>
> When I interrupted the build, the file 'in' had 5120000 lines; find
> attached the log file of this test. As you can see, the value of
> small_ms stays 0 even for larger files.
>
> By running
>   $ date; LC_ALL=C ../../src/grep --file=in empty; date
> I can see that the execution times grow like this:
>   640000  0.3 sec
>  1280000  0.9 sec
>  2560000  1.5 sec
>  5120000  > 60 sec
>
> On GNU/Hurd i386, it's similar. Here the grep execution hangs once
> the file 'in' has 40960000 lines. Find attached the
> last stack trace I was able to obtain before it hung.
>
> Regardless of how much RAM I give to the machine, there will always
> be a point where "grep --file=in empty" will take more RAM than
> available, and (since Hurd does not have an OOM killer) the machine
> then hangs.
>
> IMO, the correct behaviour would be that 'grep' exits via xalloc_die(),
> not that it hangs.
>
> Whereas on GNU/Linux (in a machine that has the same amount of RAM as
> the GNU/Hurd machine):
>
>   $ : > empty
>   $ seq 640000 > in; LC_ALL=C time ./src/grep --file=in empty
>   real 0.44s
>   $ seq 1280000 > in; LC_ALL=C time ./src/grep --file=in empty
>   real 0.99s
>   $ seq 2560000 > in; LC_ALL=C time ./src/grep --file=in empty
>   real 2.22s
>   $ seq 5120000 > in; LC_ALL=C time ./src/grep --file=in empty
>   real 4.84s
>   $ seq 10240000 > in; LC_ALL=C time ./src/grep --file=in empty
>   real 24.19s
>   $ seq 20480000 > in; LC_ALL=C time ./src/grep --file=in empty
>   Killed
>   real 24.40s
>
> Here it was the OOM killer that saved the machine from hanging.
>
> So, IMO, there are two bugs:
>
>   1) When the allocation of the kwset takes more memory than available,
>      'grep' should exit via xalloc_die(), instead of waiting to be killed
>      by the OOM killer.
>
>   2) In the 'hash-collision-perf' unit test: it measures the execution
>      time of a child process with a perl primitive that is not
>      properly ported to GNU/Hurd.

Thanks for reporting that!
Adding a timeout should resolve this. I expect to push this tomorrow:

Attachment: gr-Hurd-hang.diff
Description: Binary data
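
For readers without the attachment, here is a minimal sketch of the general
idea of bounding a run with a timeout. It is not taken from
gr-Hurd-hang.diff, and the actual patch may work differently. The helper
name run_with_limit is hypothetical. The sketch assumes the 'timeout'
program from GNU coreutils is installed; it relies on 'timeout' exiting
with status 124 when the limit is hit.

```shell
# Hypothetical helper, not part of grep's test suite.
# Usage: run_with_limit SECONDS COMMAND [ARG...]
# Prints "timed-out" if COMMAND was stopped by 'timeout' (exit status 124),
# otherwise "finished". The command's own output is discarded.
run_with_limit() {
  limit=$1
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
  rc=$?
  if test "$rc" -eq 124; then
    echo timed-out
  else
    echo finished
  fi
}

# Example use with the commands from the report (left as comments so that
# sourcing this file creates no files):
#   : > empty
#   seq 640000 > in
#   run_with_limit 10 env LC_ALL=C ./src/grep --file=in empty
```

With such a bound, a test loop that keeps doubling the size of 'in' can stop
as soon as one run reports "timed-out" instead of continuing until memory is
exhausted.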
