Hi Anoop,

I'm glad that increasing -n helped.  It's hard to say exactly what the
problem is without digging in further, but the ROCm stack often launches
additional processes to do a variety of things (e.g., check which version
of LLVM is being used).  In gem5, each of these requires a separate CPU
thread context, which increasing -n provides in SE mode.  So if I had to
guess, I would say that this is what is happening.
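
On how the connection is made: the what() text in the log is the C
library's standard message for EAGAIN, which is the error that thread
creation (pthread_create, and std::thread on top of it) reports when no
more threads can be started.  Below is a rough sketch of that reasoning,
not anything taken from gem5 or ROCm.  The helper names and the counting
rule are hypothetical and only illustrate why a small -n can run out of
thread contexts; the message comparison assumes a Linux C library.

```python
import errno
import os


def is_eagain_message(what: str) -> bool:
    """Return True if `what` is the C library's text for EAGAIN."""
    # os.strerror() asks the C library for the message tied to an errno
    # value; the comparison assumes a Linux libc.
    return what.strip() == os.strerror(errno.EAGAIN)


def min_thread_contexts(worker_threads: int, helper_processes: int) -> int:
    """Hypothetical rule of thumb, not a gem5 API.

    One context for the application's main thread, one per CPU worker
    thread it spawns, and one per helper process the runtime launches.
    """
    return 1 + worker_threads + helper_processes


if __name__ == "__main__":
    # The what() string from the stderr log in this thread.
    print(is_eagain_message("Resource temporarily unavailable"))
    # Example numbers only; the real counts depend on the application.
    print(min_thread_contexts(worker_threads=4, helper_processes=2))
```

If the number this gives is larger than the value passed to -n, running
out of thread contexts is a plausible cause of the std::system_error.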

If you added gdb locally to your Docker image, and you built the image
properly, then I would expect gdb to work with gem5.

Thanks,
Matt

On Wed, Aug 16, 2023 at 11:41 PM Anoop Mysore <mysan...@gmail.com> wrote:

> Thank you, Matt. Having 10 CPUs (up from the previous 3) in the simulated
> system seems to make it work! (At least, I no longer see that error at
> that point.) Is "resource temporarily unavailable" commonly due to CPU
> count? Curious to know how you made that connection.
>
> Re gdb: I am indeed using a local docker build
> (gem5/util/dockerfiles/gcn-gpu) with an added gdb installation -- is that
> what you meant?
>
> Will send in a PR to the repo as soon as I'm done :)
>
> On Wed, Aug 16, 2023, 5:03 PM Matt Sinclair <mattdsinclair.w...@gmail.com>
> wrote:
>
>> Hi Anoop,
>>
>> A few things here:
>>
>> - Regarding the original failure (at least the !FS part), this
>> normally happens either because the GPU target ISA (e.g., gfx900) you
>> used in your Makefile is not supported or because you didn't
>> properly specify which GPU ISA you are using when running the program.  So,
>> what is your command line for running this application and what ISA are you
>> specifying in your Makefile?
>> - If the "what()" is the real source of the error, then I think this
>> could be related to the number of CPU thread contexts you are running with
>> gem5.  What did you set "-n" to?
>> - Regarding gdb, @Matt P: did you remove gdb from what is installed in
>> the Docker a while back?  If so, I think Anoop would need to add it back
>> and create a local docker or something like that.
>> - Setting aside the above, it would be wonderful if you contribute the
>> CHAI benchmarks to gem5-resources once you get them working!  Please let us
>> know if we can do anything to help with that.
>>
>> Thanks,
>> Matt
>>
>> On Wed, Aug 16, 2023 at 9:51 AM Anoop Mysore via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>>> Curiously, running the gem5.debug executable with gdb within docker results
>>> in:
>>> Reading symbols from gem5/build/GCN3_X86/gem5.debug...
>>> (gdb) quit
>>> (the quit wasn't a command I provided, it just quits automatically). Is
>>> gdb working with gem5 GCN3 in Docker?
>>>
>>> I ran gem5.opt with ExecAll and SyscallAll debug flags, the debug tail
>>> and the simerr logs are attached.
>>> I don't see anything peculiar other than a tgkill syscall that sends a
>>> SIGABRT to a thread, which then halts within a few instructions.
>>>
>>> On Tue, Aug 15, 2023 at 9:00 PM Anoop Mysore <mysan...@gmail.com> wrote:
>>>
>>>> I am trying to port the CHAI benchmarks
>>>> <https://github.com/chai-benchmarks/chai> similarly to
>>>> gem5-resources/src/gpu/pannotia
>>>> <https://github.com/gem5/gem5-resources/tree/stable/src/gpu/pannotia>.
>>>> I was able to HIPify (through the perl script + some manual changes) all
>>>> the code files, and ran the BFS program. I see the following error message
>>>> at the point of launching the CPU threads here
>>>> <https://github.com/mysoreanoop/chai/blob/678c18fd551fbf12f4abbb05ab7164f1b588be68/HIP-U-gem5/BFS/main.cpp#L273>
>>>> (fork of HIPified CHAI). I do not see any of the prints from the CPU
>>>> threads, which leads me to believe the error has to do with the threads
>>>> not being launched or a related error.
>>>>
>>>> (This looks related; incorporated the suggestion of linking against
>>>> -pthread: https://stackoverflow.com/a/6485728)
>>>>
>>>> The stderr log is below; any help is appreciated.
>>>> _________
>>>> ....
>>>> AM: Launching CPU
>>>> terminate called after throwing an instance of 'std::system_error'
>>>> what():  Resource temporarily unavailable
>>>> build/GCN3_X86/sim/faults.cc:60: panic: panic condition !FullSystem
>>>> occurred: fault (General-Protection) detected @ PC
>>>> (0x7ffff6afa941=>0x7ffff6afa942).(0=>1)
>>>> Memory Usage: 19704072 KBytes
>>>>
>>>> Program aborted at tick 441590522500
>>>> --- BEGIN LIBC BACKTRACE ---
>>>> gem5/build/GCN3_X86/gem5.opt(+0x550200)[0x55a709b31200]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x57d46e)[0x55a709b5e46e]
>>>> /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f18881a0420]
>>>> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f188734800b]
>>>> /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f1887327859]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x4be295)[0x55a709a9f295]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x5f6169)[0x55a709bd7169]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x9fd9ed)[0x55a709fde9ed]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x15b1d10)[0x55a70ab92d10]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x15b2fd5)[0x55a70ab93fd5]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x15b5620)[0x55a70ab96620]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x15b6348)[0x55a70ab97348]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x15c2954)[0x55a70aba3954]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x56a082)[0x55a709b4b082]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x59e2c4)[0x55a709b7f2c4]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x59e8a3)[0x55a709b7f8a3]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x4ed462)[0x55a709ace462]
>>>> gem5/build/GCN3_X86/gem5.opt(+0x4af427)[0x55a709a90427]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x2a8738)[0x7f1888459738]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x8dd8)[0x7f188822ef48]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7f188837be3b]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7f1888459114]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7f1888225d6d]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7f188822def6]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7f188837be3b]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCodeEx+0x42)[0x7f188837c1c2]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCode+0x1f)[0x7f188837c5af]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x1cfbf1)[0x7f1888380bf1]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x25f537)[0x7f1888410537]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7f1888225d6d]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x12fd)[0x7f188822746d]
>>>> /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x8006b)[0x7f188823106b]
>>>> --- END LIBC BACKTRACE ---
>>>> Failed to execute default signal handler!
>>>> _________
>>>>
>>> _______________________________________________
>>> gem5-users mailing list -- gem5-users@gem5.org
>>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>>
>>