Matt,

Here is the output of ls -l /usr/local/bin in the codespace:

root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024# ls -l /usr/local/bin
total 3274048
lrwxrwxrwx 1 root   root           36 Feb 26 00:48 actionlint -> /usr/local/lib/actionlint/actionlint
-rwxrwxr-x 1 ubuntu ubuntu   21869136 Feb 12 23:27 code
-rwxr-xr-x 1 root   root      2953216 Feb 26 00:48 compose-switch
lrwxrwxrwx 1 root   root           32 Feb 26 00:48 docker-compose -> /etc/alternatives/docker-compose
-rwxr-xr-x 1 root   root     73691062 Feb 26 00:48 docker-compose-v1
lrwxrwxrwx 1 root   root           23 Jul 25  2024 gem5 -> /usr/local/bin/gem5-chi
-rwxr-xr-x 1 root   root   1185554096 Jul 25  2024 gem5-chi
-rwxr-xr-x 1 root   root   1184136040 Jul 25  2024 gem5-mesi
-rwxr-xr-x 1 root   root    884400248 Jul 25  2024 gem5-vega
root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024#

Since the listing showed the gem5-vega binary, I assumed it would run. It did 
run once we compiled gem5 explicitly.

As it turns out, the stats.txt file documents two sets of simulation 
statistics. One completes after the CPU finishes, and the second completes 
after the CPU checks the result.

src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
GPU Kernel Completed dump and reset
src/sim/simulate.cc:199: info: Entering event queue @ 105637104000.  Starting simulation...
info: check result
PASSED!
breaking loop due to: exiting with last active thread context.
Ticks: 143303043500
Exiting because  exiting with last active thread context
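
For anyone who wants to look at the two sets separately, here is a minimal 
Python sketch that splits a stats.txt into its dumps. It assumes each dump is 
delimited by "Begin Simulation Statistics" and "End Simulation Statistics" 
lines and that stat lines have the form "name value # description". The 
helper name split_dumps is hypothetical (it is not part of gem5), and the 
sample values are made up for illustration.

```python
import re

# Assumed delimiter lines around each dump in stats.txt.
BEGIN = re.compile(r"^-+\s*Begin Simulation Statistics\s*-+\s*$")
END = re.compile(r"^-+\s*End Simulation Statistics\s*-+\s*$")


def split_dumps(text):
    """Return one {stat_name: value_string} dict per dump found in text."""
    dumps, current = [], None
    for line in text.splitlines():
        if BEGIN.match(line):
            current = {}
        elif END.match(line):
            if current is not None:
                dumps.append(current)
            current = None
        elif current is not None:
            # Drop the trailing "# description" and keep name + first value.
            fields = line.split("#", 1)[0].split()
            if len(fields) >= 2:
                current[fields[0]] = fields[1]
    return dumps


# Made-up sample standing in for a real stats.txt with two dumps.
SAMPLE = """
---------- Begin Simulation Statistics ----------
simTicks                                 1000   # made-up value
simInsts                                  400   # made-up value

---------- End Simulation Statistics   ----------

---------- Begin Simulation Statistics ----------
simTicks                                  250   # made-up value
simInsts                                  100   # made-up value

---------- End Simulation Statistics   ----------
"""

dumps = split_dumps(SAMPLE)
for index, dump in enumerate(dumps):
    print(index, dump["simTicks"], dump["simInsts"])
```

On a real run, the text would come from the stats.txt in the output 
directory instead of SAMPLE.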

Nick
From: Matt Sinclair <mattdsinclair.w...@gmail.com>
Sent: Sunday, March 2, 2025 4:17 PM
To: The gem5 Users mailing list <gem5-users@gem5.org>
Cc: Beser, Nicholas D. <nick.be...@jhuapl.edu>; Jason Lowe-Power 
<ja...@lowepower.com>
Subject: [EXT] Re: [gem5-users] Success with GPU


Hi Nick,

I'm not sure why you believe gem5-vega should be in /usr/local/bin. It's been a 
few months since I last looked at this codespace, but the instructions here do 
not seem to assume /usr/local/bin:
https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator
Instead, they set up the docker volume for other folders.  Have you tried this 
command?  Of course, it's possible I'm wrong about /usr/local/bin, but Bobby or 
Jason would have to answer that.

Setting that aside, it looks like the instructions you followed basically 
bypass the prebuilt gem5-vega and have you build it yourself. This is 
ultimately fine, and it is what the students in my research group do, but of 
course it takes a bit longer.

What is the issue with the stats file exactly?  I guess you wrote your own CPU 
version of square and that version is not behaving as expected?  I am not an 
expert at the CPU part of gem5, but I'd need more information about how you 
disabled the CPU part to understand or try to look into this.  Likewise, what 
stats are you looking at for the CPU?

Thanks,
Matt

On Sun, Mar 2, 2025 at 2:54 PM Beser, Nicholas D. via gem5-users 
<gem5-users@gem5.org> wrote:
Based on the discussion, it seems that docker can’t find the gem5-vega that is 
in /usr/local/bin. I noticed that the instructions also had us build the 
VEGA_X86/gem5.opt binary with the following command (# is the number of cores 
on your X86 system):
 docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0 scons build/VEGA_X86/gem5.opt -j#
I built VEGA_X86 in the codespace. The following command was then able to run 
the GPU square binary:
 docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0 gem5/build/VEGA_X86/gem5.opt gem5/configs/example/apu_se.py -n 3 -c gem5-resources/src/gpu/square/bin/square
The program executed correctly and created a stats.txt file. I have sent these 
instructions to my class so they can proceed with the experiments using the 
GPU.
I do have a question about the results in the stats.txt file. We noticed that 
the program computed the square operation on the GPU and then checked the 
result using CPU-only code. When one of my students disabled the CPU-only 
code, he did not see the drop in CPU instructions that would have 
corresponded to that loop. I have them looking at the stats.txt file for 
indications of what resources the GPU used to perform the operation.
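
One way to make that comparison concrete, sketched in Python under stated 
assumptions: sum each integer-valued stat over every dump in a run's 
stats.txt, then report the stats whose totals differ between the original 
run and the run with the CPU-only check disabled. Summing is only meaningful 
if the stats are reset after each dump and only for counter-style stats, not 
for rates or final values. The stat names (example.cpu.insts, 
example.gpu.insts) and all numbers are invented placeholders, and the helper 
names are hypothetical.

```python
def totals(stats_text):
    """Sum every integer-valued stat over all dumps in one stats.txt text."""
    sums = {}
    for line in stats_text.splitlines():
        fields = line.split("#", 1)[0].split()
        # Keep only "name integer" lines; delimiter and blank lines are skipped.
        if len(fields) >= 2 and fields[1].isdigit():
            sums[fields[0]] = sums.get(fields[0], 0) + int(fields[1])
    return sums


def changed(base_text, variant_text):
    """Return {stat: variant_total - base_total} for stats that differ."""
    base, variant = totals(base_text), totals(variant_text)
    return {
        name: variant.get(name, 0) - value
        for name, value in base.items()
        if variant.get(name, 0) != value
    }


# Invented stand-ins for the two runs' stats.txt contents.
BASE = """
---------- Begin Simulation Statistics ----------
example.cpu.insts      400   # hypothetical stat
example.gpu.insts       64   # hypothetical stat
---------- End Simulation Statistics   ----------
---------- Begin Simulation Statistics ----------
example.cpu.insts      100   # hypothetical stat
---------- End Simulation Statistics   ----------
"""

VARIANT = """
---------- Begin Simulation Statistics ----------
example.cpu.insts      400   # hypothetical stat
example.gpu.insts       64   # hypothetical stat
---------- End Simulation Statistics   ----------
---------- Begin Simulation Statistics ----------
example.cpu.insts       60   # hypothetical stat
---------- End Simulation Statistics   ----------
"""

delta = changed(BASE, VARIANT)
print(delta)
```

Looking at totals across all dumps, rather than at one dump, avoids missing 
a change that only shows up in the later set of statistics.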

Nick
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
