Thank you, I will take a look at these.

Nick

From: Matt Sinclair <mattdsinclair.w...@gmail.com>
Sent: Sunday, March 2, 2025 5:31 PM
To: Beser, Nicholas D. <nick.be...@jhuapl.edu>
Cc: The gem5 Users mailing list <gem5-users@gem5.org>; Jason Lowe-Power 
<ja...@lowepower.com>
Subject: Re: [EXT] Re: [gem5-users] Success with GPU

APL external email warning: Verify sender mattdsinclair.w...@gmail.com before 
clicking links or attachments



Whoops, thanks.  I meant this one: 
https://github.com/gem5bootcamp/2024/tree/main/materials/04-GPU-model.  Note 
that in these commands we actually specified the full path to gem5-vega (we 
also didn't need docker in this case because of how Jason set things up).  
I do not know whether that same setup carried over to yours, though.  If not, 
then the comments from my previous email about the volume would be my 
suggestion on how to proceed.

Matt

On Sun, Mar 2, 2025 at 4:20 PM Beser, Nicholas D. <nick.be...@jhuapl.edu> 
wrote:
Matt,

No, I did not run those commands. I had not seen them before.  They are not 
set up for the repository I am working from (they look like they are set up 
for bootcamp 2022).

Nick

From: Matt Sinclair <mattdsinclair.w...@gmail.com>
Sent: Sunday, March 2, 2025 4:49 PM
To: Beser, Nicholas D. <nick.be...@jhuapl.edu>
Cc: The gem5 Users mailing list <gem5-users@gem5.org>; Jason Lowe-Power 
<ja...@lowepower.com>
Subject: Re: [EXT] Re: [gem5-users] Success with GPU

Hi Nick,

In regards to gem5-vega-se, the ls you sent before shows it does not exist in 
/usr/local/bin, so it's not surprising that didn't work.

With regard to docker more broadly, though, setting up volumes is tricky.  The 
mental model I always have is: with the volume I specify, is everything I need 
accessible or not?  When you "only" specify /usr/local/bin, everything not in 
/usr/local/bin will not be part of the docker's volume when it runs -- which is 
why Python (for example) cannot be found in that case.  So, you'd need to set 
up one or more volumes containing all of the files you need in order to avoid 
the Python error.
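
As a rough illustration of that mental model (a sketch only: volume_flags is a 
hypothetical helper name, not part of docker or gem5), the function below turns 
a list of host directories into "-v DIR:DIR" flags, so that each directory is 
visible at the same path inside the container:

```shell
# Hypothetical helper (not part of docker or gem5): print one "-v DIR:DIR"
# flag per directory given, so each host directory is mounted at the same
# path inside the container.
volume_flags() {
    for dir in "$@"; do
        printf '%s %s:%s ' -v "$dir" "$dir"
    done
}
```

It could be used as: docker run $(volume_flags "$PWD" /usr/local/bin) -w "$PWD" 
IMAGE COMMAND. The unquoted substitution splits on whitespace, so this only 
works for paths without spaces. Anything the command needs outside the listed 
directories still has to come from the image itself or from a further mount.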

This is why I had asked if you ran the same commands as specified here: 
https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator,
 because I believe they were set up so all of this is handled for you in the 
bootcamp.

Matt

On Sun, Mar 2, 2025 at 3:37 PM Beser, Nicholas D. <nick.be...@jhuapl.edu> 
wrote:
Matt,

I really appreciate your help with this section.

I had tried explicitly specifying /usr/local/bin, based on a README doc that 
was listed when the codespace started:

docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD \
    ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega-se \
    gem5/configs/example/apu_se.py -n 3 -c square

docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD \
    ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega \
    gem5/configs/example/apu_se.py -n 3 -c square

gem5-vega-se did not exist, and the second run produced the following error 
message:

docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD \
    ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega \
    gem5/configs/example/apu_se.py -n 3 -c square
gem5-vega: error while loading shared libraries: libpython3.12.so.1.0: cannot open shared object file: No such file or directory
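
One way to see which shared libraries the loader cannot resolve for a binary is 
ldd. The sketch below wraps it in a small helper (missing_libs is a made-up 
name for illustration) that prints only the unresolved entries. It assumes ldd 
is available wherever it is run, and for this error it would only be meaningful 
when run inside the same container as the failing command:

```shell
# Hypothetical helper: list the shared libraries that ldd reports as
# "not found" for the given binary. Prints nothing when every dependency
# resolves (or when ldd cannot inspect the file at all).
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}
```

For example, replacing the gem5-vega invocation at the end of the docker 
command above with "ldd /usr/local/bin/gem5-vega" should, assuming the image 
ships ldd, list libpython3.12.so.1.0 as "not found" if the image does not 
provide that library.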

Nick

From: Matt Sinclair <mattdsinclair.w...@gmail.com>
Sent: Sunday, March 2, 2025 4:30 PM
To: Beser, Nicholas D. <nick.be...@jhuapl.edu>
Cc: The gem5 Users mailing list <gem5-users@gem5.org>; Jason Lowe-Power 
<ja...@lowepower.com>
Subject: Re: [EXT] Re: [gem5-users] Success with GPU

Hi Nick,

Did you try setting up the docker volume to explicitly include /usr/local/bin 
then?  The reason (likely) why it worked after you compiled gem5.opt explicitly 
is that the folder you built it in was explicitly included in the docker 
volume.  Unless you are saying you built gem5.opt for VEGA_X86 in 
/usr/local/bin and it only worked after you did this?

It sounds like you are using (or gem5 is using) m5 dumpreset stats somewhere in 
the run (e.g., here: 
https://github.com/gem5/gem5/blob/develop/configs/example/apu_se.py#L1078) and 
this is causing the separate stats outputs.  But even if that is the case, a) 
you should see the print on line 1077 of apu_se.py in your simout to verify 
this happened and b) what is the issue with the stats?  Just that there are two 
pieces?

Thanks,
Matt

On Sun, Mar 2, 2025 at 3:23 PM Beser, Nicholas D. <nick.be...@jhuapl.edu> 
wrote:
Matt,

Here is codespace with ls -l /usr/local/bin

root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024# ls -l /usr/local/bin
total 3274048
lrwxrwxrwx 1 root   root           36 Feb 26 00:48 actionlint -> /usr/local/lib/actionlint/actionlint
-rwxrwxr-x 1 ubuntu ubuntu   21869136 Feb 12 23:27 code
-rwxr-xr-x 1 root   root      2953216 Feb 26 00:48 compose-switch
lrwxrwxrwx 1 root   root           32 Feb 26 00:48 docker-compose -> /etc/alternatives/docker-compose
-rwxr-xr-x 1 root   root     73691062 Feb 26 00:48 docker-compose-v1
lrwxrwxrwx 1 root   root           23 Jul 25  2024 gem5 -> /usr/local/bin/gem5-chi
-rwxr-xr-x 1 root   root   1185554096 Jul 25  2024 gem5-chi
-rwxr-xr-x 1 root   root   1184136040 Jul 25  2024 gem5-mesi
-rwxr-xr-x 1 root   root    884400248 Jul 25  2024 gem5-vega
root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024#

Since the listing showed the binary, I assumed it would run. It did run after 
we compiled it explicitly.

As it turns out, the stats.txt file has two sets of simulation statistics 
documented. One completes after the CPU finishes, and the second completes 
after the CPU checks the result.

src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
GPU Kernel Completed dump and reset
src/sim/simulate.cc:199: info: Entering event queue @ 105637104000.  Starting simulation...
info: check result
PASSED!
breaking loop due to: exiting with last active thread context.
Ticks: 143303043500
Exiting because  exiting with last active thread context
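
To confirm how many dumps ended up in the file, one can count the section 
headers. This is a sketch that assumes the usual gem5 layout, in which each 
dump begins with a line containing "Begin Simulation Statistics"; 
count_stat_dumps is a hypothetical helper name, and the path m5out/stats.txt 
in the note below assumes the default output directory:

```shell
# Hypothetical helper: count how many statistics dumps a gem5 stats file
# contains, assuming each dump starts with a "Begin Simulation Statistics"
# header line.
count_stat_dumps() {
    grep -c 'Begin Simulation Statistics' "$1"
}
```

Running count_stat_dumps m5out/stats.txt on the run above would be expected to 
print 2 if the "dump and reset" message and the final exit each produced one 
section.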

Nick

From: Matt Sinclair <mattdsinclair.w...@gmail.com>
Sent: Sunday, March 2, 2025 4:17 PM
To: The gem5 Users mailing list <gem5-users@gem5.org>
Cc: Beser, Nicholas D. <nick.be...@jhuapl.edu>; Jason Lowe-Power 
<ja...@lowepower.com>
Subject: [EXT] Re: [gem5-users] Success with GPU

Hi Nick,

I'm not sure why you believe gem5-vega should be in /usr/local/bin? It's been a 
few months since I last looked at this codespace, but looking at the 
instructions here: 
https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator,
 they do not seem to assume /usr/local/bin.  Instead, they are setting up the 
volume for docker for other folders.  Have you tried this command?  Of course, 
it's possible I'm wrong though about /usr/local/bin -- but Bobby or Jason would 
have to answer that.

Setting that aside, it looks like the instructions you have done are basically 
bypassing the prebuilt gem5-vega and building it yourself -- this is ultimately 
fine, and what my students do in my research group, but of course takes a bit 
longer.

What is the issue with the stats file exactly?  I guess you wrote your own CPU 
version of square and that version is not behaving as expected?  I am not an 
expert at the CPU part of gem5, but I'd need more information about how you 
disabled the CPU part to understand or try to look into this.  Likewise, what 
stats are you looking at for the CPU?

Thanks,
Matt

On Sun, Mar 2, 2025 at 2:54 PM Beser, Nicholas D. via gem5-users 
<gem5-users@gem5.org> wrote:
Based on the discussion, it seems that docker can’t find the gem5-vega that is 
in /usr/local/bin. I noticed that the instructions also had us building the 
VEGA_X86/gem5.opt binary with the following command (# is the number of cores 
on your X86 system):

docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0 \
    scons build/VEGA_X86/gem5.opt -j#

I built VEGA_X86 in the codespace. The following command was then able to run 
the GPU square binary:

docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0 \
    gem5/build/VEGA_X86/gem5.opt gem5/configs/example/apu_se.py -n 3 -c \
    gem5-resources/src/gpu/square/bin/square

The program executed correctly and created a stats.txt file. I have sent these 
instructions to my class so they can proceed with the experiments using the 
GPU.

I do have a question about the results in the stats.txt file. We noticed that 
the program computed the square operation using the GPU, and then compared the 
result using CPU-only code. When one of my students disabled the CPU-only 
code, he did not see a drop in the CPU instructions that would have 
corresponded to that loop. I have them looking at the stats.txt file for 
indications about what resources the GPU had used to perform the operation.
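
When comparing instruction counts between runs, it may help to look at a stat 
separately for each dump in stats.txt rather than at a single value. The sketch 
below is only an illustration: stat_per_dump is a hypothetical helper, it 
assumes each dump begins with a "Begin Simulation Statistics" line, and the 
stat name passed to it (simInsts in the note below) is an assumption that 
should be checked against the names actually present in the file:

```shell
# Hypothetical helper: for the stat named in $1, print "dump N: VALUE" for
# each dump section of the gem5 stats file $2. Assumes every dump starts
# with a "Begin Simulation Statistics" header and that stat lines have the
# form "NAME VALUE # description".
stat_per_dump() {
    awk -v name="$1" '
        /Begin Simulation Statistics/ { dump++ }
        $1 == name { print "dump " dump ": " $2 }
    ' "$2"
}
```

For example, stat_per_dump simInsts m5out/stats.txt would print one line per 
dump. If the stats are reset between dumps, each section only covers the work 
done since the previous reset, so a change in the CPU check loop would be 
expected to show up in whichever section covers that part of the run.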

Nick
_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org