Hi Nick,

Regarding gem5-vega-se: the ls output you sent earlier shows it does not
exist in /usr/local/bin, so it's not surprising that command didn't work.

On docker more broadly, though: setting up volumes is tricky.
The mental model I always use is: with the volumes I specify, is
everything I need accessible or not?  When you mount "only"
/usr/local/bin (alongside $PWD), nothing else from the host is visible
inside the container when it runs, which is why Python
(libpython3.12.so.1.0 in your error) cannot be found in that case.  So you'd
need to set up one or more volumes covering all of the files you need in
order to avoid the Python error.
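
One rough way to apply that mental model is to ask the dynamic loader what it
cannot see from inside the container. The sketch below is only an
illustration, not something from the bootcamp materials: missing_libs is a
hypothetical helper name, and the usage in the comment assumes ldd is
available inside the image.

```shell
# Hypothetical helper (not part of gem5 or Docker): reads `ldd` output on
# stdin and prints each shared library the loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Assumed usage, reusing the mounts from the failing command; it presumes
# `ldd` exists inside the image:
#   docker run -v "$PWD:$PWD" -v /usr/local/bin:/usr/local/bin -w "$PWD" \
#       ghcr.io/gem5/gcn-gpu:v24-0 ldd /usr/local/bin/gem5-vega | missing_libs
```

Any library it prints lives somewhere that the volumes you specified do not
cover (or that the image does not provide).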

This is why I asked whether you ran the same commands as specified here:
https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator,
because I believe they were set up so that all of this is handled for you in
the bootcamp.

Matt

On Sun, Mar 2, 2025 at 3:37 PM Beser, Nicholas D. <nick.be...@jhuapl.edu>
wrote:

> Matt,
>
>
>
> I really appreciate your help with this section.
>
>
>
> I had tried explicitly specifying /usr/local/bin, based on a README
> document that was listed when the codespace started:
>
>
>
> docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD
> ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega-se gem5/configs/example/apu_se.py -n
> 3 -c square
>
> docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD
> ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega gem5/configs/example/apu_se.py -n 3
> -c square
>
>
>
> gem5-vega-se did not exist, and the second run produced the following
> error message:
>
>
>
> docker run -v $PWD:$PWD -v /usr/local/bin:/usr/local/bin -w $PWD
> ghcr.io/gem5/gcn-gpu:v24-0 gem5-vega gem5/configs/example/apu_se.py -n 3
> -c square
>
> gem5-vega: error while loading shared libraries: libpython3.12.so.1.0:
> cannot open shared object file: No such file or directory
>
>
>
> Nick
>
>
>
> *From:* Matt Sinclair <mattdsinclair.w...@gmail.com>
> *Sent:* Sunday, March 2, 2025 4:30 PM
> *To:* Beser, Nicholas D. <nick.be...@jhuapl.edu>
> *Cc:* The gem5 Users mailing list <gem5-users@gem5.org>; Jason Lowe-Power
> <ja...@lowepower.com>
> *Subject:* Re: [EXT] Re: [gem5-users] Success with GPU
>
>
>
> *APL external email warning: *Verify sender mattdsinclair.w...@gmail.com
> before clicking links or attachments
>
>
>
> Hi Nick,
>
>
>
> Did you try setting up the docker volume to explicitly include
> /usr/local/bin, then?  The likely reason it worked after you compiled
> gem5.opt yourself is that the folder you built it in was explicitly
> included in the docker volume.  Or are you saying you built gem5.opt
> for VEGA_X86 in /usr/local/bin and it only worked after you did this?
>
>
>
> It sounds like you are using (or gem5 is using) m5 dumpreset stats
> somewhere in the run (e.g., here:
> https://github.com/gem5/gem5/blob/develop/configs/example/apu_se.py#L1078)
> and this is causing the separate stats outputs.  But even if that is the
> case, a) you should see the print on line 1077 of apu_se.py in your simout
> to verify this happened and b) what is the issue with the stats?  Just that
> there are two pieces?
>
>
>
> Thanks,
>
> Matt
>
>
>
> On Sun, Mar 2, 2025 at 3:23 PM Beser, Nicholas D. <nick.be...@jhuapl.edu>
> wrote:
>
> Matt,
>
>
>
> Here is the codespace output of ls -l /usr/local/bin:
>
>
>
> root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024# ls -l /usr/local/bin
> total 3274048
> lrwxrwxrwx 1 root   root           36 Feb 26 00:48 actionlint -> /usr/local/lib/actionlint/actionlint
> -rwxrwxr-x 1 ubuntu ubuntu   21869136 Feb 12 23:27 code
> -rwxr-xr-x 1 root   root      2953216 Feb 26 00:48 compose-switch
> lrwxrwxrwx 1 root   root           32 Feb 26 00:48 docker-compose -> /etc/alternatives/docker-compose
> -rwxr-xr-x 1 root   root     73691062 Feb 26 00:48 docker-compose-v1
> lrwxrwxrwx 1 root   root           23 Jul 25  2024 gem5 -> /usr/local/bin/gem5-chi
> -rwxr-xr-x 1 root   root   1185554096 Jul 25  2024 gem5-chi
> -rwxr-xr-x 1 root   root   1184136040 Jul 25  2024 gem5-mesi
> -rwxr-xr-x 1 root   root    884400248 Jul 25  2024 gem5-vega
> root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024#
>
>
>
> Since the listing showed the gem5-vega binary, I assumed it would run. It
> did run after we compiled it explicitly.
>
> As it turns out, the stats.txt file documents two sets of simulation
> statistics. One completes after the CPU finishes, and the second completes
> after the CPU checks the result.
>
>
>
> src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
> GPU Kernel Completed dump and reset
> src/sim/simulate.cc:199: info: Entering event queue @ 105637104000.
> Starting simulation...
> info: check result
> PASSED!
> breaking loop due to: exiting with last active thread context.
> Ticks: 143303043500
> Exiting because  exiting with last active thread context
>
>
>
> Nick
>
> *From:* Matt Sinclair <mattdsinclair.w...@gmail.com>
> *Sent:* Sunday, March 2, 2025 4:17 PM
> *To:* The gem5 Users mailing list <gem5-users@gem5.org>
> *Cc:* Beser, Nicholas D. <nick.be...@jhuapl.edu>; Jason Lowe-Power <
> ja...@lowepower.com>
> *Subject:* [EXT] Re: [gem5-users] Success with GPU
>
>
>
> Hi Nick,
>
>
>
> I'm not sure why you believe gem5-vega should be in /usr/local/bin? It's
> been a few months since I last looked at this codespace, but looking at the
> instructions here:
> https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator,
> they do not seem to assume /usr/local/bin.  Instead, they are setting up
> the volume for docker for other folders.  Have you tried this command?  Of
> course, it's possible I'm wrong though about /usr/local/bin -- but Bobby or
> Jason would have to answer that.
>
>
>
> Setting that aside, it looks like the steps you followed basically bypass
> the prebuilt gem5-vega and build it yourself. This is ultimately fine, and
> it is what the students in my research group do, but of course it takes a
> bit longer.
>
>
>
> What is the issue with the stats file exactly?  I guess you wrote your own
> CPU version of square and that version is not behaving as expected?  I am
> not an expert at the CPU part of gem5, but I'd need more information about
> how you disabled the CPU part to understand or try to look into this.
> Likewise, what stats are you looking at for the CPU?
>
>
>
> Thanks,
>
> Matt
>
>
>
> On Sun, Mar 2, 2025 at 2:54 PM Beser, Nicholas D. via gem5-users <
> gem5-users@gem5.org> wrote:
>
> Based on the discussion, it seems that docker can’t find the gem5-vega
> binary that is in /usr/local/bin. I noticed that the instructions also had
> us build the VEGA_X86/gem5.opt binary with the following command:
>
> I. docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0
> scons build/VEGA_X86/gem5.opt -j# (# is the number of cores on your X86
> system)
>
> I built VEGA_X86 in the codespace. Afterwards, the following command was
> able to run the GPU square binary:
>
>  docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0
> gem5/build/VEGA_X86/gem5.opt gem5/configs/example/apu_se.py -n 3 -c
> gem5-resources/src/gpu/square/bin/square
>
> The program executed correctly and created a stats.txt file. I have sent
> these instructions to my class so they can proceed with the experiments
> using the GPU.
>
> I do have a question about the results in the stats.txt file. We noticed
> that the program computed the square operation using the GPU, and then
> checked the result using CPU-only code. When one of my students disabled
> the CPU-only code, he did not see a drop in the CPU instruction count that
> would have corresponded to that loop. I have them looking at the stats.txt
> file for indications of what resources the GPU used to perform the
> operation.
>
>
>
> Nick
>
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
>
>