Hi Nick,

Did you try setting up the docker volume to explicitly include
/usr/local/bin then?  It likely worked after you compiled gem5.opt
yourself because the folder you built it in was already included in the
docker volume.  Unless you are saying you built gem5.opt for VEGA_X86 in
/usr/local/bin and it only worked after you did this?
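
To make the volume point concrete, here is a rough Python sketch that
assembles the docker command with one extra --volume flag. The helper name
docker_run_cmd is made up for illustration, and mounting /usr/local/bin at
the same path inside the container is an assumption on my part: it would
hide whatever the image itself ships in that directory, so treat this as
something to experiment with rather than a known-good recipe.

```python
import os
import shlex


def docker_run_cmd(image, workdir, extra_mounts, argv):
    """Build a `docker run` argument list (hypothetical helper).

    The working directory is always mounted at the same path inside the
    container, as in the bootcamp command; each entry of `extra_mounts`
    is mounted at its own path as well.
    """
    cmd = ["docker", "run", "--volume", f"{workdir}:{workdir}"]
    for host_dir in extra_mounts:
        # Assumption: same path on host and in container. This shadows
        # anything the image already has at that path.
        cmd += ["--volume", f"{host_dir}:{host_dir}"]
    cmd += ["-w", workdir, image, *argv]
    return cmd


if __name__ == "__main__":
    cmd = docker_run_cmd(
        "ghcr.io/gem5/gcn-gpu:v24-0",
        os.getcwd(),
        ["/usr/local/bin"],
        [
            "/usr/local/bin/gem5-vega",
            "gem5/configs/example/apu_se.py",
            "-n", "3",
            "-c", "gem5-resources/src/gpu/square/bin/square",
        ],
    )
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it lets you compare it against
the one from the README before trying it.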

It sounds like you are using (or gem5 is using) m5 dumpreset stats
somewhere in the run (e.g., here:
https://github.com/gem5/gem5/blob/develop/configs/example/apu_se.py#L1078)
and this is causing the separate stats outputs.  But even if that is the
case, a) you should see the print on line 1077 of apu_se.py in your simout
to verify this happened and b) what is the issue with the stats?  Just that
there are two pieces?
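
If it helps to look at the two pieces separately, here is a minimal Python
sketch that splits a stats.txt into one dictionary per dump. The function
name split_stats_dumps is hypothetical, and the marker strings and the stat
names simTicks and simInsts are my assumptions about what recent gem5
versions write, so adjust them to whatever your file actually contains.

```python
from pathlib import Path

# Assumed section markers; matched by prefix so trailing padding and
# dashes do not matter.
BEGIN_PREFIX = "---------- Begin Simulation Statistics"
END_PREFIX = "---------- End Simulation Statistics"


def split_stats_dumps(text):
    """Return a list with one {stat_name: value_string} dict per dump."""
    dumps = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(BEGIN_PREFIX):
            current = {}
        elif stripped.startswith(END_PREFIX):
            if current is not None:
                dumps.append(current)
            current = None
        elif current is not None and stripped and not stripped.startswith("#"):
            fields = stripped.split()
            if len(fields) >= 2:
                # First column is the stat name, second is its value.
                current[fields[0]] = fields[1]
    return dumps


if __name__ == "__main__":
    # m5out/stats.txt is the default output location; change as needed.
    path = Path("m5out/stats.txt")
    if path.exists():
        for index, dump in enumerate(split_stats_dumps(path.read_text())):
            print(index, dump.get("simTicks"), dump.get("simInsts"))
```

Since the stats are reset at each dump, every section should only cover
the interval since the previous one, so the first dictionary would
correspond to the run up to the GPU kernel completing and the second to
the result check afterwards.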

Thanks,
Matt

On Sun, Mar 2, 2025 at 3:23 PM Beser, Nicholas D. <nick.be...@jhuapl.edu>
wrote:

> Matt,
>
>
>
> Here is codespace with ls -l /usr/local/bin
>
>
>
> root@codespaces-e992b2:/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024#
> ls -l /usr/local/bin
>
> total 3274048
>
> lrwxrwxrwx 1 root   root           36 Feb 26 00:48 actionlint ->
> /usr/local/lib/actionlint/actionlint
>
> -rwxrwxr-x 1 ubuntu ubuntu   21869136 Feb 12 23:27 code
>
> -rwxr-xr-x 1 root   root      2953216 Feb 26 00:48 compose-switch
>
> lrwxrwxrwx 1 root   root           32 Feb 26 00:48 docker-compose ->
> /etc/alternatives/docker-compose
>
> -rwxr-xr-x 1 root   root     73691062 Feb 26 00:48 docker-compose-v1
>
> lrwxrwxrwx 1 root   root           23 Jul 25  2024 gem5 ->
> /usr/local/bin/gem5-chi
>
> -rwxr-xr-x 1 root   root   1185554096 Jul 25  2024 gem5-chi
>
> -rwxr-xr-x 1 root   root   1184136040 Jul 25  2024 gem5-mesi
>
> -rwxr-xr-x 1 root   root    884400248 Jul 25  2024 gem5-vega
>
> root@codespaces-e992b2
> :/workspaces/en525-712-81-sp25-amd-gpu-using-gem5-gem5bootcamp2024#
>
>
>
> Since it returned with the binary, I assumed it would run. It did run
> after we compiled it explicitly.
>
>
> As it turns out, the stats.txt file has two sets of simulation runs
> documented. One completes after the CPU finishes, and the second completes
> after the CPU checks the result.
>
>
>
> src/sim/syscall_emul.cc:74: warn: ignoring syscall mprotect(...)
>
> GPU Kernel Completed dump and reset
>
> src/sim/simulate.cc:199: info: Entering event queue @ 105637104000.
> Starting simulation...
>
> info: check result
>
> PASSED!
>
> breaking loop due to: exiting with last active thread context.
>
> Ticks: 143303043500
>
> Exiting because  exiting with last active thread context
>
>
>
> Nick
>
> *From:* Matt Sinclair <mattdsinclair.w...@gmail.com>
> *Sent:* Sunday, March 2, 2025 4:17 PM
> *To:* The gem5 Users mailing list <gem5-users@gem5.org>
> *Cc:* Beser, Nicholas D. <nick.be...@jhuapl.edu>; Jason Lowe-Power <
> ja...@lowepower.com>
> *Subject:* [EXT] Re: [gem5-users] Success with GPU
>
>
>
>
> Hi Nick,
>
>
>
> I'm not sure why you believe gem5-vega should be in /usr/local/bin? It's
> been a few months since I last looked at this codespace, but looking at the
> instructions here:
> https://github.com/gem5bootcamp/gem5-bootcamp-env/blob/51590ae00b0e451c9b6a8854addbb94128ab4cac/materials/developing-gem5-models/11-gpu/README.md#to-run-square-in-gem5-static-register-allocator,
> they do not seem to assume /usr/local/bin.  Instead, they are setting up
> the volume for docker for other folders.  Have you tried this command?  Of
> course, it's possible I'm wrong though about /usr/local/bin -- but Bobby or
> Jason would have to answer that.
>
>
>
> Setting that aside, it looks like the instructions you have done are
> basically bypassing the prebuilt gem5-vega and building it yourself -- this
> is ultimately fine, and what my students do in my research group, but of
> course takes a bit longer.
>
>
>
> What is the issue with the stats file exactly?  I guess you wrote your own
> CPU version of square and that version is not behaving as expected?  I am
> not an expert at the CPU part of gem5, but I'd need more information about
> how you disabled the CPU part to understand or try to look into this.
> Likewise, what stats are you looking at for the CPU?
>
>
>
> Thanks,
>
> Matt
>
>
>
> On Sun, Mar 2, 2025 at 2:54 PM Beser, Nicholas D. via gem5-users <
> gem5-users@gem5.org> wrote:
>
> Based on the discussion, it seems that docker can’t find the gem5-vega
> binary that is in /usr/local/bin. I noticed that the instructions also had
> us build the VEGA_X86/gem5.opt binary with the following command:
>
> docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0
> scons build/VEGA_X86/gem5.opt -j#
> (# is the number of cores on your X86 system)
>
> I built VEGA_X86 in the codespace. Afterwards, the following command was
> able to run the GPU square binary:
>
>  docker run --volume $(pwd):$(pwd) -w $(pwd) ghcr.io/gem5/gcn-gpu:v24-0
> gem5/build/VEGA_X86/gem5.opt gem5/configs/example/apu_se.py -n 3 -c
> gem5-resources/src/gpu/square/bin/square
>
> The program executed correctly and created a stats.txt file. I have sent
> this instruction to my class so they can proceed with the experiments
> using the GPU.
>
> I do have a question about the results in the stats.txt file. We noticed
> that the program computed the square operation using the GPU, and then
> checked the result using CPU-only code. When one of my students disabled
> the CPU-only code, he did not see a drop in the CPU instruction count that
> would have corresponded to that loop. I have them looking at the stats.txt
> file for indications of what resources the GPU used to perform the
> operation.
>
>
>
> Nick
>
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
>
>